Record t1.16.1 task evidence

ben
2026-03-20 13:32:27 -04:00
parent f93b9aa464
commit 2847d2d59f

@@ -624,7 +624,7 @@ tighten Costco-specific normalization so normalized item names are cleaner and d
- The structured parsing still owns size/pack extraction, so name cleanup can safely strip dual-unit and logistics fragments after those fields are parsed.
- Discount-line behavior remains unchanged; this task only cleaned normalized names and preserved the existing audit trail.
-* [x] t1.15: refactor review/combine pipeline around normalized_item_id and catalog links (4-8 commits)
+* [X] t1.15: refactor review/combine pipeline around normalized_item_id and catalog links (4-8 commits)
replace the old observed/canonical workflow with a review-first pipeline that uses normalized_item_id as the retailer-level review unit and links it to catalog items
** Acceptance Criteria
@@ -677,7 +677,7 @@ replace the old observed/canonical workflow with a review-first pipeline that us
- Existing auto-generated catalog rows are no longer carried forward by default; only deliberate catalog entries survive. That keeps the new `catalog.csv` conservative, but it also means prior observed-based auto-links do not migrate into the new model.
- Live rerun after the refactor produced `627` purchase rows, `387` review-queue rows, `407` distinct normalized items, `0` linked normalized items, and `0` unresolved rows missing from the review queue.
-* [x] t1.16: cleanup review process and format
+* [X] t1.16: cleanup review process and format
** acceptance criteria
1. Add intro text explaining:
@@ -718,8 +718,50 @@ replace the old observed/canonical workflow with a review-first pipeline that us
- Direct numeric selection works well for suggestion-heavy review, while `[l]ink existing` remains available as a fallback when the suggestion list is empty or incomplete.
- I kept the review data model unchanged from `t1.15`; this task only tightened the prompt format, field order, and save behavior.
* [x] t1.16.1: add catalog search flow to review ui (2-3 commits)
enable fast lookup of catalog items during review via tokenized search and replace manual list scanning
** acceptance criteria
1. replace `[l]ink existing` with `[s]earch` in review prompt:
- `[#] link to suggestion [s]earch [n]ew [x]exclude [q]uit >`
2. implement search flow:
- on `s`, prompt: `search: `
- tokenize input using same normalization rules as suggestion matching
- return ranked list of catalog items where tokens overlap with:
- catalog_name
- product_type
- variant
- display results in same numbered format as suggestions:
[1] flour, flour, baking (12 items, 48 rows)
3. allow direct selection from search results:
- when the user enters a number, immediately create the approved resolution and product_links rows
- return to the next review item
4. reuse match logic used for suggestion matching; no new matching system introduced
- future improvements to matching logic will therefore apply in both places
5. search results exclude the catalog item already linked to the current normalized_item_id
6. fallback behavior:
- if no results, print `no matches found`
- allow retry or return to main prompt
7. keep interaction tight:
- no full catalog dump
- max ~10 results returned
- sorted by simple score (token overlap count)
8. persistence:
- selected link writes immediately to `product_links.csv`
- no buffering until script end
- pm note: optimize for speed over correctness; this is a manual assist tool, not a ranking system
- pm note: improve manual lookup flow only, don't retool or create a second algorithm
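The search criteria above can be sketched roughly as follows. This is a minimal illustration, not the code in `review_products.py`: the `tokenize` helper is a stand-in for the shared normalization rules, and `search_catalog`, `catalog_id`, and `exclude_id` are hypothetical names. Only `catalog_name`, `product_type`, and `variant` come from the criteria.

```python
import re

MAX_RESULTS = 10  # criteria ask for roughly ten results at most


def tokenize(text):
    """Assumed stand-in for the shared normalization: lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_catalog(query, catalog_rows, exclude_id=None, limit=MAX_RESULTS):
    """Rank catalog rows by how many query tokens they share (hypothetical helper)."""
    query_tokens = tokenize(query)
    scored = []
    for row in catalog_rows:
        # Skip the catalog item the current normalized item is already linked to.
        if exclude_id is not None and row.get("catalog_id") == exclude_id:
            continue
        haystack = " ".join(
            row.get(field, "") for field in ("catalog_name", "product_type", "variant")
        )
        score = len(query_tokens & tokenize(haystack))
        if score > 0:
            scored.append((score, row))
    # Stable sort: ties keep their original catalog order.
    scored.sort(key=lambda pair: -pair[0])
    return [row for _, row in scored[:limit]]
```

An empty return value corresponds to the `no matches found` branch; the caller would then offer a retry or return to the main prompt.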
** evidence
- commit: `f93b9aa`
- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python review_products.py --help`; `./venv/bin/python review_products.py --refresh-only`
- datetime: 2026-03-20 13:32:09 EDT
** notes
- The search path reuses the same lightweight token matching rules as suggestion ranking, so there is still only one matching system to maintain.
- Direct numeric suggestion-pick remains the fastest happy path; search is the fallback when suggestions are sparse or missing.
- Search intentionally optimizes for manual speed rather than smart ranking: simple token overlap, max 10 rows, and immediate persistence on selection.
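The "immediate persistence on selection" behavior could look something like the sketch below. It assumes an append-only write to `product_links.csv`; the column names in `LINK_FIELDS` and the `append_link` helper are hypothetical, since the actual file schema is not shown here.

```python
import csv
import os

# Hypothetical column layout; the real product_links.csv schema may differ.
LINK_FIELDS = ["normalized_item_id", "catalog_id", "review_status"]


def append_link(path, normalized_item_id, catalog_id):
    """Append one approved link row and push it to disk before returning."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LINK_FIELDS)
        if is_new:
            writer.writeheader()  # write the header only for a fresh file
        writer.writerow(
            {
                "normalized_item_id": normalized_item_id,
                "catalog_id": catalog_id,
                "review_status": "approved",
            }
        )
        handle.flush()
        os.fsync(handle.fileno())  # no buffering until script end
```

Writing one row per selection means a quit or crash mid-review loses at most the item currently on screen.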
#+END_*
* [ ] 1t.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)