Record t1.16.1 task evidence

2026-03-20 13:32:27 -04:00
parent f93b9aa464
commit 2847d2d59f
1 changed files with 45 additions and 3 deletions
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -624,7 +624,7 @@ tighten Costco-specific normalization so normalized item names are cleaner and d
 - The structured parsing still owns size/pack extraction, so name cleanup can safely strip dual-unit and logistics fragments after those fields are parsed.
 - Discount-line behavior remains unchanged; this task only cleaned normalized names and preserved the existing audit trail.

-* [x] t1.15: refactor review/combine pipeline around normalized_item_id and catalog links (4-8 commits)
+* [X] t1.15: refactor review/combine pipeline around normalized_item_id and catalog links (4-8 commits)
 replace the old observed/canonical workflow with a review-first pipeline that uses normalized_item_id as the retailer-level review unit and links it to catalog items

 ** Acceptance Criteria
@@ -677,7 +677,7 @@ replace the old observed/canonical workflow with a review-first pipeline that us
 - Existing auto-generated catalog rows are no longer carried forward by default; only deliberate catalog entries survive. That keeps the new `catalog.csv` conservative, but it also means prior observed-based auto-links do not migrate into the new model.
 - Live rerun after the refactor produced `627` purchase rows, `387` review-queue rows, `407` distinct normalized items, `0` linked normalized items, and `0` unresolved rows missing from the review queue.

-* [x] t1.16: cleanup review process and format
+* [X] t1.16: cleanup review process and format

 ** acceptance criteria
 1. Add intro text explaining:
@@ -718,8 +718,50 @@ replace the old observed/canonical workflow with a review-first pipeline that us
 - Direct numeric selection works well for suggestion-heavy review, while `[l]ink existing` remains available as a fallback when the suggestion list is empty or incomplete.
 - I kept the review data model unchanged from `t1.15`; this task only tightened the prompt format, field order, and save behavior.

+* [x] t1.16.1: add catalog search flow to review ui (2-3 commits)
+enable fast lookup of catalog items during review via tokenized search, replacing manual list scanning
+
+** acceptance criteria
+1. replace `[l]ink existing` with `[s]earch` in review prompt:
+   - `[#] link to suggestion  [s]earch  [n]ew  e[x]clude  [q]uit >`
+2. implement search flow:
+   - on `s`, prompt: `search: `
+   - tokenize input using same normalization rules as suggestion matching
+   - return ranked list of catalog items where tokens overlap with:
+     - catalog_name
+     - product_type
+     - variant
+   - display results in same numbered format as suggestions:
+     [1] flour, flour, baking (12 items, 48 rows)
+3. allow direct selection from search results:
+   - when the user enters a number, immediately create the approved resolution and product_links rows
+   - returns to next review item
+4. reuse match logic used for suggestion matching; no new matching system introduced
+   - future improvements to matching logic will therefore apply in both places
+5. search results exclude the catalog item already linked to the current normalized_item_id
+6. fallback behavior:
+   - if no results, print `no matches found`
+   - allow retry or return to main prompt
+7. keep interaction tight:
+   - no full catalog dump
+   - max ~10 results returned
+   - sorted by simple score (token overlap count)
+8. persistence:
+   - selected link writes immediately to `product_links.csv`
+   - no buffering until script end
+
+- pm note: optimize for speed over correctness; this is a manual assist tool, not a ranking system
+- pm note: improve the manual lookup flow only; don't retool the matcher or create a second algorithm
+** evidence
+- commit: `f93b9aa`
+- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python review_products.py --help`; `./venv/bin/python review_products.py --refresh-only`
+- datetime: 2026-03-20 13:32:09 EDT
+
+** notes
+- The search path reuses the same lightweight token matching rules as suggestion ranking, so there is still only one matching system to maintain.
+- Direct numeric suggestion-pick remains the fastest happy path; search is the fallback when suggestions are sparse or missing.
+- Search intentionally optimizes for manual speed rather than smart ranking: simple token overlap, max 10 rows, and immediate persistence on selection.

-#+END_*

 * [ ] 1t.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)
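
The t1.16.1 search flow recorded in this diff (tokenize the query, score catalog items by token overlap across catalog_name, product_type, and variant, cap the list at about 10, and write the chosen link immediately) could look roughly like the sketch below. It is not the code from `review_products.py`: the function names, the `catalog_id` key, the regex tokenizer, and the two-column layout of `product_links.csv` are all assumptions made for illustration.

```python
import csv
import re

SEARCH_FIELDS = ("catalog_name", "product_type", "variant")


def tokenize(text):
    """Lowercase and split on non-alphanumerics (assumed normalization rules)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_catalog(query, catalog, linked_id=None, limit=10):
    """Return up to `limit` catalog rows ranked by token overlap with `query`.

    `catalog` is a list of dicts; `catalog_id` is a hypothetical key name.
    Rows whose id equals `linked_id` are skipped (criterion 5).
    """
    query_tokens = tokenize(query)
    scored = []
    for item in catalog:
        if linked_id is not None and item["catalog_id"] == linked_id:
            continue
        haystack = " ".join(item.get(field) or "" for field in SEARCH_FIELDS)
        score = len(query_tokens & tokenize(haystack))
        if score > 0:
            scored.append((score, item))
    # Highest overlap first; ties broken by name so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]["catalog_name"]))
    return [item for _, item in scored[:limit]]


def append_link(path, normalized_item_id, catalog_id):
    """Append one link row right away instead of buffering until exit.

    The column order here is an assumption, not the real file layout.
    """
    with open(path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([normalized_item_id, catalog_id])
```

An empty return value from `search_catalog` corresponds to the `no matches found` branch. Because the score is a plain overlap count, any later change to `tokenize` would affect every caller that shares it, which is the single-matching-system property the acceptance criteria ask for.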