Record t1.15 task evidence

This commit is contained in:
ben
2026-03-20 11:27:56 -04:00
parent 9104781b93
commit 59fb881c0a

View File

@@ -624,7 +624,7 @@ tighten Costco-specific normalization so normalized item names are cleaner and d
- The structured parsing still owns size/pack extraction, so name cleanup can safely strip dual-unit and logistics fragments after those fields are parsed.
- Discount-line behavior remains unchanged; this task only cleaned normalized names and preserved the existing audit trail.
* [ ] t1.15: refactor review/combine pipeline around normalized_item_id and catalog links (4-8 commits)
* [x] t1.15: refactor review/combine pipeline around normalized_item_id and catalog links (4-8 commits)
replace the old observed/canonical workflow with a review-first pipeline that uses normalized_item_id as the retailer-level review unit and links it to catalog items
** Acceptance Criteria
@@ -667,11 +667,15 @@ replace the old observed/canonical workflow with a review-first pipeline that us
9. pm note: keep review/combine auditable; each catalog link should be explainable from normalized rows and review state
** evidence
- commit:
- tests:
- datetime:
- commit: `9104781`
- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python build_purchases.py`; `./venv/bin/python review_products.py --refresh-only`; `./venv/bin/python report_pipeline_status.py`; `./venv/bin/python build_purchases.py --help`; `./venv/bin/python review_products.py --help`; `./venv/bin/python report_pipeline_status.py --help`
- datetime: 2026-03-20 11:27:12 EDT
** notes
- The old observed/canonical auto-layer is no longer in the active review/combine path. `build_purchases.py`, `review_products.py`, and `report_pipeline_status.py` now operate on `normalized_item_id`, `catalog_id`, and `catalog_name`.
- I kept the review CLI shape intentionally close to the pre-refactor flow so the project only changed its identity model, not the operator workflow.
- Existing auto-generated catalog rows are no longer carried forward by default; only deliberate catalog entries survive. That keeps the new `catalog.csv` conservative, but it also means prior observed-based auto-links do not migrate into the new model.
- Live rerun after the refactor produced `627` purchase rows, `387` review-queue rows, `407` distinct normalized items, `0` linked normalized items, and `0` unresolved rows missing from the review queue.
* [ ] 1t.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)