Record t1.13 task evidence
56
pm/tasks.org
@@ -416,7 +416,61 @@ Clearly show current state separate from proposed future state.
- Numbered canonical selection plus confirmation worked better than free-text id entry and should reduce accidental links.
- Deterministic suggestions remain intentionally conservative; they speed up common cases, but unresolved items still depend on human review by design.
* [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved products (2-4 commits)
* [X] t1.13.1 pipeline accountability and stage visibility (1-2 commits)
add simple accounting so we can see what survives or drops at each pipeline stage
** AC
1. emit counts for raw, enriched, combined/observed, review-queued, canonical-linked, and final purchase-log rows
2. report unresolved and dropped item counts explicitly
3. make it easy to verify that missing items were intentionally left in review rather than silently lost
- pm note: simple text/json/csv summary is sufficient; trust and visibility matter more than presentation
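A minimal sketch of the accounting described in the AC above, assuming each stage can be loaded as a list of rows. The function names, the `*_rows` metric keys, and the definitions of "unresolved" (observed minus canonical-linked) and "dropped" (raw minus final purchase-log) are illustrative assumptions, not the actual contents of `report_pipeline_status.py`.

```python
import csv
import io
import json

# Stage names follow the AC list; the key spellings are assumptions.
STAGES = ("raw", "enriched", "observed", "review_queued",
          "canonical_linked", "purchase_log")


def build_status(stage_rows):
    """Count rows per stage and derive explicit unresolved/dropped totals."""
    status = {f"{stage}_rows": len(stage_rows.get(stage, [])) for stage in STAGES}
    # Assumed definition: observed rows that never received a canonical link.
    status["unresolved_rows"] = status["observed_rows"] - status["canonical_linked_rows"]
    # Assumed definition: raw rows that did not reach the final purchase log.
    status["dropped_rows"] = status["raw_rows"] - status["purchase_log_rows"]
    return status


def to_json(status):
    """Render the summary as pretty-printed JSON."""
    return json.dumps(status, indent=2, sort_keys=True)


def to_csv(status):
    """Render the summary as a two-column metric/value CSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["metric", "value"])
    for metric, value in status.items():
        writer.writerow([metric, value])
    return buf.getvalue()
```

Keeping the summary as a flat metric/value mapping means the same dictionary can feed both the CSV and JSON outputs without any presentation logic.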
** evidence
- commit:
- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python report_pipeline_status.py --help`; `./venv/bin/python report_pipeline_status.py`; verified `combined_output/pipeline_status.csv` and `combined_output/pipeline_status.json`
- date: 2026-03-17
** notes
- Added a single explicit status script instead of threading counters through every pipeline step; this keeps the pipeline simple while still making row survival visible.
- The most useful check here is `unresolved_not_in_review_rows`; when it is non-zero, we know we have a real accounting bug rather than normal unresolved work.
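One way the `unresolved_not_in_review_rows` check could be derived, sketched with plain id sets. Only the metric name comes from the notes above; the function name and the idea of comparing row ids are assumptions about how the status script might compute it.

```python
def unresolved_not_in_review(observed_ids, linked_ids, review_ids):
    """Return ids that are neither canonical-linked nor waiting in review.

    A non-empty result means rows were lost somewhere between stages,
    which is an accounting bug rather than normal unresolved work.
    """
    unresolved = set(observed_ids) - set(linked_ids)
    return sorted(unresolved - set(review_ids))
```

The length of the returned list would be the value reported as `unresolved_not_in_review_rows`; returning the ids themselves also makes a non-zero count easy to investigate.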
* [X] t1.13.2 costco discount matching and net pricing in enrich_costco (2-3 commits)
refactor costco enrichment so discount lines are matched to purchased items and net pricing is preserved
** AC
1. detect costco discount/coupon rows like `/<retailer_item_id>` and match them to purchased items within the same order
2. preserve raw discount rows for auditability while also carrying matched discount values onto the purchased item row
3. add explicit fields for discount-adjusted pricing, e.g. `matched_discount_amount` and `net_line_total` (or equivalent)
4. preserve original raw receipt amounts (`line_total`) without overwriting them
- pm note: keep this retailer-specific and explicit; do not introduce generic discount heuristics
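A rough sketch of the matching rule in the AC above, not the actual `enrich_costco.py` code. It assumes rows are dicts, that the `/<retailer_item_id>` marker appears in an `item_name` field, and that discount rows carry a negative `line_total`; those field placements and the helper names are assumptions. `matched_discount_amount`, `net_line_total`, and `line_total` are the field names from the AC.

```python
from decimal import Decimal


def is_discount_row(row):
    """Assumed convention: a discount row's item_name looks like '/<retailer_item_id>'."""
    return row["item_name"].startswith("/")


def discount_target_id(row):
    """Extract the retailer item id that a discount row points at."""
    return row["item_name"][1:].strip()


def apply_discounts(rows):
    """Copy rows, carrying matched discounts onto purchased rows of the same order."""
    out = [dict(row) for row in rows]  # never mutate the raw input rows
    purchased = {}
    for row in out:
        if not is_discount_row(row):
            row["matched_discount_amount"] = Decimal("0")
            row["net_line_total"] = row["line_total"]
            # If an item repeats within an order, the first row takes the match.
            purchased.setdefault((row["order_id"], row["retailer_item_id"]), row)
    for row in out:
        if is_discount_row(row):
            target = purchased.get((row["order_id"], discount_target_id(row)))
            if target is not None:
                target["matched_discount_amount"] += row["line_total"]
                target["net_line_total"] = (
                    target["line_total"] + target["matched_discount_amount"]
                )
    return out
```

Discount rows stay in the output untouched, and `line_total` on the purchased row is never overwritten, so the raw receipt amounts remain auditable. Matching on `(order_id, retailer_item_id)` keeps the rule literal and confined to a single order.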
** evidence
- commit:
- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python enrich_costco.py`; verified matched Costco discount rows now populate `matched_discount_amount` and `net_line_total` while preserving raw `line_total`
- date: 2026-03-17
** notes
- Kept this retailer-specific and literal: only discount rows with `/<retailer_item_id>` are matched, and only within the same order.
- Raw discount rows are still preserved for auditability; the purchased row now carries the matched adjustment separately rather than overwriting the original amount.
* [X] t1.13.3 canonical cleanup and review-first product identity (3-4 commits)
refactor canonical generation so product identity is cleaner, duplicate canonicals are reduced, and unresolved items stay in review instead of spawning junk canonicals
** AC
1. stop auto-creating new canonical products from weak normalized names alone; unresolved items remain in `review_queue.csv`
2. canonical names are based on stable product identity rather than noisy observed titles
3. packaging/count/size tokens are removed from canonical names when they belong in structured fields (`pack_qty`, `size_value`, `size_unit`)
4. consolidate obvious duplicate canonicals (e.g. egg/lime cases) and ensure final outputs retain raw item name, normalized item name, and canonical item id
- pm note: prefer conservative canonical creation and a better manual review loop over aggressive auto-unification
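A conservative sketch of AC item 3, assuming packaging and size tokens can be recognised with two small regexes. The patterns, the unit list, and the `split_name` helper are illustrative assumptions; only `pack_qty`, `size_value`, and `size_unit` are field names from the AC.

```python
import re

# Assumed token shapes: "18 ct", "2-pack", "6pk" for counts; "2 LB", "16.9 oz" for sizes.
PACK_RE = re.compile(r"\b(\d+)[\s-]*(?:pk|pack|ct|count)\b", re.IGNORECASE)
SIZE_RE = re.compile(r"\b(\d+(?:\.\d+)?)[\s-]*(oz|lb|g|kg|ml|l)\b", re.IGNORECASE)


def split_name(name):
    """Move pack/size tokens into structured fields and return a cleaned name."""
    fields = {"pack_qty": None, "size_value": None, "size_unit": None}
    pack = PACK_RE.search(name)
    if pack:
        fields["pack_qty"] = int(pack.group(1))
        name = PACK_RE.sub(" ", name, count=1)
    size = SIZE_RE.search(name)
    if size:
        fields["size_value"] = float(size.group(1))
        fields["size_unit"] = size.group(2).lower()
        name = SIZE_RE.sub(" ", name, count=1)
    # Drop leftover punctuation, then collapse whitespace.
    name = re.sub(r"[^\w\s]", " ", name)
    fields["canonical_name"] = " ".join(name.split()).lower()
    return fields
```

Only the first pack token and the first size token are extracted; anything the patterns do not recognise stays in the name, which errs toward leaving noise for review rather than guessing.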
** evidence
- commit:
- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python build_purchases.py`; `./venv/bin/python review_products.py --refresh-only`; verified weaker exact-name cases now remain unresolved in `combined_output/review_queue.csv` and canonical names are cleaned before auto-catalog creation
- date: 2026-03-17
** notes
- Removed weak exact-name auto-canonical creation so ambiguous products stay in review instead of generating junk canonicals.
- Canonical display names are now cleaned of obvious punctuation and packaging noise, but I kept the cleanup conservative rather than adding a broad fuzzy merge layer.
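The review-first routing described in these notes might look roughly like the sketch below. It assumes an existing mapping from `(retailer, retailer_item_id)` to a canonical id; that mapping, the `route_item` helper, and the row field names are hypothetical and are not taken from `build_purchases.py` or `review_products.py`.

```python
def route_item(item, known_links):
    """Return ("linked", canonical_id) or ("review", None) for one observed item."""
    key = (item["retailer"], item["retailer_item_id"])
    canonical_id = known_links.get(key)
    if canonical_id is not None:
        return ("linked", canonical_id)
    # A matching normalized name alone is treated as too weak to create a
    # new canonical, so the item is left for human review.
    return ("review", None)
```

The point of the shape is that there is no third branch that creates a canonical automatically: anything without an explicit link ends up in the review queue.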
* [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved products (2-4 commits)
** acceptance criteria
- llm suggestions are generated only for unresolved observed products