diff --git a/pm/tasks.org b/pm/tasks.org
index 310ea2f..5bd48ac 100644
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -927,7 +927,7 @@ beef patty by weight not made into effective price
 - Giant loose-weight rows already had deterministic `picked_weight` and `price_per_lb`; this task reuses that basis when parsed size/pack is absent.
 - Parsed package size still wins when present, so fixed-size products keep their original comparison basis and `normalized_item_id` behavior does not change.
-* [x] t1.18.3: fix costco normalization quantity carry-through for weight-based items (1-3 commits)
+* [X] t1.18.3: fix costco normalization quantity carry-through for weight-based items (1-3 commits)
 ** acceptance criteria
 1. add regression tests covering known broken Costco quantity-basis cases before changing parser logic
 2. Costco normalization correctly parses explicit weight-bearing package text into normalized quantity fields for known cases such as:
@@ -962,6 +962,104 @@ Costco 25# FLOUR not parsed into normalized weight - measure_type says each
 - Costco `25#` weight text was falling through to `each` because the hash-size parser missed sizes followed by whitespace.
 - This fix is intentionally narrow: explicit `#`-weight parsing now feeds the existing quantity and effective-price flow without changing `normalized_item_id` behavior.
+* [x] t1.18.4: clean purchases output and finalize effective price fields (2-4 commits)
+make `purchases.csv` easier to inspect and ensure price fields support weighted cost analysis
+
+** acceptance criteria
+1. reorder `data/purchases.csv` columns for human inspection, with analysis fields first:
+ - `purchase_date`
+ - `retailer`
+ - `catalog_name`
+ - `product_type`
+ - `category`
+ - `net_line_total`
+ - `normalized_quantity`
+ - `effective_price`
+ - `effective_price_unit`
+ - followed by order/item/provenance fields
+2. populate `net_line_total` for all purchase rows:
+ - preserve existing `net_line_total` when already populated;
+ - otherwise, derive `net_line_total = line_total + matched_discount_amount` when a matched discount exists;
+ - else `net_line_total = line_total`
+3. compute `effective_price` from `net_line_total / normalized_quantity` when `normalized_quantity > 0`
+4. add `effective_price_unit` and populate it consistently from the normalized quantity basis
+5. preserve blanks rather than writing `0` or a divide-by-zero result when no valid denominator exists
+- pm note: this task is about final purchase output correctness and usability, not review/catalog logic
+
+** evidence
+- commit: `a45522c` `Finalize purchase effective price fields`
+- tests: `./venv/bin/python -m unittest tests.test_purchases`; `./venv/bin/python build_purchases.py`
+- datetime: 2026-03-23 15:27:42 EDT
+
+** notes
+- `purchases.csv` now carries a filled `net_line_total` for every row, preserving existing values from normalization and deriving the rest from `line_total` plus matched discounts.
+- `effective_price_unit` now mirrors the normalized quantity basis, so downstream analysis can tell whether an `effective_price` is per `lb`, `oz`, `count`, or `each`.
+
+* [ ] t1.19: make review_products.py robust to orphaned and incomplete catalog links (2-4 commits)
+refresh review state from the current normalized universe so missing or broken links re-enter review instead of silently disappearing
+
+** acceptance criteria
+1. `review_products.py` regenerates review candidates from the current normalized item universe (`data/normalized_items.csv`), not just previously queued items
+2. items are added or re-added to review when:
+ - they have no valid `catalog_id`
+ - their linked `catalog_id` no longer exists
+ - their linked catalog row does not have both `catalog_name` AND `product_type`
+3. `review_products.py` compares and reconciles:
+ - current normalized items
+ - current product_links
+ - current catalog
+ - current review_queue
+4. rerunning review after manual cleanup of `product_links.csv` or `catalog.csv` surfaces newly orphaned normalized items
+5. unresolved items remain visible and are not silently dropped from review or purchases accounting
+- pm note: keep the logic explicit and auditable; this is a refresh/reconciliation task, not a new matching system
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+
+* [ ] t1.20: add visit-level fields and outputs for spend analysis (2-4 commits)
+ensure purchases retains enough visit/order context to support spend-by-visit and store-level analysis
+
+** acceptance criteria
+1. `data/purchases.csv` retains or adds the visit/order fields needed for visit analysis:
+ - `order_id`
+ - `purchase_date`
+ - `store_name`
+ - `store_number`
+ - `store_city`
+ - `store_state`
+ - `retailer`
+2. purchases output supports these analyses without additional joins:
+ - spend by visit
+ - items per visit
+ - category spend by visit
+ - retailer/store breakdown
+3. documentation or task notes make clear that `purchases.csv` is the primary analysis artifact for both item-level and visit-level reporting
+- pm note: do not build dash/plotly here; this task is only about carrying the right data through
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+
+* [ ] t1.21: add lightweight charting/analysis surface on top of purchases.csv (2-4 commits)
+build a minimal analysis layer for common price and visit charts without changing the csv pipeline
+
+** acceptance criteria
+1. support charting of:
+ - item price over time
+ - spend by visit
+ - items per visit
+ - category spend over time
+ - retailer/store comparison
+2. use `data/purchases.csv` as the source of truth
+3. keep excel/pivot compatibility intact
+- pm note: thin reader layer only; do not move business logic out of the pipeline
 * [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)
 ** acceptance criteria
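The t1.18.4 price rules above (preserve an existing `net_line_total`, else derive it from `line_total` and `matched_discount_amount`; compute `effective_price = net_line_total / normalized_quantity` only when the denominator is positive; otherwise leave blanks) can be sketched as a per-row helper. This is a hedged illustration, not the pipeline's actual code: `finalize_price_fields`, the `normalized_unit` source column, and the decimal formatting are assumptions.

```python
def finalize_price_fields(row):
    """Apply the t1.18.4 price rules to one purchases.csv row (dict of
    strings). Sketch only: field names come from the task text, but the
    parsing and formatting choices are assumptions."""
    def num(value):
        try:
            return float(value)
        except (TypeError, ValueError):
            return None  # blank or malformed cells stay non-numeric

    line_total = num(row.get("line_total"))
    discount = num(row.get("matched_discount_amount"))

    # preserve an existing net_line_total; otherwise derive it
    net = num(row.get("net_line_total"))
    if net is None and line_total is not None:
        net = line_total + discount if discount is not None else line_total
    row["net_line_total"] = "" if net is None else f"{net:.2f}"

    # effective_price = net_line_total / normalized_quantity, only when a
    # valid positive denominator exists; otherwise preserve blanks
    qty = num(row.get("normalized_quantity"))
    if net is not None and qty is not None and qty > 0:
        row["effective_price"] = f"{net / qty:.4f}"
        # hypothetical column: the unit is assumed to travel alongside the
        # normalized quantity basis (lb, oz, count, each)
        row["effective_price_unit"] = row.get("normalized_unit") or "each"
    else:
        row["effective_price"] = ""
        row["effective_price_unit"] = ""
    return row
```

Writing `""` rather than `0` in the no-denominator branch is what keeps acceptance criterion 5 honest: a blank is distinguishable from a true zero price downstream.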
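The t1.19 reconciliation criteria (flag normalized items with no valid `catalog_id`, a `catalog_id` that no longer exists, or a linked catalog row missing `catalog_name` or `product_type`) amount to a dictionary comparison across the three inputs. A minimal sketch, assuming the csv rows are already loaded as lists of dicts; `needs_review` and the exact link-row column names are hypothetical, not the repo's schema.

```python
def needs_review(normalized_items, product_links, catalog):
    """Return normalized item ids that should (re)enter review under the
    t1.19 criteria. Explicit and auditable on purpose: one pass, one
    reason check per item, no fuzzy matching."""
    link_by_item = {l["normalized_item_id"]: l.get("catalog_id", "")
                    for l in product_links}
    catalog_by_id = {c["catalog_id"]: c for c in catalog}

    flagged = []
    for item in normalized_items:
        catalog_id = link_by_item.get(item["normalized_item_id"], "")
        row = catalog_by_id.get(catalog_id)
        if (not catalog_id                        # no valid catalog_id
                or row is None                    # link points nowhere
                or not row.get("catalog_name")    # incomplete catalog row
                or not row.get("product_type")):
            flagged.append(item["normalized_item_id"])
    return flagged
```

Because the scan starts from the full normalized universe rather than the existing queue, rerunning it after manual edits to `product_links.csv` or `catalog.csv` naturally resurfaces newly orphaned items (acceptance criterion 4).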
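The visit-level fields in t1.20 exist so that spend-by-visit and items-per-visit fall out of a plain group-by over `purchases.csv`, with no extra joins. A sketch under the assumption that (`order_id`, `purchase_date`, `retailer`) identifies a visit; `spend_by_visit` is a hypothetical helper, not project code.

```python
from collections import defaultdict

def spend_by_visit(purchase_rows):
    """Aggregate purchases.csv rows (dicts of strings) into per-visit
    spend and item counts, keyed by the assumed visit identity."""
    totals = defaultdict(lambda: {"spend": 0.0, "items": 0})
    for row in purchase_rows:
        key = (row["order_id"], row["purchase_date"], row["retailer"])
        try:
            totals[key]["spend"] += float(row["net_line_total"])
        except (ValueError, KeyError):
            pass  # blank net_line_total rows stay visible but add nothing
        totals[key]["items"] += 1
    return dict(totals)
```

The same shape extends to category spend by visit (add `category` to the key) or retailer/store breakdowns (key on `store_name`/`store_number`), which is why t1.21 can stay a thin reader layer.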