From d78230f1c6e476fd6d0cab93842f9902c925ec7c Mon Sep 17 00:00:00 2001
From: ben
Date: Mon, 23 Mar 2026 13:56:56 -0400
Subject: [PATCH] Record t1.18.3 task evidence

---
 pm/tasks.org | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/pm/tasks.org b/pm/tasks.org
index 31b6efa..310ea2f 100644
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -888,7 +888,7 @@ correct purchases/effective price logic for the known broken cases using existin
 - The implemented precedence is: use non-zero `net_line_total` when present, otherwise `line_total`; divide by `normalized_quantity` when that denominator is > 0; otherwise leave blank.
 - This keeps the calculation conservative for mixed-quality data: Costco bananas and ice now compute correctly, while rows like Giant patties with no quantity basis stay blank instead of producing `0` or a divide-by-zero artifact.
-* [x] t1.18.2: fix giant normalization quantity carry-through for weight-based items (1-3 commits)
+* [X] t1.18.2: fix giant normalization quantity carry-through for weight-based items (1-3 commits)
 ensure giant normalization emits usable normalized quantity for known weight-based cases
 ** acceptance criteria
@@ -926,7 +926,19 @@ beef patty by weight not made into effective price
 ** notes
 - Giant loose-weight rows already had deterministic `picked_weight` and `price_per_lb`; this task reuses that basis when parsed size/pack is absent.
 - Parsed package size still wins when present, so fixed-size products keep their original comparison basis and `normalized_item_id` behavior does not change.
-* [ ] t1.18.3: fix costco normalization quantity carry-through for weight-based items (1-3 commits)
+
+* [x] t1.18.3: fix costco normalization quantity carry-through for weight-based items (1-3 commits)
+** acceptance criteria
+1. add regression tests covering known broken Costco quantity-basis cases before changing parser logic
+2. Costco normalization correctly parses explicit weight-bearing package text into normalized quantity fields for known cases such as:
+   - `25# FLOUR ALL-PURPOSE HARV ...` -> `normalized_quantity=25`, `normalized_quantity_unit=lb`, `measure_type=weight`
+3. corrected Costco normalized rows carry through to `data/purchases.csv` without changing `normalized_item_id` behavior
+4. `effective_price` for corrected Costco rows uses the same rule already established for Giant:
+   - use `net_line_total` when present, otherwise `line_total`
+   - divide by `normalized_quantity` when `normalized_quantity > 0`
+   - leave blank when no valid denominator exists
+5. rerun output verifies the broken Costco flour examples no longer behave like `each` items and now produce non-blank weight-based effective prices
+6. keep this task limited to the identified Costco parsing failures; do not broaden into catalog cleanup or fuzzy matching
 *** All Purpose Flour
 Costco 25# FLOUR not parsed into normalized weight - meaure_type says each
@@ -941,6 +953,15 @@ Costco 25# FLOUR not parsed into normalized weight - meaure_type says each
 | 1/31/2026 | giant | SB FLOUR ALL PRPSE 5LB | all purpose flour | 1 | EA | 5 | lb | | 5 | lb | weight | 3.39 | 3.39 | | | VA | 3.39 | line_total_over_qty | | | 0.678 | parsed_size_lb | 0.0424 | parsed_size_lb_to_oz | 0.678 | FALSE | FALSE | FALSE | data/giant-web/raw/697f42031c28e23df08d95f9.json | |
 | 3/12/2026 | costco | 25# FLOUR ALL-PURPOSE HARV P98/100 | all purpose flour | 1 | E | 1 | each | | | | each | 9.49 | 9.49 | | 9.49 | VA | 9.49 | line_total_over_qty | | | | | | | 9.49 | FALSE | FALSE | FALSE | data/costco-web/raw/21111500804012603121616-2026-03-12T16-16-00.json | |
+** evidence
+- commit: `7317611` `Fix Costco hash-size weight parsing`
+- tests: `./venv/bin/python -m unittest tests.test_costco_pipeline tests.test_purchases`; `./venv/bin/python normalize_costco_web.py`; `./venv/bin/python build_purchases.py`
+- datetime: 2026-03-23 13:56:38 EDT
+
+** notes
+- Costco `25#` weight text was falling through to `each` because the hash-size parser missed sizes followed by whitespace.
+- This fix is intentionally narrow: explicit `#`-weight parsing now feeds the existing quantity and effective-price flow without changing `normalized_item_id` behavior.
+
 * [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)
 ** acceptance criteria
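
Review note, not part of the patch: this patch only records task evidence, so the parser change from commit `7317611` is not visible here. The sketch below illustrates the two rules the t1.18.3 acceptance criteria describe: reading a leading `25#` token as pounds, and computing `effective_price` with the `net_line_total` / `line_total` precedence. The function names `parse_hash_weight` and `effective_price` and the regular expression are assumptions made for illustration; the real code in `normalize_costco_web.py` and `build_purchases.py` may be structured differently.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a number immediately followed by '#', where the '#'
# is followed by whitespace or the end of the string (e.g. "25# FLOUR").
HASH_WEIGHT_RE = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)#(?=\s|$)")


def parse_hash_weight(description: str) -> Optional[Tuple[float, str, str]]:
    """Return (normalized_quantity, normalized_quantity_unit, measure_type)
    for explicit '#'-weight text, or None when no such token is present."""
    match = HASH_WEIGHT_RE.search(description)
    if match is None:
        return None
    return float(match.group(1)), "lb", "weight"


def effective_price(
    net_line_total: Optional[float],
    line_total: Optional[float],
    normalized_quantity: Optional[float],
) -> Optional[float]:
    """Apply the precedence recorded in the task notes: prefer a non-zero
    net_line_total, fall back to line_total, and leave the result blank
    (None) when there is no positive quantity to divide by."""
    total = net_line_total if net_line_total not in (None, 0) else line_total
    if total is None or normalized_quantity is None or normalized_quantity <= 0:
        return None
    return total / normalized_quantity


if __name__ == "__main__":
    parsed = parse_hash_weight("25# FLOUR ALL-PURPOSE HARV P98/100")
    print(parsed)
    if parsed is not None:
        print(effective_price(9.49, 9.49, parsed[0]))
```

Under these assumptions, the Costco flour row from the table would get a 25 lb quantity basis and a per-pound price of roughly 0.38, while a row with no usable quantity still yields a blank rather than `0` or a divide-by-zero error.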