Record t1.18.3 task evidence

ben
2026-03-23 13:56:56 -04:00
parent 73176117fe
commit d78230f1c6

@@ -888,7 +888,7 @@ correct purchases/effective price logic for the known broken cases using existin
- The implemented precedence is: use non-zero `net_line_total` when present, otherwise `line_total`; divide by `normalized_quantity` when that denominator is > 0; otherwise leave blank.
- This keeps the calculation conservative for mixed-quality data: Costco bananas and ice now compute correctly, while rows like Giant patties with no quantity basis stay blank instead of producing `0` or a divide-by-zero artifact.
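A minimal sketch of the precedence described in these notes, assuming rows arrive as dictionaries of CSV strings. The function name `effective_price` and the helper `_to_decimal` are hypothetical and are not taken from `build_purchases.py`; only the column names come from the notes above.

```python
from decimal import Decimal, InvalidOperation
from typing import Mapping, Optional


def _to_decimal(value: object) -> Optional[Decimal]:
    """Parse a CSV cell into a Decimal; blank, invalid, or non-finite text gives None."""
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    try:
        number = Decimal(text)
    except InvalidOperation:
        return None
    return number if number.is_finite() else None


def effective_price(row: Mapping[str, object]) -> Optional[Decimal]:
    """Return total / normalized_quantity, or None when the cell should stay blank."""
    net = _to_decimal(row.get("net_line_total"))
    # Prefer a non-zero net_line_total; otherwise fall back to line_total.
    if net is not None and net != 0:
        total = net
    else:
        total = _to_decimal(row.get("line_total"))
    quantity = _to_decimal(row.get("normalized_quantity"))
    # No usable numerator or denominator: leave blank rather than emit 0.
    if total is None or quantity is None or quantity <= 0:
        return None
    return total / quantity
```

Returning `None` instead of `0` keeps rows without a quantity basis distinguishable from rows that genuinely cost nothing per unit.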
* [x] t1.18.2: fix giant normalization quantity carry-through for weight-based items (1-3 commits)
* [X] t1.18.2: fix giant normalization quantity carry-through for weight-based items (1-3 commits)
ensure giant normalization emits usable normalized quantity for known weight-based cases
** acceptance criteria
@@ -926,7 +926,19 @@ beef patty by weight not made into effective price
** notes
- Giant loose-weight rows already had deterministic `picked_weight` and `price_per_lb`; this task reuses that basis when parsed size/pack is absent.
- Parsed package size still wins when present, so fixed-size products keep their original comparison basis and `normalized_item_id` behavior does not change.
* [ ] t1.18.3: fix costco normalization quantity carry-through for weight-based items (1-3 commits)
* [x] t1.18.3: fix costco normalization quantity carry-through for weight-based items (1-3 commits)
** acceptance criteria
1. add regression tests covering known broken Costco quantity-basis cases before changing parser logic
2. Costco normalization correctly parses explicit weight-bearing package text into normalized quantity fields for known cases such as:
- `25# FLOUR ALL-PURPOSE HARV ...` -> `normalized_quantity=25`, `normalized_quantity_unit=lb`, `measure_type=weight`
3. corrected Costco normalized rows carry through to `data/purchases.csv` without changing `normalized_item_id` behavior
4. `effective_price` for corrected Costco rows uses the same rule already established for Giant:
- use `net_line_total` when present, otherwise `line_total`
- divide by `normalized_quantity` when `normalized_quantity > 0`
- leave blank when no valid denominator exists
5. rerun output verifies the broken Costco flour examples no longer behave like `each` items and now produce non-blank weight-based effective prices
6. keep this task limited to the identified Costco parsing failures; do not broaden into catalog cleanup or fuzzy matching
*** All Purpose Flour
Costco 25# FLOUR not parsed into normalized weight - measure_type says each
@@ -941,6 +953,15 @@ Costco 25# FLOUR not parsed into normalized weight - measure_type says each
| 1/31/2026 | giant | SB FLOUR ALL PRPSE 5LB | all purpose flour | 1 | EA | 5 | lb | | 5 | lb | weight | 3.39 | 3.39 | | | VA | 3.39 | line_total_over_qty | | | 0.678 | parsed_size_lb | 0.0424 | parsed_size_lb_to_oz | 0.678 | FALSE | FALSE | FALSE | data/giant-web/raw/697f42031c28e23df08d95f9.json | |
| 3/12/2026 | costco | 25# FLOUR ALL-PURPOSE HARV P98/100 | all purpose flour | 1 | E | 1 | each | | | | each | 9.49 | 9.49 | | 9.49 | VA | 9.49 | line_total_over_qty | | | | | | | 9.49 | FALSE | FALSE | FALSE | data/costco-web/raw/21111500804012603121616-2026-03-12T16-16-00.json | |
** evidence
- commit: `7317611` `Fix Costco hash-size weight parsing`
- tests: `./venv/bin/python -m unittest tests.test_costco_pipeline tests.test_purchases`; `./venv/bin/python normalize_costco_web.py`; `./venv/bin/python build_purchases.py`
- datetime: 2026-03-23 13:56:38 EDT
** notes
- Costco `25#` weight text was falling through to `each` because the hash-size parser missed sizes followed by whitespace.
- This fix is intentionally narrow: explicit `#`-weight parsing now feeds the existing quantity and effective-price flow without changing `normalized_item_id` behavior.
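One way the `#`-weight rule could look, shown only as an illustration. The pattern `HASH_WEIGHT_RE` and the function `parse_hash_weight` are hypothetical names and are not copied from `normalize_costco_web.py`; the sketch assumes `#` after a number always means pounds, which matches the `25# FLOUR` case but may not hold for every Costco description.

```python
import re
from decimal import Decimal
from typing import Dict, Optional

# A number immediately followed by '#', e.g. "25#" or "2.5#".
# The lookbehind avoids matching inside a longer token such as "P98";
# the lookahead only rejects a following digit, so "25# FLOUR" (hash
# followed by whitespace) is accepted.
HASH_WEIGHT_RE = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)#(?!\d)")


def parse_hash_weight(description: str) -> Optional[Dict[str, object]]:
    """Return normalized weight fields for '<number>#' text, or None if absent."""
    match = HASH_WEIGHT_RE.search(description)
    if match is None:
        return None
    quantity = Decimal(match.group(1))
    if quantity <= 0:
        return None
    return {
        "normalized_quantity": quantity,
        "normalized_quantity_unit": "lb",
        "measure_type": "weight",
    }
```

Descriptions without a `#` size return `None`, so they would keep flowing through whatever size parsing already applies to them.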
* [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)