added effective_price and testing to id upstream data
106
pm/tasks.org
@@ -763,7 +763,7 @@ enable fast lookup of catalog items during review via tokenized search and repla
- Search intentionally optimizes for manual speed rather than smart ranking: simple token overlap, max 10 rows, and immediate persistence on selection.
- Follow-up fix: search moved to `[f]ind` so `[s]kip` remains available at the main prompt.

* [x] t1.17: fix normalized quantity derivation and carry it through purchases (2-4 commits)
* [X] t1.17: fix normalized quantity derivation and carry it through purchases (2-4 commits)
correct and document deterministic normalized quantity fields so unit-cost analysis works across package sizes

** Acceptance Criteria
@@ -803,7 +803,109 @@ correct and document deterministic normalized quantity fields so unit-cost analy
- The missing purchases fields were a carry-through bug: normalization had `normalized_quantity` and `normalized_quantity_unit`, but `build_purchases.py` never wrote them into `data/review/purchases.csv`.
- Normalized quantity now prefers explicit package basis over `each`, so rows like `PEPSI 6PK 7.5Z` resolve to `90 oz` and `KS ALMND BAR US 1.74QTS` purchased twice resolves to `3.48 qt`.
- The derivation stays conservative and does not convert units during normalization; parsed units such as `oz`, `lb`, `qt`, and `count` are preserved as-is.
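A rough sketch of the derivation described in these notes: purchase quantity times an explicit package basis, with the parsed unit kept unchanged. The function name `derive_normalized_quantity` and its parameters are hypothetical and are not taken from `build_purchases.py` or the normalizers; the real code may be structured differently.

```python
from decimal import Decimal
from typing import Optional, Tuple


def derive_normalized_quantity(
    qty: Optional[str],
    pack_qty: Optional[str],
    size_value: Optional[str],
    size_unit: Optional[str],
) -> Tuple[str, str]:
    """Hypothetical helper: return (normalized_quantity, normalized_quantity_unit)."""
    if not qty:
        return "", ""  # nothing to derive from
    quantity = Decimal(qty)
    if size_value and size_unit:
        # explicit package basis wins over `each`; the unit is kept as parsed
        packs = Decimal(pack_qty) if pack_qty else Decimal(1)
        total = quantity * packs * Decimal(size_value)
        return format(total.normalize(), "f"), size_unit
    # no package basis: fall back to a plain `each` count
    return format(quantity.normalize(), "f"), "each"
```

Under these assumptions, `derive_normalized_quantity("2", None, "1.74", "qt")` gives `("3.48", "qt")`, matching the almond bar case, and no unit conversion happens anywhere in the helper.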
* [ ] 1t.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)
* [ ] t1.18: add regression tests for known quantity/price failures (1-2 commits)
capture the currently broken comparison cases before changing normalization or purchases logic

** acceptance criteria
1. when generating `data/purchases.csv`, add `effective_price` = `effective_total` / `normalized_quantity`
2. define `effective_price` behavior explicitly from the covered cases:
   - use `net_line_total` when present and non-zero, else use `line_total`
   - divide by `normalized_quantity` when `normalized_quantity > 0`
   - leave blank when no valid denominator exists
   - never emit `0` or divide-by-zero for missing-basis cases
   - `effective_price` is only comparable within the same `normalized_quantity_unit` unless later analysis converts the units
3. ensure the new tests assert the intended `effective_price` behavior for the known banana, ice, and beef patty examples
4. add tests covering known broken cases:
   - giant bananas produce non-blank effective price
   - giant bagged ice produces non-zero effective price
   - costco bananas retain correct effective price
   - beef patty comparison rows preserve expected quantity basis behavior
5. tests fail against current broken behavior and document the expected outcome
6. include at least one assertion that effective_price is blank rather than `0` or divide-by-zero when no denominator exists
7. pm note: this task should only add tests/fixtures and not change business logic
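The `effective_price` rules in the criteria above could be sketched roughly as follows. This is an illustration, not the project's implementation: the helper names `_to_decimal` and `effective_price`, the dict-per-row input (as from `csv.DictReader`), and the rounding to four decimal places are all assumptions.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def _to_decimal(value: Optional[str]) -> Optional[Decimal]:
    """Parse a CSV cell; blank, unparseable, or non-finite cells become None."""
    if value is None or not value.strip():
        return None
    try:
        parsed = Decimal(value.strip())
    except InvalidOperation:
        return None  # e.g. a spreadsheet error marker such as "#DIV/0!"
    return parsed if parsed.is_finite() else None


def effective_price(row: dict) -> str:
    """Hypothetical helper: effective_price as a CSV cell, or "" when undefined."""
    net = _to_decimal(row.get("net_line_total"))
    gross = _to_decimal(row.get("line_total"))
    # numerator: net_line_total when present and non-zero, else line_total
    numerator = net if net is not None and net != 0 else gross
    denominator = _to_decimal(row.get("normalized_quantity"))
    if numerator is None or denominator is None or denominator <= 0:
        return ""  # blank, never 0 and never a divide-by-zero marker
    # rounding to 4 places is an assumption, not a stated requirement
    return str((numerator / denominator).quantize(Decimal("0.0001")))
```

A regression test can then compare plain strings: a row with `line_total` 9.98 and `normalized_quantity` 40 yields `"0.2495"`, and a row with no `normalized_quantity` yields `""`.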
** pm identified problems
we have a few problems to scope. looks like:
1. normalize_giant_web not always propagating weight data to price_per
2. effective_price calc needs more robust matching algo (my excel hack is clearly not enough)
```
catalog_name banana
Average of effective_price Column Labels
Row Labels 8/6/2024 12/6/2024 12/12/2024 1/7/2025 1/24/2025 2/16/2025 2/20/2025 6/25/2025 2/14/2026 3/12/2026 Grand Total
Jan #DIV/0! 0.496666667 #DIV/0!
Feb #DIV/0! #DIV/0! 0.496666667 #DIV/0!
Mar 0.496666667 0.496666667
Jun #DIV/0! #DIV/0!
Aug 0.496666667 0.496666667
Dec #DIV/0! #DIV/0! #DIV/0!
Grand Total 0.496666667 #DIV/0! #DIV/0! #DIV/0! 0.496666667 #DIV/0! #DIV/0! #DIV/0! 0.496666667 0.496666667 #DIV/0!

purchase_date retailer normalized_item_name catalog_name category product_type qty unit normalized_quantity normalized_quantity_unit pack_qty size_value size_unit measure_type line_total unit_price net_line_total price_per_each price_per_each_basis price_per_count price_per_count_basis price_per_lb price_per_lb_basis price_per_oz price_per_oz_basis effective_price
8/6/2024 costco BANANA banana produce banana 1 E 3 lb 3 lb weight 1.49 1.49 1.49 1.49 line_total_over_qty 0.4967 parsed_size_lb 0.031 parsed_size_lb_to_oz 0.496666667
12/6/2024 giant BANANA banana produce banana 1 LB weight 0.99 0.99 0.99 line_total_over_qty 0.5893 picked_weight_lb 0.0368 picked_weight_lb_to_oz #DIV/0!
12/12/2024 giant BANANA banana produce banana 1 LB weight 1.37 1.37 1.37 line_total_over_qty 0.5905 picked_weight_lb 0.0369 picked_weight_lb_to_oz #DIV/0!
1/7/2025 giant BANANA banana produce banana 1 LB weight 1.44 1.44 1.44 line_total_over_qty 0.5902 picked_weight_lb 0.0369 picked_weight_lb_to_oz #DIV/0!
1/24/2025 costco BANANA banana produce banana 1 E 3 lb 3 lb weight 1.49 1.49 1.49 1.49 line_total_over_qty 0.4967 parsed_size_lb 0.031 parsed_size_lb_to_oz 0.496666667
2/16/2025 giant BANANA banana produce banana 2 LB weight 2.54 1.27 1.27 line_total_over_qty 0.588 picked_weight_lb 0.0367 picked_weight_lb_to_oz #DIV/0!
2/20/2025 giant BANANA banana produce banana 1 LB weight 1.4 1.4 1.4 line_total_over_qty 0.5907 picked_weight_lb 0.0369 picked_weight_lb_to_oz #DIV/0!
6/25/2025 giant BANANA banana produce banana 1 LB weight 1.29 1.29 1.29 line_total_over_qty 0.589 picked_weight_lb 0.0368 picked_weight_lb_to_oz #DIV/0!
2/14/2026 costco BANANA banana produce banana 1 E 3 lb 3 lb weight 1.49 1.49 1.49 1.49 line_total_over_qty 0.4967 parsed_size_lb 0.031 parsed_size_lb_to_oz 0.496666667
3/12/2026 costco BANANA banana produce banana 2 E 6 lb 3 lb weight 2.98 1.49 2.98 1.49 line_total_over_qty 0.4967 parsed_size_lb 0.031 parsed_size_lb_to_oz 0.496666667

purchase_date retailer normalized_item_name catalog_name category product_type qty unit normalized_quantity normalized_quantity_unit pack_qty size_value size_unit measure_type line_total unit_price net_line_total price_per_each price_per_each_basis price_per_count price_per_count_basis price_per_lb price_per_lb_basis price_per_oz price_per_oz_basis effective_price
9/9/2023 costco BEEF PATTIES 6# BAG beef patty meat hamburger 1 E 1 each each 26.99 26.99 26.99 26.99 line_total_over_qty 26.99
11/26/2025 giant 80% PATTIES PK12 beef patty meat hamburger 1 LB weight 10.05 10.05 10.05 line_total_over_qty 7.7907 picked_weight_lb 0.4869 picked_weight_lb_to_oz #DIV/0!

purchase_date retailer normalized_item_name catalog_name category product_type qty unit normalized_quantity normalized_quantity_unit pack_qty size_value size_unit measure_type line_total unit_price net_line_total price_per_each price_per_each_basis price_per_count price_per_count_basis price_per_lb price_per_lb_basis price_per_oz price_per_oz_basis effective_price
5/26/2025 giant BAGGED ICE bagged ice cubes frozen ice 2 EA 40 lb 20 lb weight 9.98 4.99 4.99 line_total_over_qty 0.2495 parsed_size_lb 0.0156 parsed_size_lb_to_oz 0
6/12/2025 giant BAG ICE CUBED bagged ice cubes frozen ice 1 EA 10 lb 10 lb weight 3.49 3.49 3.49 line_total_over_qty 0.349 parsed_size_lb 0.0218 parsed_size_lb_to_oz 0
9/13/2025 giant BAGGED ICE bagged ice cubes frozen ice 2 EA 20 lb 10 lb weight 6.98 3.49 3.49 line_total_over_qty 0.349 parsed_size_lb 0.0218 parsed_size_lb_to_oz 0
10/10/2025 giant BAGGED ICE bagged ice cubes frozen ice 1 EA 20 lb 20 lb weight 4.99 4.99 4.99 line_total_over_qty 0.2495 parsed_size_lb 0.0156 parsed_size_lb_to_oz 0
```
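In the pasted rows, the giant weight items show `#DIV/0!` even though `price_per_lb` is populated, which suggests a weight was known upstream but never reached `normalized_quantity`. As a diagnostic cross-check only, the implied weight can be backed out of `line_total / price_per_lb`. The helper name `implied_weight_lb` and the two-decimal rounding are assumptions; the input values are copied from the rows in the block.

```python
from decimal import Decimal


def implied_weight_lb(line_total: str, price_per_lb: str) -> Decimal:
    """Back out the weight implied by line_total / price_per_lb (diagnostic only)."""
    return (Decimal(line_total) / Decimal(price_per_lb)).quantize(Decimal("0.01"))


# (purchase_date, line_total, price_per_lb) copied from the giant rows
giant_rows = [
    ("12/6/2024", "0.99", "0.5893"),
    ("12/12/2024", "1.37", "0.5905"),
    ("11/26/2025", "10.05", "7.7907"),
]

for purchase_date, line_total, price_per_lb in giant_rows:
    print(purchase_date, implied_weight_lb(line_total, price_per_lb), "lb")
```

For the 12/6/2024 bananas this works out to about 1.68 lb, so once that weight is carried into `normalized_quantity` the effective price should land near the existing `price_per_lb` of 0.5893. Carrying the original picked weight through is preferable to this back-computation, which only inherits the rounding of `price_per_lb`.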
** evidence
- commit:
- tests:
- datetime:

** notes

* [ ] t1.18.1: fix effective price calculation precedence and blank handling (1-3 commits)
correct purchases/effective price logic for the known broken cases using existing normalized fields

** acceptance criteria
1. effective_price uses explicit numerator precedence:
   - prefer `net_line_total`
   - fallback to `line_total`
2. effective_price uses `normalized_quantity` when present and > 0
3. effective_price is blank when no valid denominator exists
4. effective_price is never written as `0` or divide-by-zero for missing-basis cases
5. existing regression tests for bananas and ice pass
- pm note: keep this limited to calculation logic; do not broaden into catalog or review changes

** evidence
- commit:
- tests:
- datetime:

** notes


* [ ] t1.18.2: fix giant normalization quantity carry-through for weight-based items (1-3 commits)
ensure giant normalization emits usable normalized quantity for known weight-based cases

** acceptance criteria
1. giant bananas populate normalized quantity and unit from deterministic weight basis
2. giant weight-based items that already produce `price_per_lb` also carry enough quantity basis for effective price calculation where supported
3. existing regression tests pass without changing normalized_item_id behavior
4. blanks are preserved only when no deterministic quantity basis exists
- pm note: this task is about normalization carry-through, not fuzzy matching or catalog cleanup
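One possible shape for the carry-through in these criteria, as a sketch under stated assumptions: the function name `giant_quantity_basis` and the `picked_weight` field are hypothetical (the notes only show that a `picked_weight_lb` basis already feeds `price_per_lb`), and the actual normalizer may expose the weight under a different name.

```python
from decimal import Decimal
from typing import Tuple


def giant_quantity_basis(row: dict) -> Tuple[str, str]:
    """Hypothetical: pick (normalized_quantity, normalized_quantity_unit) for a Giant row."""
    picked = (row.get("picked_weight") or "").strip()  # assumed upstream field name
    if row.get("measure_type") == "weight" and picked:
        # the same weight that already feeds price_per_lb becomes the quantity basis
        return picked, "lb"
    qty = (row.get("qty") or "").strip()
    size_value = (row.get("size_value") or "").strip()
    size_unit = (row.get("size_unit") or "").strip()
    if qty and size_value and size_unit:
        # parsed package size: quantity times size, unit kept as parsed
        total = Decimal(qty) * Decimal(size_value)
        return format(total.normalize(), "f"), size_unit
    return "", ""  # no deterministic basis: stay blank
```

The blank return is deliberate: a row with `qty` 1 and unit `LB` but no picked weight does not imply one pound, so guessing would break criterion 4.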

** evidence
- commit:
- tests:
- datetime:

** notes
* [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)

** acceptance criteria
- llm suggestions are generated only for unresolved normalized retailer items