Record Costco task evidence
@@ -129,6 +129,7 @@ One row per retailer line item.
 | `order_id` | retailer order id |
 | `line_no` | stable line number within order export |
 | `order_date` | copied from order when available |
+| `retailer_item_id` | retailer-native item id when available |
 | `pod_id` | retailer pod/item id |
 | `item_name` | raw retailer item name |
 | `upc` | retailer UPC or PLU value |
@@ -145,6 +146,8 @@ One row per retailer line item.
 | `coupon_price` | retailer coupon price field |
 | `image_url` | raw retailer image url when present |
 | `raw_order_path` | relative path to source order payload |
+| `is_discount_line` | retailer adjustment or discount-line flag |
+| `is_coupon_line` | coupon-like line flag when distinguishable |
 
 Primary key:
 
@@ -161,6 +164,7 @@ fields from `items_raw.csv` and add parsed fields.
 | `order_id` | retailer order id |
 | `line_no` | line number within order |
 | `observed_item_key` | stable row key, typically `<retailer>:<order_id>:<line_no>` |
+| `retailer_item_id` | retailer-native item id |
 | `item_name` | raw retailer item name |
 | `item_name_norm` | normalized item name |
 | `brand_guess` | parsed brand guess |
@@ -171,6 +175,8 @@ fields from `items_raw.csv` and add parsed fields.
 | `measure_type` | `each`, `weight`, `volume`, `count`, or blank |
 | `is_store_brand` | store-brand guess |
 | `is_fee` | fee or non-product flag |
+| `is_discount_line` | discount or adjustment-line flag |
+| `is_coupon_line` | coupon-like line flag |
 | `price_per_each` | derived per-each price when supported |
 | `price_per_lb` | derived per-pound price when supported |
 | `price_per_oz` | derived per-ounce price when supported |
@@ -191,6 +197,7 @@ One row per distinct retailer-facing observed product.
 | `observed_product_id` | stable observed product id |
 | `retailer` | retailer slug |
 | `observed_key` | deterministic grouping key used to create the observed product |
+| `representative_retailer_item_id` | best representative retailer-native item id |
 | `representative_upc` | best representative UPC/PLU |
 | `representative_item_name` | representative raw retailer name |
 | `representative_name_norm` | representative normalized name |
@@ -203,11 +210,14 @@ One row per distinct retailer-facing observed product.
 | `representative_image_url` | representative image url |
 | `is_store_brand` | representative store-brand flag |
 | `is_fee` | representative fee flag |
+| `is_discount_line` | representative discount-line flag |
+| `is_coupon_line` | representative coupon-line flag |
 | `first_seen_date` | first order date seen |
 | `last_seen_date` | last order date seen |
 | `times_seen` | number of enriched item rows grouped here |
 | `example_order_id` | one example retailer order id |
 | `example_item_name` | one example raw item name |
+| `distinct_retailer_item_ids_count` | count of distinct retailer-native item ids |
 
 Primary key:
 
@@ -297,4 +307,3 @@ Current scraper outputs map to the new layout as follows:
 Current Giant raw order payloads already expose fields needed for future
 enrichment, including `image`, `itemName`, `primUpcCd`, `lbEachCd`,
 `unitPrice`, `groceryAmount`, and `totalPickedWeight`.
-
32
pm/tasks.org
@@ -143,7 +143,7 @@
 - tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python build_canonical_layer.py`; verified auto-linked `giant_output/products_canonical.csv` and `giant_output/product_links.csv`
 - date: 2026-03-16
 
-* [ ] t1.8: support costco raw ingest path (2-5 commits)
+* [X] t1.8: support costco raw ingest path (2-5 commits)
 
 ** acceptance criteria
 - add a costco-specific raw ingest/export path
@@ -158,11 +158,11 @@
 - bearer/auth values should come from local env, not source
 
 ** evidence
-- commit:
-- tests:
-- date:
+- commit: `da00288` on branch `cx`
+- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python scrape_costco.py --help`; verified `costco_output/raw/*.json`, `costco_output/orders.csv`, and `costco_output/items.csv` from the local sample payload
+- date: 2026-03-16
 
-* [ ] t1.8.1: support costco parser/enricher path (2-4 commits)
+* [X] t1.8.1: support costco parser/enricher path (2-4 commits)
 
 ** acceptance criteria
 - add a costco-specific enrich step producing `costco_output/items_enriched.csv`
@@ -179,10 +179,10 @@
 - expect weaker identifiers than Giant
 
 ** evidence
-- commit:
-- tests:
-- date:
-* [ ] t1.8.2: validate cross-retailer observed/canonical flow (1-3 commits)
+- commit: `da00288` on branch `cx`
+- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python enrich_costco.py`; verified `costco_output/items_enriched.csv`
+- date: 2026-03-16
+* [X] t1.8.2: validate cross-retailer observed/canonical flow (1-3 commits)
 
 ** acceptance criteria
 - feed Giant and Costco enriched rows through the same observed/canonical pipeline
@@ -197,10 +197,10 @@
 - apples, eggs, bananas, or flour are better than weird prepared foods
 
 ** evidence
-- commit:
-- tests:
-- date:
-* [ ] t1.8.3: extend shared schema for retailer-native ids and adjustment lines (1-2 commits)
+- commit: `da00288` on branch `cx`
+- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python validate_cross_retailer_flow.py`; proof example: Giant `FRESH BANANA` and Costco `BANANAS 3 LB / 1.36 KG` share one canonical in `combined_output/proof_examples.csv`
+- date: 2026-03-16
+* [X] t1.8.3: extend shared schema for retailer-native ids and adjustment lines (1-2 commits)
 
 ** acceptance criteria
 - add shared fields needed for non-upc retailers, including:
@@ -215,9 +215,9 @@
 - do this once instead of sprinkling exceptions everywhere
 
 ** evidence
-- commit:
-- tests:
-- date:
+- commit: `9497565` on branch `cx`
+- tests: `./venv/bin/python -m unittest discover -s tests`; verified shared enriched fields in `giant_output/items_enriched.csv` and `costco_output/items_enriched.csv`
+- date: 2026-03-16
 * [ ] t1.9: compute normalized comparison metrics (2-4 commits)
 
 ** acceptance criteria