From d20a131e048c23d3021e938aaeb2b95f5f7fa4e0 Mon Sep 17 00:00:00 2001
From: ben
Date: Mon, 16 Mar 2026 09:04:52 -0400
Subject: [PATCH] updated scope to prep for costco scraper

---
 README.md           | 103 ++++++++++++++++++++++++++++++++++++++++++++
 pm/scrape-giant.org |  78 +++++++++++++++++++++++++++++++--
 pm/tasks.org        |  95 ++++++++++++++++++++++++++++++++--------
 3 files changed, 256 insertions(+), 20 deletions(-)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..f593c0d
--- /dev/null
+++ b/README.md
@@ -0,0 +1,103 @@
+# scrape-giant
+
+Small grocery-history pipeline for Giant receipts.
+
+The project currently does four things:
+
+1. scrape Giant in-store order history from an active Firefox session
+2. enrich raw line items into a deterministic `items_enriched.csv`
+3. aggregate retailer-facing observed products and build a manual review queue
+4. create a first-pass canonical product layer plus conservative auto-links
+
+The work so far is Giant-specific on the ingest side and intentionally simple on
+the shared product-model side.
+
+## Current flow
+
+Run the commands from the repo root with the project venv active, or call them
+directly through `./venv/bin/python`.
+
+```bash
+./venv/bin/python scraper.py
+./venv/bin/python enrich_giant.py
+./venv/bin/python build_observed_products.py
+./venv/bin/python build_review_queue.py
+./venv/bin/python build_canonical_layer.py
+```
+
+## Inputs
+
+- Firefox cookies for `giantfood.com`
+- `GIANT_USER_ID` and `GIANT_LOYALTY_NUMBER` in `.env`, shell env, or prompts
+- Giant raw order payloads in `giant_output/raw/`
+
+## Outputs
+
+Current generated files live under `giant_output/`:
+
+- `orders.csv`: flattened visit/order rows from the Giant history API
+- `items.csv`: flattened raw line items from fetched order detail payloads
+- `items_enriched.csv`: deterministic parsed/enriched line items
+- `products_observed.csv`: retailer-facing observed product groups
+- `review_queue.csv`: products needing manual review
+- `products_canonical.csv`: shared canonical product rows
+- `product_links.csv`: observed-to-canonical links
+
+Raw json remains the source of truth:
+
+- `giant_output/raw/history.json`
+- `giant_output/raw/.json`
+
+## Scripts
+
+- `scraper.py`: fetches Giant history/detail payloads and updates `orders.csv` and `items.csv`
+- `enrich_giant.py`: reads raw Giant order json and writes `items_enriched.csv`
+- `build_observed_products.py`: groups enriched rows into `products_observed.csv`
+- `build_review_queue.py`: generates `review_queue.csv` and preserves review status on reruns
+- `build_canonical_layer.py`: builds `products_canonical.csv` and `product_links.csv`
+
+## Notes on the current model
+
+- Observed products are retailer-specific: Giant, Costco.
+- Canonical products are the first cross-retailer layer.
+- Auto-linking is conservative:
+  exact UPC first, then exact normalized name plus exact size/unit context, then
+  exact normalized name when there is no size context to conflict.
+- Fee rows are excluded from auto-linking.
+- Unknown values are left blank instead of guessed.
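
The auto-linking precedence in the notes above can be sketched roughly as follows. This is a minimal illustration, not the logic in `build_canonical_layer.py`: the `Product` class, its field names, and the `auto_link` function are hypothetical, and the third rule is read here as "neither row carries a size", which is only one possible interpretation of "no size context to conflict".

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    """Hypothetical minimal product row; field names are illustrative only."""
    name: str             # already-normalized product name
    upc: str = ""         # blank when unknown
    size: str = ""        # e.g. "3 lb"; blank when there is no size context
    is_fee: bool = False  # fee rows never auto-link


def auto_link(observed: Product, canonical: list[Product]) -> Optional[tuple[Product, str]]:
    """Return (canonical row, rule name) for the first rule that matches, else None."""
    if observed.is_fee:
        return None
    # Rule 1: exact UPC match wins outright.
    if observed.upc:
        for c in canonical:
            if c.upc == observed.upc:
                return c, "exact_upc"
    if observed.size:
        # Rule 2: exact normalized name plus exact size/unit context.
        for c in canonical:
            if c.name == observed.name and c.size == observed.size:
                return c, "exact_name_size"
    else:
        # Rule 3 (assumed reading): name-only match when neither side has a size.
        for c in canonical:
            if c.name == observed.name and not c.size:
                return c, "exact_name"
    return None  # leave unlinked rather than guess
```

Returning the rule name alongside the match keeps each link auditable, which fits the conservative stance described above: anything that falls through stays unlinked for manual review.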
+
+## Verification
+
+Run the test suite with:
+
+```bash
+./venv/bin/python -m unittest discover -s tests
+```
+
+Useful one-off rebuilds:
+
+```bash
+./venv/bin/python enrich_giant.py
+./venv/bin/python build_observed_products.py
+./venv/bin/python build_review_queue.py
+./venv/bin/python build_canonical_layer.py
+```
+
+## Project docs
+
+- `pm/tasks.org`: task log and evidence
+- `pm/data-model.org`: file layout and schema decisions
+
+## Status
+
+Completed through `t1.7`:
+
+- Giant receipt fetch CLI
+- data model and file layout
+- Giant parser/enricher
+- observed products
+- review queue
+- canonical layer scaffold
+- conservative auto-link rules
+
+Next planned task is `t1.8`: add a Costco raw ingest path.
diff --git a/pm/scrape-giant.org b/pm/scrape-giant.org
index 2c3cf99..91bad14 100644
--- a/pm/scrape-giant.org
+++ b/pm/scrape-giant.org
@@ -26,8 +26,8 @@ carry forward image url
 3. build observed-product atble from enriched items
-
-* item:
+* giant requests
+** item:
 get:
 /api/v6.0/user/369513017/order/history/detail/69a2e44a16be1142e74ad3cc
@@ -66,7 +66,7 @@ x-datadome: protected
 request-context: appId=cid-v1:75750625-0c81-4f08-9f5d-ce4f73198e54
 X-Firefox-Spdy: h2
-* history:
+** history:
 GET
 https://giantfood.com/api/v6.0/user/369513017/order/history?filter=instore&loyaltyNumber=440155630880
@@ -105,3 +105,75 @@ accept-ch: Sec-CH-UA,Sec-CH-UA-Mobile,Sec-CH-UA-Platform,Sec-CH-UA-Arch,Sec-CH-U
 x-datadome: protected
 request-context: appId=cid-v1:75750625-0c81-4f08-9f5d-ce4f73198e54
 X-Firefox-Spdy: h2
+
+* costco requests
+** warehouse
+*** POST
+https://ecom-api.costco.com/ebusiness/order/v1/orders/graphql
+
+*** Headers
+
+POST /ebusiness/order/v1/orders/graphql HTTP/1.1
+Host: ecom-api.costco.com
+User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:148.0) Gecko/20100101 Firefox/148.0
+Accept: */*
+Accept-Language: en-US,en;q=0.9
+Accept-Encoding: gzip, deflate, br, zstd
+costco.service: restOrders
+costco.env: ecom
+costco-x-authorization:
Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlhrZTFoNXg5TV9ZMk5ER0YxU1hDX2xNNnVSTU5tZTJ3STBLRDlHNzl1QmciLCJ0eXAiOiJKV1QifQ.eyJleHAiOjE3NzM2NjU2NjgsIm5iZiI6MTc3MzY2NDc2OCwidmVyIjoiMS4wIiwiaXNzIjoiaHR0cHM6Ly9zaWduaW4uY29zdGNvLmNvbS9lMDcxNGRkNC03ODRkLTQ2ZDYtYTI3OC0zZTI5NTUzNDgzZWIvdjIuMC8iLCJzdWIiOiIzMTIzZWQ2Yy1jNzM4LTRiOTktOTAwZC0xNDE1ZTUzNjA2Y2UiLCJhdWQiOiJhM2E1MTg2Yi03Yzg5LTRiNGMtOTNhOC1kZDYwNGU5MzA3NTciLCJhY3IiOiJCMkNfMUFfU1NPX1dDU19zaWdudXBfc2lnbmluXzIwMSIsIm5vbmNlIjoiNDA4NjU3YmItODg5MC00MTk0LTg2OTctZDYzOGU2MzdhMGRhIiwiaWF0IjoxNzczNjY0NzY4LCJhdXRoX3RpbWUiOjE3NzM2NjQ3NjgsImF1dGhlbnRpY2F0aW9uU291cmNlIjoibG9jYWxBY2NvdW50QXV0aGVudGljYXRpb24iLCJlbWFpbCI6ImpvaG5tb3Nlc2NhcnRlckBnbWFpbC5jb20iLCJuYW1lIjoiRW1wdHkgRGlzcGxheW5hbWUiLCJ1c2VySWRlbnRpdGllcyI6W3siaXNzdWVyIjoiYTNhNTE4NmItN2M4OS00YjRjLTkzYTgtZGQ2MDRlOTMwNzU3IiwiaXNzdWVyVXNlcklkIjoiQUFEOjMxMjNlZDZjLWM3MzgtNGI5OS05MDBkLTE0MTVlNTM2MDZjZSJ9LHsiaXNzdWVyIjoiNDkwMGViMWYtMGMxMC00YmQ5LTk5YzMtYzU5ZTZjMWVjZWJmIiwiaXNzdWVyVXNlcklkIjoiYTZmZmRkOTktNDM2OC00NTgwLTgxOWYtZTZjZjYxM2U1M2M1In0seyJpc3N1ZXIiOiIyZGQ0YjE0NS0zYmRhLTQ2NjktYWU2YS0zN2I4Y2I2ZGFmN2YiLCJpc3N1ZXJVc2VySWQiOiJhNmZmZGQ5OS00MzY4LTQ1ODAtODE5Zi1lNmNmNjEzZTUzYzUifV0sImlzc3VlclVzZXJJZCI6IkFBRDozMTIzZWQ2Yy1jNzM4LTRiOTktOTAwZC0xNDE1ZTUzNjA2Y2UiLCJjbGllbnRJZCI6ImEzYTUxODZiLTdjODktNGI0Yy05M2E4LWRkNjA0ZTkzMDc1NyIsInJlbWVtYmVyTWUiOiJGYWxzZSIsInNlbmRNZUVtYWlsIjoib2ZmIiwiaXBBZGRyZXNzIjoiOTYuMjQxLjIxMi4xMjUiLCJDb3JyZWxhdGlvbklkIjoiYWUyYTMxYjktMjBkNC00MTBkLWE1ZjAtNDJhMWIzM2VmZmQ1In0.gmhhNsgFUbd0QAR1Z_isFjglQxZrM0Kj8yv5-w-FrsWM3d9PB6kWsldBndy6cEhwZh588T1u4vgG9A-XR3HZ4t-JnPZhpr8_7-lI4W4Tp4IAA0tIgMt7cHZUN14qstx_K72QLOrKbO34PQJKBymw2qKvwvhUo372MNFtc2D8_wS_VbG8QdOPumgsBJPqyF7HExt-gpkAu_5kL-54pqLSIZIJZ_viymti9ajla_B8PlvHMO7ZDWSgoV177ArcQAeOhv9MT1e5k0a4V7R-cCI77NIhoBUjV8C4lMAd27nntWzJJ9N00hEEGQb3zPoWUgRFAOdGzjg4xZu1D87C3MJtdA +Content-Type: application/json-patch+json +costco-x-wcs-clientId: 4900eb1f-0c10-4bd9-99c3-c59e6c1ecebf +client-identifier: 481b1aec-aa3b-454b-b81b-48187e28f205 
+Content-Length: 808 +Origin: https://www.costco.com +DNT: 1 +Sec-GPC: 1 +Connection: keep-alive +Referer: https://www.costco.com/ +Sec-Fetch-Dest: empty +Sec-Fetch-Mode: cors +Sec-Fetch-Site: same-site + +*** Request +Request +{"query":"query receiptsWithCounts($startDate: String!, $endDate: String!,$documentType:String!,$documentSubType:String!) {\n receiptsWithCounts(startDate: $startDate, endDate: $endDate,documentType:$documentType,documentSubType:$documentSubType) {\n inWarehouse\n gasStation\n carWash\n gasAndCarWash\n receipts{\n warehouseName receiptType documentType transactionDateTime transactionBarcode warehouseName transactionType total \n totalItemCount\n itemArray { \n itemNumber\n }\n tenderArray { \n tenderTypeCode\n tenderDescription\n amountTender\n }\n couponArray { \n upcnumberCoupon\n } \n }\n}\n }","variables":{"startDate":"1/01/2026","endDate":"3/31/2026","text":"Last 3 Months","documentType":"all","documentSubType":"all"}} + +*** Response +{"data":{"receiptsWithCounts":{"inWarehouse":2,"gasStation":0,"carWash":0,"gasAndCarWash":0,"receipts":[{"warehouseName":"MT 
VERNON","receiptType":"In-Warehouse","documentType":"WarehouseReceiptDetail","transactionDateTime":"2026-03-12T16:16:00","transactionBarcode":"21111500804012603121616","transactionType":"Sales","total":208.58,"totalItemCount":24,"itemArray":[{"itemNumber":"34779"},{"itemNumber":"7950"},{"itemNumber":"2005"},{"itemNumber":"1941976"},{"itemNumber":"4873222"},{"itemNumber":"374664"},{"itemNumber":"60357"},{"itemNumber":"30669"},{"itemNumber":"1025795"},{"itemNumber":"787876"},{"itemNumber":"22093"},{"itemNumber":"1956177"},{"itemNumber":"1136340"},{"itemNumber":"7609681"},{"itemNumber":"18001"},{"itemNumber":"27003"},{"itemNumber":"1886266"},{"itemNumber":"4102"},{"itemNumber":"87745"},{"itemNumber":"110784"},{"itemNumber":"47492"},{"itemNumber":"2287780"},{"itemNumber":"917546"},{"itemNumber":"1768123"},{"itemNumber":"374558"}],"tenderArray":[{"tenderTypeCode":"061","tenderDescription":"VISA","amountTender":208.58}],"couponArray":[{"upcnumberCoupon":"2100003746641"},{"upcnumberCoupon":"2100003745583"}]},{"warehouseName":"MT VERNON","receiptType":"In-Warehouse","documentType":"WarehouseReceiptDetail","transactionDateTime":"2026-02-14T16:25:00","transactionBarcode":"21111500503322602141625","transactionType":"Sales","total":188.12,"totalItemCount":23,"itemArray":[{"itemNumber":"7812"},{"itemNumber":"7950"},{"itemNumber":"3923"},{"itemNumber":"19813"},{"itemNumber":"87745"},{"itemNumber":"1116038"},{"itemNumber":"5938"},{"itemNumber":"1136340"},{"itemNumber":"30669"},{"itemNumber":"384962"},{"itemNumber":"1331732"},{"itemNumber":"787876"},{"itemNumber":"61576"},{"itemNumber":"110784"},{"itemNumber":"180973"},{"itemNumber":"3"},{"itemNumber":"744361"},{"itemNumber":"1886266"},{"itemNumber":"1025795"},{"itemNumber":"11545"},{"itemNumber":"47492"},{"itemNumber":"260509"}],"tenderArray":[{"tenderTypeCode":"061","tenderDescription":"VISA","amountTender":188.12}],"couponArray":[]}]}}} +** item +*** POST + https://ecom-api.costco.com/ebusiness/order/v1/orders/graphql + +*** 
headers + +POST /ebusiness/order/v1/orders/graphql HTTP/2 +Host: ecom-api.costco.com +User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:148.0) Gecko/20100101 Firefox/148.0 +Accept: */* +Accept-Language: en-US,en;q=0.9 +Accept-Encoding: gzip, deflate, br, zstd +costco.service: restOrders +costco.env: ecom +costco-x-authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlhrZTFoNXg5TV9ZMk5ER0YxU1hDX2xNNnVSTU5tZTJ3STBLRDlHNzl1QmciLCJ0eXAiOiJKV1QifQ.eyJleHAiOjE3NzM2NjUzODUsIm5iZiI6MTc3MzY2NDQ4NSwidmVyIjoiMS4wIiwiaXNzIjoiaHR0cHM6Ly9zaWduaW4uY29zdGNvLmNvbS9lMDcxNGRkNC03ODRkLTQ2ZDYtYTI3OC0zZTI5NTUzNDgzZWIvdjIuMC8iLCJzdWIiOiIzMTIzZWQ2Yy1jNzM4LTRiOTktOTAwZC0xNDE1ZTUzNjA2Y2UiLCJhdWQiOiJhM2E1MTg2Yi03Yzg5LTRiNGMtOTNhOC1kZDYwNGU5MzA3NTciLCJhY3IiOiJCMkNfMUFfU1NPX1dDU19zaWdudXBfc2lnbmluXzIwMSIsIm5vbmNlIjoiNzg5MjIzOGUtOWU3NC00MzExLWI2NDItMzQ1NTY4ZDY3NTk4IiwiaWF0IjoxNzczNjY0NDg1LCJhdXRoX3RpbWUiOjE3NzM2NjQ0ODQsImF1dGhlbnRpY2F0aW9uU291cmNlIjoibG9jYWxBY2NvdW50QXV0aGVudGljYXRpb24iLCJlbWFpbCI6ImpvaG5tb3Nlc2NhcnRlckBnbWFpbC5jb20iLCJuYW1lIjoiRW1wdHkgRGlzcGxheW5hbWUiLCJ1c2VySWRlbnRpdGllcyI6W3siaXNzdWVyIjoiYTNhNTE4NmItN2M4OS00YjRjLTkzYTgtZGQ2MDRlOTMwNzU3IiwiaXNzdWVyVXNlcklkIjoiQUFEOjMxMjNlZDZjLWM3MzgtNGI5OS05MDBkLTE0MTVlNTM2MDZjZSJ9LHsiaXNzdWVyIjoiNDkwMGViMWYtMGMxMC00YmQ5LTk5YzMtYzU5ZTZjMWVjZWJmIiwiaXNzdWVyVXNlcklkIjoiYTZmZmRkOTktNDM2OC00NTgwLTgxOWYtZTZjZjYxM2U1M2M1In0seyJpc3N1ZXIiOiIyZGQ0YjE0NS0zYmRhLTQ2NjktYWU2YS0zN2I4Y2I2ZGFmN2YiLCJpc3N1ZXJVc2VySWQiOiJhNmZmZGQ5OS00MzY4LTQ1ODAtODE5Zi1lNmNmNjEzZTUzYzUifV0sImlzc3VlclVzZXJJZCI6IkFBRDozMTIzZWQ2Yy1jNzM4LTRiOTktOTAwZC0xNDE1ZTUzNjA2Y2UiLCJjbGllbnRJZCI6ImEzYTUxODZiLTdjODktNGI0Yy05M2E4LWRkNjA0ZTkzMDc1NyIsInJlbWVtYmVyTWUiOiJGYWxzZSIsInNlbmRNZUVtYWlsIjoib2ZmIiwiaXBBZGRyZXNzIjoiOTYuMjQxLjIxMi4xMjUiLCJDb3JyZWxhdGlvbklkIjoiMDk0YTE5NDYtZTMwNS00ZDkzLWEyMzQtM2ZiNGMwMjMyNDhhIn0.FdsVFHsewvpQABvkEz4uA0NUlYwvlBEg-frJbUDIJRTsP59Be0bOt8Zqv6cZhUqBn_lTQEyi9tnvpkpycmNy7Rg5zLfYroH6mNALRqkBm8VbcmrEVDM1HmdNTHgO9vQD4TdKm1ZYkA7Pj_6QY3sDxI4ioOzIz1_XOnoJVAXjEwGfr8hgvqtl
aC51M5DsfIGQj3zCaJrQnD6GBJlFmLNUpCulpT16WAaB1lT_pcycfBs-e1xnEd33dX0kHBOZ8pFS-IKjV_44ZK9R8jI9WHx5ThX3-DtyqjkJ0JypmhT9uEa0MeT55U7aeKPbMvQ0exiw3culKgiWDhvdp8e2EkExsg +Content-Type: application/json-patch+json +costco-x-wcs-clientId: 4900eb1f-0c10-4bd9-99c3-c59e6c1ecebf +client-identifier: 481b1aec-aa3b-454b-b81b-48187e28f205 +Content-Length: 2916 +Origin: https://www.costco.com +DNT: 1 +Sec-GPC: 1 +Connection: keep-alive +Referer: https://www.costco.com/ +Sec-Fetch-Dest: empty +Sec-Fetch-Mode: cors +Sec-Fetch-Site: same-site +Priority: u=0 +TE: trailers + +*** request +{"query":"query receiptsWithCounts($barcode: String!,$documentType:String!) {\n receiptsWithCounts(barcode: $barcode,documentType:$documentType) {\nreceipts{\n warehouseName\n receiptType \n documentType \n transactionDateTime \n transactionDate \n companyNumber \n warehouseNumber \n operatorNumber \n warehouseName \n warehouseShortName \n registerNumber \n transactionNumber \n transactionType\n transactionBarcode \n total \n warehouseAddress1 \n warehouseAddress2 \n warehouseCity \n warehouseState \n warehouseCountry \n warehousePostalCode\n totalItemCount \n subTotal \n taxes\n total \n invoiceNumber\n sequenceNumber\n itemArray { \n itemNumber \n itemDescription01 \n frenchItemDescription1 \n itemDescription02 \n frenchItemDescription2 \n itemIdentifier \n itemDepartmentNumber\n unit \n amount \n taxFlag \n merchantID \n entryMethod\n transDepartmentNumber\n fuelUnitQuantity\n fuelGradeCode\n fuelUnitQuantity\n itemUnitPriceAmount\n fuelUomCode\n fuelUomDescription\n fuelUomDescriptionFr\n fuelGradeDescription\n fuelGradeDescriptionFr\n\n } \n tenderArray { \n tenderTypeCode\n tenderSubTypeCode\n tenderDescription \n amountTender \n displayAccountNumber \n sequenceNumber \n approvalNumber \n responseCode \n tenderTypeName \n transactionID \n merchantID \n entryMethod\n tenderAcctTxnNumber \n tenderAuthorizationCode \n tenderTypeName\n tenderTypeNameFr\n tenderEntryMethodDescription\n walletType\n 
walletId\n storedValueBucket\n } \n subTaxes { \n tax1 \n tax2 \n tax3 \n tax4 \n aTaxPercent \n aTaxLegend \n aTaxAmount\n aTaxPrintCode\n aTaxPrintCodeFR \n aTaxIdentifierCode \n bTaxPercent \n bTaxLegend \n bTaxAmount\n bTaxPrintCode\n bTaxPrintCodeFR \n bTaxIdentifierCode \n cTaxPercent \n cTaxLegend \n cTaxAmount\n cTaxIdentifierCode \n dTaxPercent \n dTaxLegend \n dTaxAmount\n dTaxPrintCode\n dTaxPrintCodeFR \n dTaxIdentifierCode\n uTaxLegend\n uTaxAmount\n uTaxableAmount\n } \n instantSavings \n membershipNumber \n }\n }\n }","variables":{"barcode":"21111500804012603121616","documentType":"warehouse"}} + +*** response +{"data":{"receiptsWithCounts":{"receipts":[{"warehouseName":"MT VERNON","receiptType":"In-Warehouse","documentType":"WarehouseReceiptDetail","transactionDateTime":"2026-03-12T16:16:00","transactionDate":"2026-03-12","companyNumber":1,"warehouseNumber":1115,"operatorNumber":43,"warehouseShortName":"MT VERNON","registerNumber":8,"transactionNumber":401,"transactionType":"Sales","transactionBarcode":"21111500804012603121616","total":208.58,"warehouseAddress1":"7940 RICHMOND HWY","warehouseAddress2":null,"warehouseCity":"ALEXANDRIA","warehouseState":"VA","warehouseCountry":"US","warehousePostalCode":"22306","totalItemCount":24,"subTotal":202.01,"taxes":6.57,"invoiceNumber":null,"sequenceNumber":null,"itemArray":[{"itemNumber":"34779","itemDescription01":"ROMANO","frenchItemDescription1":null,"itemDescription02":"CS=15 SL120 T9H6","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":19,"unit":1,"amount":20.93,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":19,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":11.69,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"7950","itemDescription01":"4LB 
COSMIC","frenchItemDescription1":null,"itemDescription02":null,"frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":65,"unit":1,"amount":5.99,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":65,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":5.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"2005","itemDescription01":"25# FLOUR","frenchItemDescription1":null,"itemDescription02":"ALL-PURPOSE HARV P98/100","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":9.49,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":9.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"1941976","itemDescription01":"BREAD FLOUR","frenchItemDescription1":null,"itemDescription02":"12 LBS 180P 20X9","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":9.99,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":9.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"4873222","itemDescription01":"ALL F&C","frenchItemDescription1":null,"itemDescription02":"200OZ 160LOADS 
P104","frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":14,"unit":1,"amount":19.99,"taxFlag":"Y","merchantID":null,"entryMethod":null,"transDepartmentNumber":14,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":19.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"374664","itemDescription01":"/ 4873222","frenchItemDescription1":"/4873222","itemDescription02":null,"frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":14,"unit":-1,"amount":-5,"taxFlag":null,"merchantID":null,"entryMethod":null,"transDepartmentNumber":14,"fuelUnitQuantity":null,"fuelGradeCode":null,"itemUnitPriceAmount":0,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"60357","itemDescription01":"MIXED PEPPER","frenchItemDescription1":null,"itemDescription02":"6-PACK","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":65,"unit":1,"amount":7.49,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":65,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":7.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"30669","itemDescription01":"BANANAS","frenchItemDescription1":null,"itemDescription02":"3 LB / 1.36 KG","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":65,"unit":2,"amount":2.98,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":65,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":1.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"1025795","itemDescription01":"KS 5DZ EGGS","frenchItemDescription1":null,"itemDescription02":"SL21 
P120 / P132 / P144","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":17,"unit":1,"amount":9.39,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":17,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":9.39,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"787876","itemDescription01":"KS TWNY PORT","frenchItemDescription1":null,"itemDescription02":"PORTUGAL CSPC# 773506","frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":16,"unit":1,"amount":17.99,"taxFlag":"Y","merchantID":null,"entryMethod":null,"transDepartmentNumber":16,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":17.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"22093","itemDescription01":"KS SHRP CHDR","frenchItemDescription1":null,"itemDescription02":"EC20T9H5 W12T13H5 SL130","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":17,"unit":1,"amount":5.49,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":17,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":5.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"1956177","itemDescription01":"BRWNBTTRGRV","frenchItemDescription1":null,"itemDescription02":"MCCORMICK C12T19H7 L228","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":2.97,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":2.97,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"1136340","itemDescription01":"3LB 
ORG GALA","frenchItemDescription1":null,"itemDescription02":null,"frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":65,"unit":1,"amount":4.49,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":65,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":4.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"7609681","itemDescription01":"CASCADE GEL","frenchItemDescription1":null,"itemDescription02":"125OZ T60H3P180","frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":14,"unit":1,"amount":12.49,"taxFlag":"Y","merchantID":null,"entryMethod":null,"transDepartmentNumber":14,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":12.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"18001","itemDescription01":"TBLE SALT 4#","frenchItemDescription1":null,"itemDescription02":"DIAMOND CRYSTAL P=600","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":1.49,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":1.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"27003","itemDescription01":"STRAWBERRIES","frenchItemDescription1":null,"itemDescription02":"908 G / 2 
LB","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":65,"unit":1,"amount":5.29,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":65,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":5.29,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"1886266","itemDescription01":"SKO 5%","frenchItemDescription1":null,"itemDescription02":"48 OZ T10H8 SL30","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":17,"unit":1,"amount":5.79,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":17,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":5.79,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"4102","itemDescription01":"8\" TORTILLAS","frenchItemDescription1":null,"itemDescription02":"SL10 70OZ","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":5.99,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":5.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"87745","itemDescription01":"ROTISSERIE","frenchItemDescription1":null,"itemDescription02":"USDA GRADE A","frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":63,"unit":1,"amount":4.99,"taxFlag":"D","merchantID":null,"entryMethod":null,"transDepartmentNumber":63,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":4.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"110784","itemDescription01":"15 GRAIN 
BRD","frenchItemDescription1":null,"itemDescription02":"PEPPERIDGE FARM 2/24 OZ","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":5.69,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":5.69,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"47492","itemDescription01":"CELERY SALAD","frenchItemDescription1":null,"itemDescription02":"APPLE CIDER VINAIGRETTE","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":63,"unit":1,"amount":12.62,"taxFlag":"D","merchantID":null,"entryMethod":null,"transDepartmentNumber":63,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":4.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"2287780","itemDescription01":"BTB CHICKEN","frenchItemDescription1":null,"itemDescription02":"C12T10H9 P1080 SL630","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":9.49,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":9.49,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"917546","itemDescription01":"JIF CREAMY","frenchItemDescription1":null,"itemDescription02":"PEANUT BUTTER SL540 
P300","frenchItemDescription2":null,"itemIdentifier":"E","itemDepartmentNumber":13,"unit":1,"amount":11.99,"taxFlag":"3","merchantID":null,"entryMethod":null,"transDepartmentNumber":13,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":11.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"1768123","itemDescription01":"BBEE KIDS4PC","frenchItemDescription1":null,"itemDescription02":"FY26 P1600 T200 H8","frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":39,"unit":1,"amount":17.99,"taxFlag":"Y","merchantID":null,"entryMethod":null,"transDepartmentNumber":39,"fuelUnitQuantity":10.0,"fuelGradeCode":null,"itemUnitPriceAmount":17.99,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null},{"itemNumber":"374558","itemDescription01":"/ 1768123","frenchItemDescription1":"/1768123","itemDescription02":null,"frenchItemDescription2":null,"itemIdentifier":null,"itemDepartmentNumber":39,"unit":-1,"amount":-4,"taxFlag":null,"merchantID":null,"entryMethod":null,"transDepartmentNumber":39,"fuelUnitQuantity":null,"fuelGradeCode":null,"itemUnitPriceAmount":0,"fuelUomCode":null,"fuelUomDescription":null,"fuelUomDescriptionFr":null,"fuelGradeDescription":null,"fuelGradeDescriptionFr":null}],"tenderArray":[{"tenderTypeCode":"061","tenderSubTypeCode":null,"tenderDescription":"VISA","amountTender":208.58,"displayAccountNumber":"9070","sequenceNumber":null,"approvalNumber":null,"responseCode":null,"tenderTypeName":"VISA","transactionID":null,"merchantID":null,"entryMethod":null,"tenderAcctTxnNumber":null,"tenderAuthorizationCode":null,"tenderTypeNameFr":null,"tenderEntryMethodDescription":null,"walletType":null,"walletId":null,"storedValueBucket":null}],"subTaxes":{"tax1":null,"tax2":null,"tax3":null,"tax4":null,"aTaxPercent":null,"aTaxLegend":"A","aTaxAmount":4.62,"aTaxPrintCode":
null,"aTaxPrintCodeFR":null,"aTaxIdentifierCode":null,"bTaxPercent":null,"bTaxLegend":null,"bTaxAmount":null,"bTaxPrintCode":null,"bTaxPrintCodeFR":null,"bTaxIdentifierCode":null,"cTaxPercent":null,"cTaxLegend":"C","cTaxAmount":1.25,"cTaxIdentifierCode":null,"dTaxPercent":null,"dTaxLegend":"D","dTaxAmount":0.7,"dTaxPrintCode":null,"dTaxPrintCodeFR":null,"dTaxIdentifierCode":null,"uTaxLegend":null,"uTaxAmount":null,"uTaxableAmount":null},"instantSavings":9,"membershipNumber":"111894291684"}]}}}
+
diff --git a/pm/tasks.org b/pm/tasks.org
index b8c9af0..7e90d32 100644
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -147,35 +147,96 @@
 ** acceptance criteria
 - add a costco-specific raw ingest/export path
-- output costco line items into the same shared raw/enriched schema family
-- confirm at least one product class can exist as:
-  - giant observed product
-  - costco observed product
-  - one shared canonical product
+- fetch costco receipt summary and receipt detail payloads from graphql endpoint
+- persist raw json under `costco_output/raw/orders.csv` and `./items.csv`, same format as giant
+- costco-native identifiers such as `transactionBarcode` as order id and `itemNumber` as retailer item id
+- preserve discount/coupon rows rather than dropping
 ** notes
-- this is the proof that the architecture generalizes
-- don’t chase perfection before the second retailer lands
+- focus on raw costco acquisition and flattening
+- do not force costco identifiers into `upc`
+- bearer/auth values should come from local env, not source
 ** evidence
 - commit:
 - tests:
 - date:
-* [ ] t1.9: compute normalized comparison metrics (2-3 commits)
+* [ ] t1.8.1: support costco parser/enricher path (2-4 commits)
 ** acceptance criteria
-- derive normalized comparison fields where possible:
-  - price per lb
-  - price per oz
-  - price per each
-  - price per count
-- metrics are attached at canonical or linked-observed level as appropriate
-- emit obvious nulls when basis is unknown rather than inventing values
+- add a costco-specific enrich step producing `costco_output/items_enriched.csv`
+- output rows into the same shared enriched schema family as Giant
+- support costco-specific parsing for:
+  - `itemDescription01` + `itemDescription02`
+  - `itemNumber` as `retailer_item_id`
+  - discount lines / negative rows
+  - common size patterns such as `25#`, `48 OZ`, `2/24 OZ`, `6-PACK`
+- preserve obvious unknowns as blank rather than guessed values
 ** notes
-- this is where “gala apples 5 lb bag vs other gala apples” becomes possible
-- units discipline matters a lot here
+- this is the real schema compatibility proof, not raw ingest alone
+- expect weaker identifiers than Giant
+
+** evidence
+- commit:
+- tests:
+- date:
+* [ ] t1.8.2: validate cross-retailer observed/canonical flow (1-3 commits)
+
+** acceptance criteria
+- feed Giant and Costco enriched rows through the same observed/canonical pipeline
+- confirm at least one product class can exist as:
+  - Giant observed product
+  - Costco observed product
+  - one shared canonical product
+- document the exact example used for proof
+
+** notes
+- keep this to one or two well-behaved product classes first
+- apples, eggs, bananas, or flour are better than weird prepared foods
+
+** evidence
+- commit:
+- tests:
+- date:
+* [ ] t1.8.3: extend shared schema for retailer-native ids and adjustment lines (1-2 commits)
+
+** acceptance criteria
+- add shared fields needed for non-upc retailers, including:
+  - `retailer_item_id`
+  - `is_discount_line`
+  - `is_coupon_line` or equivalent if needed
+- keep `upc` nullable across the pipeline
+- update downstream builders/tests to accept retailers with blank `upc`
+
+** notes
+- this prevents costco from becoming a schema hack
+- do this once instead of sprinkling exceptions everywhere
+
+** evidence
+- commit:
+- tests:
+- date:
+* [ ] t1.9: compute normalized comparison metrics (2-4 commits)
+
+** acceptance criteria
+- derive normalized comparison fields where possible on enriched or observed product rows:
+  - `price_per_lb`
+  - `price_per_oz`
+  - `price_per_each`
+  - `price_per_count`
+- preserve the source basis used to derive each metric, e.g.:
+  - parsed size/unit
+  - receipt weight
+  - explicit count/pack
+- emit nulls when basis is unknown, conflicting, or ambiguous
+- document at least one Giant vs Costco comparison example using the normalized metrics
+
+** notes
+- compute metrics as close to the raw observation as possible
+- canonical layer can aggregate later, but should not invent missing unit economics
+- unit discipline matters more than coverage
 ** evidence
 - commit: