From 6336c15da86d77868aa4dbfc87ebbc9f9a466106 Mon Sep 17 00:00:00 2001
From: ben
Date: Tue, 24 Mar 2026 17:10:09 -0400
Subject: [PATCH] Record t1.22 task evidence

---
 pm/tasks.org | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/pm/tasks.org b/pm/tasks.org
index deb7fa2..f616620 100644
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -1048,7 +1048,7 @@ ensure purchases retains enough visit/order context to support spend-by-visit an
 
 ** notes
 - The needed visit fields were already flowing through `build_purchases.py`; this task locked them in with explicit tests and documentation instead of adding a new visit layer.
-- `data/review/purchases.csv` is now documented as the primary analysis artifact for both item-level and visit-level work.
+- `data/analysis/purchases.csv` is now documented as the primary analysis artifact for both item-level and visit-level work.
 
 * [X] t1.21: add lightweight charting/analysis surface on top of purchases.csv (2-4 commits)
 build a minimal analysis layer for common price and visit charts without changing the csv pipeline
@@ -1070,9 +1070,42 @@ build a minimal analysis layer for common price and visit charts without changin
 - datetime: 2026-03-24 16:48:41 EDT
 
 ** notes
-- The new layer is file-based, not notebook- or dashboard-based: `analyze_purchases.py` reads `data/review/purchases.csv` and writes chart-ready CSVs under `data/review/analysis/`.
+- The new layer is file-based, not notebook- or dashboard-based: `analyze_purchases.py` reads `data/analysis/purchases.csv` and writes chart-ready CSVs under `data/analysis/`.
 - This keeps Excel/pivot workflows intact while still giving a repeatable CLI path for common price, visit, category, and retailer/store summaries.
 
+* [X] t1.22: cleanup and finalize post-refactor merging refactor/enrich into cx (3-6 commits)
+remove transitional detritus from the repo and make the final folder/script layout explicit before merging back into `cx`
+
+** acceptance criteria
+1. move `catalog.csv` alongside the other step-3 review artifacts under `data/review/`
+ - update active scripts, tests, docs, and task notes to match the chosen path
+2. promote analysis to a top-level step-4 folder such as `data/analysis/`
+ - add `purchases.csv` to this folder
+ - update active scripts, tests, docs, and task notes to match the chosen path
+3. remove obsolete or superseded Python files
+ - includes old `scrape_*`, `enrich_*`, `build_*`, and proof/check scripts as appropriate
+ - do not remove files still required by the active collect/normalize/review/analysis pipeline
+4. active repo entrypoints are reduced to the intended flow and are easy to identify, including:
+ - retailer collection
+ - retailer normalization
+ - review/combine
+ - status/reporting
+ - analysis
+5. tests pass after removals and path decisions
+6. README reflects the final post-refactor structure and run order without legacy ambiguity
+7. `pm/data-model.org` and `pm/tasks.org` reflect the final chosen layout
+- pm note: prefer deleting true detritus over keeping compatibility shims now that the refactor path is established
+- pm note: make folder decisions once here so we stop carrying path churn into later tasks
+
+** evidence
+- commit: `09829b2` `Finalize post-refactor layout and remove old pipeline files`
+- tests: `./venv/bin/python -m unittest discover -s tests`; `./venv/bin/python build_purchases.py`; `./venv/bin/python review_products.py --refresh-only`; `./venv/bin/python report_pipeline_status.py`; `./venv/bin/python analyze_purchases.py`; `./venv/bin/python collect_giant_web.py --help`; `./venv/bin/python collect_costco_web.py --help`; `./venv/bin/python normalize_giant_web.py --help`; `./venv/bin/python normalize_costco_web.py --help`
+- datetime: 2026-03-24 17:09:45 EDT
+
+** notes
+- Final layout decision: `catalog.csv` now lives under `data/review/`, while `purchases.csv` and the chart-ready analysis outputs live under the step-4 `data/analysis/` folder.
+- Removed obsolete top-level pipeline files and their dead tests so the active entrypoints are now the collect, normalize, review/combine, status, and analysis scripts only.
+
 * [ ] t1.10: add optional llm-assisted suggestion workflow for unresolved normalized retailer items (2-4 commits)
 
 ** acceptance criteria