Add terminal review resolution workflow
pm/review-workflow.org
Normal file
@@ -0,0 +1,73 @@
* review and item-resolution workflow

This document defines the durable review workflow for unresolved observed
products.

** persistent files

- `combined_output/purchases.csv`
  Flat normalized purchase log. This is the review input because it retains:
  - raw item name
  - normalized item name
  - observed product id
  - canonical product id when resolved
  - retailer/order/date/price context
- `combined_output/review_queue.csv`
  Current unresolved observed products grouped for review.
- `combined_output/review_resolutions.csv`
  Durable mapping decisions from observed products to canonical products.
- `combined_output/canonical_catalog.csv`
  Durable canonical item catalog used by manual review and later purchase-log
  rebuilds.

There is no separate alias file in v1. `review_resolutions.csv` is the mapping
layer from observed products to canonical product ids.
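
As a rough illustration of how the review queue relates to the purchase log, the sketch below groups unresolved purchase rows by observed product. The column names `observed_product_id`, `canonical_product_id`, and `raw_item_name` follow the fields named in this document; the helper name `build_review_queue` and the output columns `example_raw_name` and `row_count` are assumptions, not the actual schema of `review_queue.csv`.

```python
def build_review_queue(purchase_rows):
    """Group purchase rows that lack a canonical id by observed product.

    Hypothetical helper: the real review_queue.csv may carry other columns.
    """
    queue = {}
    for row in purchase_rows:
        if row.get("canonical_product_id"):
            continue  # already resolved, nothing to review
        observed_id = row["observed_product_id"]
        entry = queue.setdefault(
            observed_id,
            {
                "observed_product_id": observed_id,
                # Keep the first raw name seen as a hint for the reviewer.
                "example_raw_name": row.get("raw_item_name", ""),
                "row_count": 0,
            },
        )
        entry["row_count"] += 1
    # dicts preserve insertion order, so first-seen products come first.
    return list(queue.values())
```

Grouping by id rather than by item text means two receipts with differently spelled names still collapse into one queue entry, provided the parser assigned them the same observed product id.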

** workflow

1. Run `build_purchases.py`
   This refreshes the purchase log and seeds/updates the canonical catalog from
   current auto-linked canonical rows.
2. Run `review_products.py`
   This rebuilds `review_queue.csv` from unresolved purchase rows and prompts in
   the terminal for one observed product at a time.
3. Choose one of:
   - link to existing canonical
   - create new canonical
   - exclude
   - skip
4. `review_products.py` writes decisions immediately to:
   - `review_resolutions.csv`
   - `canonical_catalog.csv` when a new canonical item is created
5. Rerun `build_purchases.py`
   This reapplies approved resolutions so the final normalized purchase log now
   carries the reviewed `canonical_product_id`.
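
The reapply step in item 5 could look roughly like the following. This is a minimal sketch, not the code in `build_purchases.py`: the function names are hypothetical, it assumes `review_resolutions.csv` has `observed_product_id` and `canonical_product_id` columns, and it assumes a reviewed mapping takes precedence over whatever id the row already carries. It does not model how exclude decisions are stored.

```python
import csv
import io


def load_resolutions(csv_file):
    """Read observed -> canonical mappings from an open CSV file object."""
    mapping = {}
    for row in csv.DictReader(csv_file):
        # Rows without a canonical id carry no mapping and are ignored here.
        if row.get("canonical_product_id"):
            mapping[row["observed_product_id"]] = row["canonical_product_id"]
    return mapping


def apply_resolutions(purchase_rows, resolutions):
    """Return copies of purchase rows with reviewed canonical ids filled in."""
    resolved = []
    for row in purchase_rows:
        updated = dict(row)  # copy so the caller's rows are not mutated
        canonical_id = resolutions.get(row["observed_product_id"])
        if canonical_id:
            updated["canonical_product_id"] = canonical_id
        resolved.append(updated)
    return resolved


# Usage with in-memory data standing in for the CSV files.
resolutions = load_resolutions(
    io.StringIO("observed_product_id,canonical_product_id\nobs_1,can_1\n")
)
purchases = [{"observed_product_id": "obs_1", "canonical_product_id": ""}]
rebuilt = apply_resolutions(purchases, resolutions)
```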

** what the human edits

The primary interface is terminal prompts in `review_products.py`.

The human provides:
- existing canonical id when linking
- canonical name/category/product type when creating a new canonical item
- optional resolution notes

The generated CSVs remain editable by hand if needed, but the intended workflow
is terminal-first.
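
To make the prompt flow concrete, here is one way a single decision might be collected. Everything in it is an assumption made for illustration: the key bindings, the prompt wording, the dictionary keys, and the function name `prompt_for_decision` are not taken from `review_products.py`. The `input_fn` parameter exists only so the prompt can be driven without a terminal.

```python
# Hypothetical key bindings for the four choices described above.
ACTIONS = {"l": "link", "n": "create", "x": "exclude", "s": "skip"}


def prompt_for_decision(observed_id, input_fn=input):
    """Ask for one decision about an observed product; re-prompt on bad input."""
    while True:
        key = input_fn(f"{observed_id} [l]ink/[n]ew/e[x]clude/[s]kip: ")
        action = ACTIONS.get(key.strip().lower())
        if action is None:
            continue  # unrecognized key, ask again
        decision = {"observed_product_id": observed_id, "action": action}
        if action == "link":
            decision["canonical_product_id"] = input_fn("canonical id: ").strip()
        elif action == "create":
            decision["canonical_name"] = input_fn("canonical name: ").strip()
            decision["category"] = input_fn("category: ").strip()
            decision["product_type"] = input_fn("product type: ").strip()
        if action != "skip":
            # Notes are optional; an empty string means none were given.
            decision["resolution_notes"] = input_fn("notes (optional): ").strip()
        return decision
```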

** durability

- Resolutions are keyed by `observed_product_id`, not by one-off text
  substitution.
- Canonical products are keyed by stable `canonical_product_id`.
- Future runs reuse approved mappings through `review_resolutions.csv`.
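
One consequence of keying by `observed_product_id` is that recording a decision can be an insert-or-replace on that key, so reviewing the same product twice never leaves two competing rows. The helper below is a hypothetical sketch of that idea, not the write logic in `review_products.py`.

```python
def upsert_resolution(resolutions, decision):
    """Insert `decision`, replacing any row with the same observed_product_id.

    `resolutions` is a list of row dicts; a new list is returned.
    """
    by_id = {row["observed_product_id"]: row for row in resolutions}
    # Assigning to an existing key keeps its original position in the dict.
    by_id[decision["observed_product_id"]] = decision
    return list(by_id.values())
```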

** retention of audit fields

The final `purchases.csv` retains:
- `raw_item_name`
- `normalized_item_name`
- `canonical_product_id`

This preserves the raw receipt description, the deterministic parser output, and
the human-approved canonical identity in one flat purchase log.
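
A rebuild could guard this guarantee with a small header check. The sketch below uses only the three column names listed above; the function name `missing_audit_fields` is hypothetical and no such check is known to exist in the current scripts.

```python
import csv
import io

AUDIT_FIELDS = ("raw_item_name", "normalized_item_name", "canonical_product_id")


def missing_audit_fields(csv_file):
    """Return the audit columns absent from the header of an open CSV file."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in AUDIT_FIELDS if name not in header]


# Usage with an in-memory header standing in for purchases.csv.
sample = io.StringIO("raw_item_name,normalized_item_name,price\n")
missing = missing_audit_fields(sample)
```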