openai batch refactor

2026-05-06 13:53:50 -04:00
parent 946aeac7c8
commit 64a7a18721
5 changed files with 833 additions and 312 deletions
--- a/docs/tasks.org
+++ b/docs/tasks.org
@@ -158,7 +158,7 @@ forum_id_input,comment_id,title,text,date,author,stance,stance_confidence,stance
 - tests: 23 passing (pytest tests/analysis_gpt4o_batch.py), 51 total across suite
 - datetime: [2026-05-06 Wed 08:55]

-* [ ] t1.2.3: batch job refactor
+* [X] t1.2.3: batch job refactor
 This task encompasses intent and fixes for 1.2.1 and 1.2.2.
 batch processing should be a resumable job queue, not a one-shot script. the user should not need to remember offsets, completed chunks, failed batches, or which comments remain.
 ** Acceptance Criteria
@@ -200,6 +200,46 @@ batch processing should  be a resumable job queue, not a one-shot script. the us
   - resume from status.json
   - remaining-comment detection

+** notes
+- analysis/gpt4o/tokenizer.py: new standalone script; imports analysis_batch for MODEL_LIMITS, estimate_tokens, build_messages. Reads the input JSONL and prompt, computes a per-model jobs/cost/time table, and writes report.json to the input file's directory. MODEL_PRICING dict lives here (not in analysis_batch).
+- analysis/gpt4o/analysis_batch.py: fully rewritten with four subcommands: create, submit, status, download. No longer uses REQUESTS_DIR / RAW_DIR / RUNS_DIR.
+- Job directories: analysis/gpt4o/jobs/<stem[:8]>-N/ (e.g. f452-1). Each run is self-contained: forum.jsonl, prompt.txt, report.json, jobN-input.jsonl, jobN-output-raw.jsonl, jobN-output.jsonl, jobN-errors.jsonl.
+- status.json: tracks all jobs with pending/submitted/in_progress/completed/failed states. Updated by submit, status, download.
+- _find_next_eligible_job: pure function for testability. Returns (next_pending_job, None) or (None, warning). Blocks submission if previous job is in_progress/submitted.
+- create: no API key required. Reads report.json, re-chunks comments, writes all jobN-input.jsonl files, writes status.json.
+- submit: uploads jobN-input.jsonl to Files API, creates batch, updates status.json to 'submitted'. Will not stack batches.
+- status: retrieves batch from OpenAI, updates status.json counts and status.
+- download: auto-runs status first, downloads output_file_id → jobN-output-raw.jsonl, error_file_id → jobN-errors.jsonl, normalizes → jobN-output.jsonl. Updates status.json.
+- tests/test_tokenizer.py: 15 tests for compute_report schema, cost/time calculation, MODEL_PRICING coverage, print_table output, report.json round-trip.
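Review note: the eligibility rule described for _find_next_eligible_job can be sketched as a pure function over the job list. This is a minimal illustration, not the code from analysis_batch.py: the `id` and `status` field names, the warning wording, and the choice to skip over failed jobs are all assumptions; only the return shape and the blocking rule come from the notes above.

```python
from __future__ import annotations

from typing import Optional, Tuple

# States that block a new submission, per the notes ("will not stack batches").
BLOCKING = {"submitted", "in_progress"}


def find_next_eligible_job(jobs: list[dict]) -> Tuple[Optional[dict], Optional[str]]:
    """Return (next_pending_job, None) or (None, warning).

    `jobs` is assumed to be the ordered job list loaded from status.json,
    each entry carrying hypothetical "id" and "status" keys.
    """
    # Refuse to submit while any earlier batch is still running.
    for job in jobs:
        if job["status"] in BLOCKING:
            return None, f"job {job['id']} is {job['status']}; wait before submitting"

    # Otherwise hand back the first job that has not been submitted yet.
    for job in jobs:
        if job["status"] == "pending":
            return job, None

    return None, "no pending jobs remain"
```

Because the function takes plain data and does no I/O, a test can cover it with hand-built lists and no API key or job directory.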
+
+*** usage
+#+begin_src sh
+# 1. estimate tokens and cost
+python analysis/gpt4o/tokenizer.py output/f452.jsonl --prompt analysis/prompt-1.txt
+# writes output/report.json
+
+# 2. create job directory (no api key needed)
+python analysis/gpt4o/analysis_batch.py create output/report.json --model gpt-4o-mini
+# creates analysis/gpt4o/jobs/f452-1/
+
+# 3. submit first job
+python analysis/gpt4o/analysis_batch.py submit
+
+# 4. check status (repeat until completed)
+python analysis/gpt4o/analysis_batch.py status
+
+# 5. download and normalize
+python analysis/gpt4o/analysis_batch.py download
+
+# 6. submit next job (if multi-job run), then repeat 4-5
+python analysis/gpt4o/analysis_batch.py submit
+#+end_src
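Review note: step 2 creates a directory named `<stem[:8]>-N`. One plausible way to pick N is to scan the jobs root for existing runs with the same prefix and take the next integer. The sketch below assumes that behaviour; `next_job_dir` and its parameters are hypothetical names, and the actual create subcommand may choose N differently.

```python
import re
from pathlib import Path


def next_job_dir(jobs_root: Path, input_path: Path) -> Path:
    """Return jobs_root/<stem[:8]>-N, where N is one past the highest existing run."""
    prefix = input_path.stem[:8]  # e.g. "f452" for output/f452.jsonl
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)$")

    used = []
    if jobs_root.is_dir():
        for child in jobs_root.iterdir():
            match = pattern.match(child.name)
            if child.is_dir() and match:
                used.append(int(match.group(1)))

    # The first run gets -1; later runs increment past the highest N seen.
    return jobs_root / f"{prefix}-{max(used, default=0) + 1}"
```

The function only computes the path; creating the directory is left to the caller so that the naming logic stays testable without side effects.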
+
+** evidence
+- commit:
+- tests: passing (pytest tests/analysis_gpt4o_batch.py tests/test_tokenizer.py)
+- datetime: [2026-05-05 Tue]
+
 * === Backlog ===
 * [ ] X: analysis validation view
 create a lightweight validation script that joins raw comments to normalized analysis output and writes a human-reviewable csv.