diff --git a/README.md b/README.md
index 862871a..7767f69 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,20 @@
# Table of Contents
-1. [Project Goals](#org5acb669)
- 1. [Document and analyze sentiment](#org9291576)
- 2. [Make data available](#org8054421)
- 3. [Generalize](#orgdda4b6f)
-2. [Architecture](#org1d6bc40)
- 1. [Scraper](#org4298028)
- 2. [Storage](#org1cd413c)
- 3. [Analysis](#orgaea450e)
-3. [Roadmap](#org6b7660d)
+ 1. [Project Goals](#orgf37a106)
+ 1. [Research questions](#orgec50d46)
+ 2. [Architecture](#org7a5389e)
+ 1. [Scraper](#org7771df2)
+ 2. [Analysis](#org16a9e36)
+ 3. [Storage](#org7341391)
+ 3. [Instructions](#org692b2f6)
+1. [Roadmap](#org9f21934)
-
+
-# Project Goals
+## Project Goals
1. Document and analyze sentiment of public comments on Virginia law, to determine:
1. the utility of this forum as a mechanism for public comment, and
@@ -24,130 +23,128 @@
3. Generalize to other public comment tools.
-
+
-## Document and analyze sentiment
+### Research questions
-- Scrape the data, parse, clean, and store. Clearly separate scraper from sentiment analyzer for maximum auditability.
-- Build tests for identifying abuse, such as spam and account fraud
-- Identify any patterns connecting measured sentiment against VA decisions
+1. What is the quality of the comments on the forum?
+ 1. Are there duplicate entries?
+ 2. Are there non-human-generated entries?
+ 3. Are there entries intended to abuse the forum or drown out comment?
+2. How do commenters feel about the proposed change?
+ 1. What is the total number and percent supporting vs opposing, and how does this change over time?
+ 2. What is the type of support, such as strong/weak, positive/negative?
+3. What impact do the comments have on the proposed change?
+ (I anticipate this will not be measurable from currently available data)
-
+
-## Make data available
+## Architecture
-- Pick a good visualization tool
+1. Scrape/Parse: Scrapy
+2. Sentiment analysis: gpt-5.4-mini
+3. Display: streamlit
+4. Storage: jsonl, csv, parquet
-
+
-## Generalize
+### Scraper
-- Identify scalable ways to apply this toolset to similar problems
-
-
-
-
-# Architecture
-
-1. Scrape/Parse: ****Scrapy**** for downloading comments
-2. Storage: json
-3. Sentiment analysis: Claude haiku
-4. Display: TBD
-
-
-
-
-## Scraper
-
-Scrapy provides a simple mechanism for browsing and
+Scrapy provides a simple mechanism for retrieving, parsing, and saving content from the forums.
1. Forums listing page: \`Forums.cfm\` - lists all open forums with agency, reg title, action type, brief description, closing date, comment count
2. Comment listing page: \`comments.cfm?GDocForumID=X\` or \`comments.cfm?stageid=X\` or \`comments.cfm?petitionid=X\` - lists comments with title, author, date
3. Individual comment page: \`viewcomments.cfm?commentid=X\` - shows regulation title + brief description at the top, plus the comment
-
+
-## Storage
+### Analysis
-One JSONL file per forum/bill.
+Google NL API and Amazon Comprehend both return generic sentiment (tone of writing: positive/negative), not stance (for/against the regulation). For example, "I strongly believe the government should NOT interfere" is negative in tone, but what matters here is that the commenter is "against" the regulation. To let the model judge stance, we add the proposed change as context.
+
+Before sending the comments for sentiment analysis, \`tokenizer.py\` takes the forum to be processed and the prompt as inputs, then generates a \`report.json\` estimating tokens (tiktoken), cost, and time to run for multiple models.
+
+Then the batch processing script uses \`report.json\` to create multiple jobs, with subcommands to submit them, check their status, and download the results.
+
+We selected gpt-5.4-mini for a good balance of quality, cost, and time.
+
+1. Prompt
+
+ \`\`\`
+ You are an expert policy analyst classifying public comments submitted to the Virginia Town Hall
+ regulatory comment system. You will be given the text of a proposed regulation and a single
+ public comment. Return ONLY a JSON object — no other text.
+
+ Definitions:
+
+ - stance: the commenter's position on whether the regulation should be adopted.
+ "support" = wants it approved (as-is or with changes);
+ "oppose" = wants it rejected or substantially weakened;
+ "neutral" = takes no position, asks a question, or provides factual input only;
+ "unknown" = too vague, off-topic, or uninterpretable to classify.
+ - tone: the emotional register of the writing, independent of stance.
+ "positive" = affirming, hopeful, appreciative;
+ "negative" = angry, fearful, alarmed, or contemptuous;
+ "neutral" = matter-of-fact, procedural, or informational;
+ "mixed" = contains both positive and negative emotional content;
+ "unclear" = tone cannot be determined (e.g., a one-word comment).
+ - stance_confidence: float 0.0-1.0, your confidence in the stance label.
+ - stance_rationale: 1-3 sentences explaining the key evidence; quote specific phrases where possible.
+ - tags: up to 5 short topic labels relevant to the comment's specific concerns (e.g.
+ "parental rights", "student safety", "privacy", "religious freedom", "LGBTQ+ inclusion",
+ "bullying prevention", "school sports", "bathroom access"). Empty array if none apply.
+
+ Return exactly these keys: stance, stance_confidence, stance_rationale, tone, tags.
+ \`\`\`
-
+
-## Analysis
+### Storage
-Google and Amazon both return generic sentiment (tone of writing: positive/negative), not stance (for/against the regulation): "I strongly believe the government should NOT interfere" is negative tone but "against" the regulation. We will run the forum/bill title and cache the entirety of the proposed change, perhaps as a fallback.
-
-
+- Each scraped forum is saved to \`output/.jsonl\`
+- Each report (forum + prompt) is saved to \`reports/.json\`
+- Each job is saved to \`analysis/jobs//\`:
+ └─\`forum.jsonl\` is a copy of the scraped forum for convenience
+ └─\`prompt.txt\` is a copy of the prompt used
+ └─\`report.json\` is a copy of the report used
+ └─\`status.json\` contains metadata about the job
+ For each batch in the job, four files are created:
+ └─\`jobN-input.jsonl\` contains the exact queries sent to the API, for troubleshooting
+ └─\`jobN-output-raw.jsonl\` contains the exact response from the API
+ └─\`jobN-output.jsonl\` contains the exact response from the API
+ └─\`jobN-output-errors.jsonl\` when errors are returned (this file may not exist)
+- Once complete, the cleanup script saves \`review.csv\`, \`review.pqt\`, and \`review.sqlite\` in this folder.
-
-
+
-
+## Instructions
-
-
-
-
-
-
-
-
-
-
-
Tool
-
Output
-
Context
-
Sarcasm
-
Context window
-
Cost/1k comments
-
-
-
-
-
Google NL API
-
-1→+1, magnitude
-
No/generic
-
Poorly
-
No
-
~$1–2
-
-
-
-
Amazon Comprehend
-
Pos/Neg/Neutral/Mixed
-
No/generic
-
Poorly
-
No
-
~$0.10
-
-
-
-
Claude Haiku
-
Prompted → for/against/neutral
-
Yes
-
Yes, with prompt
-
Yes
-
~$0.10–0.30
-
-
-
-
GPT-4o-mini
-
Prompted → same
-
Yes
-
Yes
-
Yes
-
~$0.05–0.15
-
-
-
+1. Scrape the forum.
+ \`python
+2. Run model report.
+ \`python analysis/tokenizer.py --prompt \`
+3. To run a realtime subset:
+ \`python analysis/openai_realtime.py --prompt --model --limit \`
+ \`python analysis/openai_realtime.py output/f452.jsonl --prompt prompt-1.txt --model gpt-4o-mini --limit 10\`
+4. To create and run the whole thing in batches, first create the batch jobs from the report:
+ \`python analysis/openai_batch.py create --model \`
+ \`python analysis/openai_batch.py create ./reports/f452-1.json --model gpt-5.4-mini\`
+5. Then run the jobs sequentially. Don't submit more than one at a time: if the model fills up, the batch will fail, and resubmission is not implemented.
+ \`python analysis/openai_batch.py submit\`
+ Check status:
+ \`python analysis/openai_batch.py status\`
+ When complete, download:
+ \`python analysis/openai_batch.py download\`
+ Submit the next batch after the previous is complete:
+ \`python analysis/openai_batch.py submit\`
-
+
# Roadmap
diff --git a/docs/vatownhall.org b/docs/vatownhall.org
index 128b222..0c12b41 100644
--- a/docs/vatownhall.org
+++ b/docs/vatownhall.org
@@ -1,50 +1,109 @@
#+title: VA Townhall
#+date: [2026-05-05 Tue]
-#+version: 1
+#+version: 1.1
-* Project Goals
+** Project Goals
1. Document and analyze sentiment of public comments on Virginia law, to determine:
1. the utility of this forum as a mechanism for public comment, and
2. the impact of this forum on Virginia regulation.
2. Make data and insights broadly available.
3. Generalize to other public comment tools.
-** Document and analyze sentiment
-- Scrape the data, parse, clean, and store. Clearly separate scraper from sentiment analyzer for maximum auditability.
-- Build tests for identifying abuse, such as spam and account fraud
-- Identify any patterns connecting measured sentiment against VA decisions
-
-** Make data available
-- Pick a good visualization tool
+*** Research questions
+1. What is the quality of the comments on the forum?
+ 1. Are there duplicate entries?
+ 2. Are there non-human-generated entries?
+ 3. Are there entries intended to abuse the forum or drown out comment?
+2. How do commenters feel about the proposed change?
+ 1. What is the total number and percent supporting vs opposing, and how does this change over time?
+ 2. What is the type of support, such as strong/weak, positive/negative?
+3. What impact do the comments have on the proposed change?
+ (I anticipate this will not be measurable from currently available data)
-** Generalize
-- Identify scalable ways to apply this toolset to similar problems
+** Architecture
+1. Scrape/Parse: Scrapy
+2. Sentiment analysis: gpt-5.4-mini
+3. Display: streamlit
+4. Storage: jsonl, csv, parquet
-* Architecture
-1. Scrape/Parse: **Scrapy** for downloading comments
-2. Storage: json
-3. Sentiment analysis: Claude haiku
-4. Display: TBD
-
-** Scraper
-Scrapy provides a simple mechanism for browsing and
+*** Scraper
+Scrapy provides a simple mechanism for retrieving, parsing, and saving content from the forums.
1. Forums listing page: `Forums.cfm` - lists all open forums with agency, reg title, action type, brief description, closing date, comment count
2. Comment listing page: `comments.cfm?GDocForumID=X` or `comments.cfm?stageid=X` or `comments.cfm?petitionid=X` - lists comments with title, author, date
3. Individual comment page: `viewcomments.cfm?commentid=X` - shows regulation title + brief description at the top, plus the comment
-** Storage
-One JSONL file per forum/bill.
+*** Analysis
+Google NL API and Amazon Comprehend both return generic sentiment (tone of writing: positive/negative), not stance (for/against the regulation). For example, "I strongly believe the government should NOT interfere" is negative in tone, but what matters here is that the commenter is "against" the regulation. To let the model judge stance, we add the proposed change as context.
-** Analysis
-Google and Amazon both return generic sentiment (tone of writing: positive/negative), not stance (for/against the regulation): "I strongly believe the government should NOT interfere" is negative tone but "against" the regulation. We will run the forum/bill title and cache the entirety of the proposed change, perhaps as a fallback.
+Before sending the comments for sentiment analysis, `tokenizer.py` takes the forum to be processed and the prompt as inputs, then generates a `report.json` estimating tokens (tiktoken), cost, and time to run for multiple models.
-| Tool | Output | Context | Sarcasm | Context window | Cost/1k comments |
-|-------------------+--------------------------------+------------+------------------+----------------+------------------|
-| Google NL API | -1→+1, magnitude | No/generic | Poorly | No | ~$1–2 |
-| Amazon Comprehend | Pos/Neg/Neutral/Mixed | No/generic | Poorly | No | ~$0.10 |
-| Claude Haiku | Prompted → for/against/neutral | Yes | Yes, with prompt | Yes | ~$0.10–0.30 |
-| GPT-4o-mini | Prompted → same | Yes | Yes | Yes | ~$0.05–0.15 |
+Then the batch processing script uses `report.json` to create multiple jobs, with subcommands to submit them, check their status, and download the results.
+We selected gpt-5.4-mini for a good balance of quality, cost, and time.
+
+**** Prompt
+```
+You are an expert policy analyst classifying public comments submitted to the Virginia Town Hall
+regulatory comment system. You will be given the text of a proposed regulation and a single
+public comment. Return ONLY a JSON object — no other text.
+
+Definitions:
+- stance: the commenter's position on whether the regulation should be adopted.
+ "support" = wants it approved (as-is or with changes);
+ "oppose" = wants it rejected or substantially weakened;
+ "neutral" = takes no position, asks a question, or provides factual input only;
+ "unknown" = too vague, off-topic, or uninterpretable to classify.
+- tone: the emotional register of the writing, independent of stance.
+ "positive" = affirming, hopeful, appreciative;
+ "negative" = angry, fearful, alarmed, or contemptuous;
+ "neutral" = matter-of-fact, procedural, or informational;
+ "mixed" = contains both positive and negative emotional content;
+ "unclear" = tone cannot be determined (e.g., a one-word comment).
+- stance_confidence: float 0.0-1.0, your confidence in the stance label.
+- stance_rationale: 1-3 sentences explaining the key evidence; quote specific phrases where possible.
+- tags: up to 5 short topic labels relevant to the comment's specific concerns (e.g.
+ "parental rights", "student safety", "privacy", "religious freedom", "LGBTQ+ inclusion",
+ "bullying prevention", "school sports", "bathroom access"). Empty array if none apply.
+
+Return exactly these keys: stance, stance_confidence, stance_rationale, tone, tags.
+```
+
+
+*** Storage
+- Each scraped forum is saved to `output/.jsonl`
+- Each report (forum + prompt) is saved to `reports/.json`
+- Each job is saved to `analysis/jobs//`:
+ └─`forum.jsonl` is a copy of the scraped forum for convenience
+ └─`prompt.txt` is a copy of the prompt used
+ └─`report.json` is a copy of the report used
+ └─`status.json` contains metadata about the job
+ For each batch in the job, four files are created:
+ └─`jobN-input.jsonl` contains the exact queries sent to the API, for troubleshooting
+ └─`jobN-output-raw.jsonl` contains the exact response from the API
+ └─`jobN-output.jsonl` contains the exact response from the API
+ └─`jobN-output-errors.jsonl` when errors are returned (this file may not exist)
+- Once complete, the cleanup script saves `review.csv`, `review.pqt`, and `review.sqlite` in this folder.
+
+** Instructions
+1. Scrape the forum.
+ `python
+2. Run model report.
+ `python analysis/tokenizer.py --prompt `
+3. To run a realtime subset:
+ `python analysis/openai_realtime.py --prompt --model --limit `
+ `python analysis/openai_realtime.py output/f452.jsonl --prompt prompt-1.txt --model gpt-4o-mini --limit 10`
+4. To create and run the whole thing in batches, first create the batch jobs from the report:
+ `python analysis/openai_batch.py create --model `
+ `python analysis/openai_batch.py create ./reports/f452-1.json --model gpt-5.4-mini`
+5. Then run the jobs sequentially. Don't submit more than one at a time: if the model fills up, the batch will fail, and resubmission is not implemented.
+ `python analysis/openai_batch.py submit`
+ # Check status
+ `python analysis/openai_batch.py status`
+ # When complete, download:
+ `python analysis/openai_batch.py download`
+ # Submit the next batch after the previous is complete:
+ `python analysis/openai_batch.py submit`
+
* Roadmap
1. Scrape one forum
2. Compare sentiment models