Project Goals

Document and analyze sentiment of public comments on Virginia law, to determine:
1. the utility of this forum as a mechanism for public comment, and
2. the impact of this forum on Virginia regulation.
Make data and insights broadly available.
Generalize to other public comment tools.

Architecture

Scrape/Parse: Scrapy for downloading comments
Storage: JSONL
Sentiment analysis: Claude Haiku
Display: TBD

Scraper

Scrapy provides a simple mechanism for browsing and downloading the forum's pages, which come in three types:

Forums listing page: `Forums.cfm` - lists all open forums with agency, regulation title, action type, brief description, closing date, and comment count
Comment listing page: `comments.cfm?GDocForumID=X`, `comments.cfm?stageid=X`, or `comments.cfm?petitionid=X` - lists comments with title, author, and date
Individual comment page: `viewcomments.cfm?commentid=X` - shows the regulation title and brief description at the top, followed by the comment
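
As a rough sketch of how a spider callback might tell these three page types apart, the helper below classifies a URL by file name and pulls out its ID parameter. The function name `classify_url`, the `PAGE_TYPES` table, and the page-type labels are hypothetical; only the file names and query parameters come from the list above. It uses the standard library only, so it can be tried without Scrapy installed.

```python
from urllib.parse import urlparse, parse_qs

# Maps a page's file name (lower-cased) to a label and the query
# parameters that can identify it. Labels are illustrative only.
PAGE_TYPES = {
    "forums.cfm": ("forum_list", ()),
    "comments.cfm": ("comment_list", ("gdocforumid", "stageid", "petitionid")),
    "viewcomments.cfm": ("comment", ("commentid",)),
}


def classify_url(url):
    """Return (page_type, id_name, id_value), or None for an unrecognized URL."""
    parsed = urlparse(url)
    filename = parsed.path.rsplit("/", 1)[-1].lower()
    if filename not in PAGE_TYPES:
        return None
    page_type, id_names = PAGE_TYPES[filename]
    # Lower-case parameter names so GDocForumID and gdocforumid match alike.
    params = {k.lower(): v[0] for k, v in parse_qs(parsed.query).items()}
    for name in id_names:
        if name in params:
            return (page_type, name, params[name])
    if id_names:
        return None  # a comment page without an ID is not usable
    return (page_type, None, None)
```

Keeping the ID name alongside its value matters because the same `comments.cfm` page is reached through three different parameters, and the numbers may overlap between them.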

Storage

One JSONL file per forum/bill.
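
A minimal sketch of the one-file-per-forum layout, assuming each comment is stored as one JSON object per line. The helper names, the file naming scheme (`<id_name>_<id_value>.jsonl`), and the record fields in the usage note are assumptions for illustration, not a settled schema.

```python
import json
from pathlib import Path


def forum_path(data_dir, id_name, id_value):
    """Build the JSONL path for one forum, e.g. data/stageid_123.jsonl."""
    return Path(data_dir) / f"{id_name}_{id_value}.jsonl"


def append_comment(path, comment):
    """Append one comment (a dict) as a single JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(comment, ensure_ascii=False) + "\n")


def read_comments(path):
    """Load every comment in a forum file, skipping blank lines."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `append_comment(forum_path("data", "stageid", "123"), {"commentid": "7", "title": "...", "text": "..."})` adds one line to `data/stageid_123.jsonl`. Appending line by line means a scrape that stops partway still leaves a readable file.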

Analysis

Google and Amazon both return generic sentiment (tone of writing: positive/negative), not stance (for/against the regulation). For example, "I strongly believe the government should NOT interfere" is negative in tone, but what matters here is that the author is against the regulation. We will pass the forum/bill title along with each comment and cache the full text of the proposed change, perhaps as fallback context.
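
To make the stance-versus-tone distinction concrete, here is one possible shape for the prompt and for parsing the reply. The prompt wording, the function names, and the three labels are assumptions for illustration; the call to the model itself is left out, since any client could be used to send the prompt.

```python
STANCES = ("for", "against", "neutral")


def build_stance_prompt(title, description, comment):
    """Assemble a prompt that asks about stance, not tone."""
    return (
        "You are classifying a public comment on a proposed regulation.\n"
        f"Regulation title: {title}\n"
        f"Brief description: {description}\n"
        f"Comment: {comment}\n"
        "Is the commenter for or against the regulation, or neutral? "
        "Answer with exactly one word: for, against, or neutral."
    )


def parse_stance(reply):
    """Map a model reply onto one of STANCES; return None if it is unclear."""
    word = reply.strip().lower().strip(".!\"'")
    return word if word in STANCES else None
```

Returning `None` for anything other than a clean one-word answer keeps ambiguous replies out of the counts, so they can be retried or reviewed by hand.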

| Tool | Output | Understands context | Handles sarcasm | Context window | Cost per 1k comments (USD) |
| --- | --- | --- | --- | --- | --- |
| Google NL API | -1→+1, magnitude | No/generic | Poorly | No | ~$1–2 |
| Amazon Comprehend | Pos/Neg/Neutral/Mixed | No/generic | Poorly | No | ~$0.10 |
| Claude Haiku | Prompted → for/against/neutral | Yes | Yes, with prompt | Yes | ~$0.10–0.30 |
| GPT-4o-mini | Prompted → same | Yes | Yes | Yes | ~$0.05–0.15 |
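
The per-1k figures in the table above scale linearly, so a rough budget for a full scrape is simple arithmetic. The sketch below copies the approximate ranges from the table; the function name and the comment count in the usage note are hypothetical.

```python
# (low, high) USD per 1,000 comments, copied from the table above.
# These are the document's rough estimates, not quoted prices.
COST_PER_1K = {
    "Google NL API": (1.00, 2.00),
    "Amazon Comprehend": (0.10, 0.10),
    "Claude Haiku": (0.10, 0.30),
    "GPT-4o-mini": (0.05, 0.15),
}


def estimate_cost(tool, n_comments):
    """Return the (low, high) estimated cost in USD for n_comments."""
    low, high = COST_PER_1K[tool]
    scale = n_comments / 1000
    return (low * scale, high * scale)
```

For instance, `estimate_cost("Claude Haiku", 20000)` gives a range of roughly $2 to $6 for a hypothetical 20,000 comments.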

Roadmap

Scrape one forum
Compare sentiment models
Display
Scrape all data
Scale?
