* [X] t1.1: scrape one forum (1)
Use https://www.townhall.virginia.gov/L/comments.cfm?GDocForumID=452 as the first forum. Scraper should be run manually at this step.
ViewComments (townhall.virginia.gov/L/ViewComments.cfm?CommentID=#) appears to be a raw list of all comments on a forum - could be useful later for a full scrape.
For the per-forum view-all, append the forum ID (townhall.virginia.gov/L/ViewComments.cfm?GdocForumID=452).
Comments are loaded from the backend via a JS-triggered button (AJAX?).
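A minimal sketch of the per-forum view-all URL noted above. The `https://www.` prefix is assumed from the first forum URL, and `view_all_url` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import urlencode

# Assumption: same scheme and host as the comments.cfm URL above.
VIEW_ALL_BASE = "https://www.townhall.virginia.gov/L/ViewComments.cfm"


def view_all_url(forum_id: int) -> str:
    """Build the view-all URL for one forum by appending its forum ID."""
    return f"{VIEW_ALL_BASE}?{urlencode({'GdocForumID': forum_id})}"
```

For example, `view_all_url(452)` gives the view-all URL for the first forum.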
** acceptance criteria
1. run manual scraper
   1. store proposal title and description
   2. store comment title, commenter, date
   3. store relevant metadata
2. friendly/polite scraping
** notes
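One possible shape for the acceptance criteria above, as a sketch only: records for the stored fields plus a throttle that enforces a minimum gap between requests. All class and field names here (`Comment`, `ForumRecord`, `Throttle`, `min_interval`) are hypothetical, and the delay value is left to the caller.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Comment:
    title: str
    commenter: str
    date: str  # kept as scraped text; date parsing is left open


@dataclass
class ForumRecord:
    forum_id: int
    proposal_title: str
    proposal_description: str
    comments: list[Comment] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)


class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the throttle can be tested
        self._sleep = sleep
        self._last = None    # time the previous request was released

    def wait(self) -> float:
        """Sleep if the last request was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `throttle.wait()` before each request keeps a manual run sequential and spaced out.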
** evidence
- commit: beb5cf4
- tests: 7 passing (pytest tests/)
- datetime: 2026-05-05 12:26
* [ ] t1.2: initial analysis pipeline
Write a simple pipeline for both models in the acceptance criteria below. Prefer that it not run concurrently with the scraping run: it should be run manually, separate from the scraper. You may use scrapy, but are not required to.
** acceptance criteria
1. run manual sentiment analysis of selected file against haiku
2. run manual sentiment analysis of selected file against gpt-4o
** notes
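A sketch of a sequential pipeline for the criteria above, with no real API calls. Assumptions: the selected file is JSON Lines with `title` and `text` fields, and each model is wrapped in a caller-supplied function that maps comment text to a label. `parse_comments`, `run_sentiment`, and the backend names are hypothetical.

```python
import json
from typing import Callable, Iterable

# A backend takes comment text and returns a sentiment label.
Classifier = Callable[[str], str]


def parse_comments(lines: Iterable[str]) -> list[dict]:
    """Parse JSON Lines input, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def run_sentiment(comments: list[dict], backends: dict[str, Classifier]) -> list[dict]:
    """Run every backend over every comment, one call at a time."""
    results = []
    for comment in comments:
        row = {"title": comment.get("title", "")}
        for name, classify in backends.items():
            row[name] = classify(comment.get("text", ""))
        results.append(row)
    return results
```

A manual run would open the selected file, pass it to `parse_comments`, and call `run_sentiment` with one entry per model, e.g. `{"haiku": ..., "gpt-4o": ...}`, where each value wraps that provider's client.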
** evidence
- commit:
- tests:
- datetime: