From 02964312cb070ba8e99bb72ee520467d094f3616 Mon Sep 17 00:00:00 2001
From: eulaly
Date: Tue, 5 May 2026 11:38:57 -0400
Subject: [PATCH] update readme

---
 README.md | 50 ++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index c5b07fa..862871a 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,19 @@
 
 # Table of Contents
 
-1. [Project Goals](#org863a759)
-2. [Architecture](#orgcd91fd0)
-    1. [Scraper](#org3256ad3)
-    2. [Storage](#org7a9a92c)
-    3. [Analysis](#org6ed72dc)
-3. [Roadmap](#org416f14d)
+1. [Project Goals](#org5acb669)
+    1. [Document and analyze sentiment](#org9291576)
+    2. [Make data available](#org8054421)
+    3. [Generalize](#orgdda4b6f)
+2. [Architecture](#org1d6bc40)
+    1. [Scraper](#org4298028)
+    2. [Storage](#org1cd413c)
+    3. [Analysis](#orgaea450e)
+3. [Roadmap](#org6b7660d)
 
 
 
-<a id="org863a759"></a>
+<a id="org5acb669"></a>
 
 # Project Goals
 
@@ -21,7 +24,30 @@
 3. Generalize to other public comment tools.
 
 
-<a id="orgcd91fd0"></a>
+<a id="org9291576"></a>
+
+## Document and analyze sentiment
+
+- Scrape the data, parse, clean, and store. Clearly separate scraper from sentiment analyzer for maximum auditability.
+- Build tests for identifying abuse, such as spam and account fraud
+- Identify any patterns connecting measured sentiment against VA decisions
+
+
+<a id="org8054421"></a>
+
+## Make data available
+
+- Pick a good visualization tool
+
+
+<a id="orgdda4b6f"></a>
+
+## Generalize
+
+- Identify scalable ways to apply this toolset to similar problems
+
+
+<a id="org1d6bc40"></a>
 
 # Architecture
 
@@ -31,7 +57,7 @@
 4. Display: TBD
 
 
-<a id="org3256ad3"></a>
+<a id="org4298028"></a>
 
 ## Scraper
 
@@ -42,14 +68,14 @@ Scrapy provides a simple mechanism for browsing and
 3. Individual comment page: `viewcomments.cfm?commentid=X`
     - shows regulation title + brief description at the top, plus the comment
 
-<a id="org7a9a92c"></a>
+<a id="org1cd413c"></a>
 
 ## Storage
 
 One JSONL file per forum/bill.
 
 
-<a id="org6ed72dc"></a>
+<a id="orgaea450e"></a>
 
 ## Analysis
 
@@ -121,7 +147,7 @@ Google and Amazon both return generic sentiment (tone of writing: positive/negat
 
 
 
-<a id="org416f14d"></a>
+<a id="org6b7660d"></a>
 
 # Roadmap
 