diff --git a/README.md b/README.md
index 773a8d2..21a97b1 100644
--- a/README.md
+++ b/README.md
@@ -1,17 +1,3 @@
-# Table of Contents
-
-1. [Project Goals](#org2da6874)
-    1. [Research questions](#org1a2b8b3)
-    2. [Architecture](#orgfabfcd9)
-        1. [Scraper](#org2c5c7a2)
-        2. [Analysis](#org72990f4)
-        3. [Storage](#org58a5b72)
-    3. [Instructions](#org24fe465)
-1. [Roadmap](#org5739d49)
-
-
-
-
 
 ## Project Goals
 
@@ -21,7 +7,7 @@
 2. Make data and insights broadly available.
 3. Generalize to other public comment tools.
 
-
+![img](./docs/streamlit-snapshot.svg)
 
 ### Research questions
 
@@ -66,9 +52,9 @@ Scrapy provides a simple mechanism for retrieving, parsing, and saving content f
 
 Google and Amazon both return generic sentiment (tone of writing: positive/negative), not stance (for/against the regulation): "I strongly believe the government should NOT interfere" is negative tone but "against" the regulation. We add the proposed change as context to the model.
 
-Before sending the comments for sentiment analysis, \`tokenizer.py\` receives the forum to be processed and prompt as inputs, then generates a \`report.json\` estimating tokens (tiktoken), cost, and time to run for multiple models.
+Before sending the comments for sentiment analysis, `tokenizer.py` receives the forum to be processed and prompt as inputs, then generates a `report.json` estimating tokens (tiktoken), cost, and time to run for multiple models.
 
-Then, the batch processing scripts uses the \`report.json\` to create multiple jobs, with subcommands to download and check their status.
+Then, the batch processing scripts uses the `report.json` to create multiple jobs, with subcommands to download and check their status.
 
 We selected gpt-5.4-mini for a good balance of quality, cost, and time.
 