updated readme

2026-05-09 00:16:44 -04:00
parent 771f11fd3c
commit 25a17cb691


@@ -110,30 +110,32 @@ We selected gpt-5.4-mini for a good balance of quality, cost, and time.
## Instructions
1. Scrape the forum.
`python`
2. Run model report.
1. Clone repo and install dependencies:
`python -m pip install -r requirements.txt`
2. Scrape the forum based on the ID in the URL.
`scrapy crawl forum -a forum_id=<forum_id> -s LOG_LEVEL=WARNING 2>&1`
3. Run model report.
`python analysis/tokenizer.py <input> --prompt <prompt>`
3. To run a realtime subset:
4. To run a realtime subset:
`python analysis/openai_realtime.py <input> --prompt <prompt> --model <model> --limit <N comments>`
`python analysis/openai_realtime.py output/f452.jsonl --prompt prompt-1.txt --model gpt-4o-mini --limit 10`
4. To create and run the whole thing in batches, first create the batch jobs from the report:
5. To create and run the whole thing in batches, first create the batch jobs from the report:
`python analysis/openai_batch.py create <report> --model <model>`
`python analysis/openai_batch.py create ./reports/f452-1.json --model gpt-5.4-mini`
5. Then, run the jobs sequentially. Don't submit more than one at a time: if the model fills up, the batch will fail, and resubmission is not implemented.
`python analysis/openai_batch.py submit`
`python analysis/openai_batch.py status`
`python analysis/openai_batch.py download`
`python analysis/openai_batch.py submit`
6. Then, run the jobs sequentially. Don't submit more than one at a time: if the model fills up, the batch will fail, and resubmission is not implemented.
`python analysis/openai_batch.py submit`
`python analysis/openai_batch.py status`
`python analysis/openai_batch.py download`
`python analysis/openai_batch.py submit`
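
The submit / status / download cycle above could be scripted so that only one job is in flight at a time. The following is a minimal sketch, not part of the repo. It assumes that each `submit` call submits the next pending job, that the number of jobs is known in advance, and that the caller can tell from the text printed by `status` whether the current job has finished. The output format of `status` is not documented here, so that check is passed in as a hypothetical `is_done` function. The names `run_step` and `run_jobs_sequentially` are illustrative only.

```python
import subprocess
import sys
import time


def run_step(step):
    """Run one openai_batch.py subcommand and return its stdout.

    Assumes the script lives at analysis/openai_batch.py relative to the
    current working directory, as in the commands above.
    """
    cmd = [sys.executable, "analysis/openai_batch.py", step]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def run_jobs_sequentially(num_jobs, is_done, run=run_step,
                          poll_seconds=60, sleep=time.sleep):
    """Submit jobs one at a time, waiting for each to finish.

    num_jobs: how many batch jobs were created (assumed known).
    is_done:  hypothetical predicate that inspects the text printed by
              `status` and returns True once the job has completed.
    run:      callable that executes a subcommand; injectable for testing.
    """
    for _ in range(num_jobs):
        run("submit")
        # Poll until the caller-supplied check says the job is finished.
        while not is_done(run("status")):
            sleep(poll_seconds)
        run("download")
```

For example, if `status` happened to print the word "completed" for a finished job (an assumption), a call might look like `run_jobs_sequentially(3, lambda out: "completed" in out)`. Because resubmission is not implemented, a failed subcommand raises `subprocess.CalledProcessError` and stops the loop rather than retrying.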
<a id="org5739d49"></a>
# Roadmap
1. Scrape one forum
2. Compare sentiment models
3. Display
4. Scrape all data
5. Scale?
1. /Done/ Scrape one forum, check sentiment, display
2. Test different models
3. Build batch runner