From 25a17cb691c49ed1cc7036a693cfe6c8f265f21e Mon Sep 17 00:00:00 2001
From: eulaly
Date: Sat, 9 May 2026 00:16:44 -0400
Subject: [PATCH] updated readme

---
 README.md | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/README.md b/README.md
index 1d897c1..849b20e 100644
--- a/README.md
+++ b/README.md
@@ -110,30 +110,32 @@ We selected gpt-5.4-mini for a good balance of quality, cost, and time.
 ## Instructions
-1. Scrape the forum.
-   `python`
-2. Run model report.
+1. Clone repo and install dependencies:
+   `python -m pip install -r requirements.txt`
+2. Scrape the forum based on the ID in the URL.
+   `scrapy crawl forum -a forum_id= -s LOG_LEVEL=WARNING 2>&1`
+3. Run model report.
    `python analysis/tokenizer.py --prompt `
-3. To run a realtime subset:
+4. To run a realtime subset:
    `python analysis/openai_realtime.py --prompt --model --limit `
    `python analysis/openai_realtime.py output/f452.jsonl --prompt prompt-1.txt --model gpt-4o-mini --limit 10`
-4. To create and run the whole thing in batches, first create the batch jobs from the report:
+5. To create and run the whole thing in batches, first create the batch jobs from the report:
    `python analysis/openai_batch.py create --model `
    `python analysis/openai_batch.py create ./reports/f452-1.json --model gpt-5.4-mini`
-5. Then, run the jobs sequentially. Don't submit more than one at a time, if the model fills up the batch will fail and resubmission is not implemented.
-   `python analysis/openaibatch.py submit`
-   `python analysis/openaibatch.py status`
-   `python analysis/openaibatch.py download`
-   `python analysis/openaibatch.py submit`
+6. Then, run the jobs sequentially. Don't submit more than one at a time, if the model fills up the batch will fail and resubmission is not implemented.
+   `python analysis/openai_batch.py submit`
+   `python analysis/openai_batch.py status`
+   `python analysis/openai_batch.py download`
+   `python analysis/openai_batch.py submit`
 # Roadmap
-1. Scrape one forum
-2. Compare sentiment models
-3. Display
-4. Scrape all data
-5. Scale?
+1. /Done/ Scrape one forum, check sentiment, display
+2. Test different models
+3. Build batch runner
+
+
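
Review note: step 6 of the updated instructions asks for batch jobs to be run one at a time (`submit`, `status`, `download`, then the next `submit`). A minimal sketch of how that ordering could be enforced from Python follows. Only the script path and the three subcommand names come from the README. The helper names, the `is_finished` callable, and the reliance on a non-zero exit code to signal failure are assumptions, because the patch does not say how `status` reports completion.

```python
import subprocess
import sys

# Script path as written in the updated README steps.
SCRIPT = "analysis/openai_batch.py"
STEPS = ("submit", "status", "download")


def batch_command(step):
    """Return the argv list for one batch step (hypothetical helper)."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    return [sys.executable, SCRIPT, step]


def run_step(step, runner=subprocess.run):
    """Run one step; check=True raises if the script exits non-zero.

    Assumption: the script signals failure through its exit code.
    """
    return runner(batch_command(step), check=True)


def run_one_job(is_finished, runner=subprocess.run):
    """Submit a single job and download it only once it is finished.

    is_finished is a caller-supplied callable (hypothetical), for example
    a prompt asking the operator whether `status` showed the job as done.
    Returns True if the download step ran, False otherwise.
    """
    run_step("submit", runner)
    run_step("status", runner)
    if not is_finished():
        return False
    run_step("download", runner)
    return True
```

Calling `run_one_job(lambda: input("finished? [y/N] ") == "y")` once per job keeps a second `submit` from going out before the previous job has been downloaded.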