updated readme

2026-05-09 00:16:44 -04:00
parent 771f11fd3c
commit 25a17cb691
1 changed files with 17 additions and 15 deletions
--- a/README.md
+++ b/README.md
@@ -110,30 +110,32 @@ We selected gpt-5.4-mini for a good balance of quality, cost, and time.

 ## Instructions

-1.  Scrape the forum.  
-    `python`  
-2.  Run model report.  
+1.  Clone repo and install dependencies:
+    `python -m pip install -r requirements.txt`
+2.  Scrape the forum based on the ID in the URL.  
+    `scrapy crawl forum -a forum_id=<forum_id> -s LOG_LEVEL=WARNING 2>&1`  
+3.  Run model report.  
    `python analysis/tokenizer.py <input> --prompt <prompt>`  
-3.  To run a realtime subset:  
+4.  To run a realtime subset:  
    `python analysis/openai_realtime.py <input> --prompt <prompt> --model <model> --limit <N comments>`  
    `python analysis/openai_realtime.py output/f452.jsonl --prompt prompt-1.txt --model gpt-4o-mini --limit 10`  
-4.  To create and run the whole thing in batches, first create the batch jobs from the report:  
+5.  To create and run the whole thing in batches, first create the batch jobs from the report:  
    `python analysis/openai_batch.py create <report> --model <model>`  
    `python analysis/openai_batch.py create ./reports/f452-1.json --model gpt-5.4-mini`  
-5.  Then, run the jobs sequentially. Don't submit more than one at a time, if the model fills up the batch will fail and resubmission is not implemented.  
-    `python analysis/openai<sub>batch.py</sub> submit`  
-    `python analysis/openai<sub>batch.py</sub> status`  
-    `python analysis/openai<sub>batch.py</sub> download`  
-    `python analysis/openai<sub>batch.py</sub> submit`  
+6.  Then, run the jobs sequentially. Don't submit more than one at a time, if the model fills up the batch will fail and resubmission is not implemented.  
+    `python analysis/openai_batch.py</sub> submit`  
+    `python analysis/openai_batch.py</sub> status`  
+    `python analysis/openai_batch.py</sub> download`  
+    `python analysis/openai_batch.py</sub> submit`  


 <a id="org5739d49"></a>

 # Roadmap

-1.  Scrape one forum
-2.  Compare sentiment models
-3.  Display
-4.  Scrape all data
-5.  Scale?
+1.  /Done/ Scrape one forum, check sentiment, display
+2.  Test different models
+3.  Build batch runner
+
+