initial ytdlp worker

finalized v2 architecture for 2.0.1
2026-03-31 20:54:56 -04:00 · 2026-03-31 20:54:23 -04:00 · 2026-03-31 18:48:10 -04:00 · 2026-03-31 16:53:28 -04:00 · 2026-03-31 16:23:37 -04:00 · 2026-03-31 16:22:46 -04:00
14 changed files with 1184 additions and 141 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,13 +1,34 @@
-*.pyc
-*.org
-*.ps1
-.env
-#*
-.#*
+# --- python bytecode ---
 __pycache__/
+*.py[cod]
+*$py.class
+
+# --- virtual environments ---
+.venv/
 venv/
+env/
+
+# --- environment files ---
+.env
+.env.*
+*.local
+
+# --- emacs ---
+*~
+\#*\#
+.\#*
+*.elc
+
+# --- project private data ---
+/private/
 archive/
-config/
 downloads/
 data.json

+# --- django ---
+db.sqlite3
+staticfiles/
+media/
+
+# --- misc ---
+.DS_Store
--- a/README.md
+++ b/README.md
@@ -1,3 +1,5 @@
+v2 architecture draft: see `docs/architecture-v2.org`
+
 build and run the docker container
 ```
 api_token = [discord bot token]
--- a/agents.md
+++ b/agents.md
@@ -0,0 +1,24 @@
+# agent rules
+
+## priorities
+- optimize for simplicity, boringness, and long-term maintainability
+- prefer minimal diffs; avoid refactors unless required for the active task
+
+## tech stack
+- python
+- file storage: json and csv, no sqlite or databases
+- assume local virtual env is available and accessible
+- do not add new dependencies unless explicitly approved; if unavoidable, document justification in the active task notes
+
+## workflow
+- work on ONE task at a time unless explicitly instructed otherwise
+- at the start of work, state the task id you are executing
+- do not start work unless a task id is specified; if missing, choose the earliest unchecked task and say so
+- propose incremental steps
+- always include basic tests for core logic
+- record assumptions and questions as they arise in the `** notes` section under the active task
+- when you complete a task:
+  - mark it [X] in pm/tasks.org
+  - fill in evidence with commit hash + commands run
+  - never mark complete unless acceptance criteria are met
+  - include date and time (HH:MM)
--- a/default-yt-dlp.conf
+++ b/default-yt-dlp.conf
@@ -1,11 +1,8 @@
-# yt-dlp config file (yt-dlp.conf or .config/yt-dlp/config)
--simulate
+# yt-dlp config file
 -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best"
 --embed-chapters
 --embed-info-json
 --write-playlist-metafiles
-#--paths "{"home":"./downloads"}"
 --download-archive "/config/archive.txt"
 --restrict-filenames
 --output "/downloads/%(uploader)s/%(playlist_title)s/%(playlist_index)s%(playlist_index& - )s%(title)s.%(ext)s"
-# --split-chapters
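The evidence log later in this commit records the effective command as `yt-dlp --config-locations <conf> <url>`. A minimal sketch of how a backend could assemble that argument list, assuming a hypothetical helper `build_command` and a hypothetical `extra_args` parameter for runtime overrides (neither name appears in the commit):

```python
from urllib.parse import urlparse


def build_command(url, config_path, executable="yt-dlp", extra_args=()):
    """Return the argv list for one yt-dlp run (hypothetical helper).

    Assumption: runtime overrides are passed as plain command-line options
    after the config file, so they show up in the logged command.
    """
    parsed = urlparse(url)
    # Reject anything that is not a plain web URL, so a value such as
    # "--exec ..." can never be interpreted as an option.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"unsupported url: {url!r}")

    return [
        executable,
        "--config-locations", str(config_path),
        *extra_args,
        url,
    ]


if __name__ == "__main__":
    print(build_command(
        "https://www.youtube.com/watch?v=3i72yY_LaW4",
        "/home/user/proj/youdis/default-yt-dlp.conf",
    ))
```

Returning a list rather than a shell string lets the same value be passed to `subprocess.run` without a shell and stored verbatim in the job's `command` field for debugging.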
--- a/docs/architecture-v2.org
+++ b/docs/architecture-v2.org
@@ -0,0 +1,248 @@
+#+TITLE: youdis v2 architecture
+#+DATE: [2026-03-31 Tue]
+#+VERSION: 3
+
+* Purpose
+Youdis v2 splits the current monolithic Discord bot into:
+- a private backend worker service that owns yt-dlp execution
+- thin chat adapters that translate user actions into backend requests
+
+The goal is to make downloader behavior inspectable, reusable, and easier to debug across multiple frontends without introducing queueing infrastructure or complex deployment.
+
+** Goals
+- Keep the backend transport-agnostic.
+- Preserve archive-first duplicate prevention via `archive.txt`.
+- Keep single-job semantics explicit.
+- Treat requester and origin metadata as informational, not authorization.
+- Make yt-dlp configuration understandable and debuggable.
+
+** Non-Goals
+- No backend auth or user membership logic.
+- No database, Redis, or external queue.
+- No multi-worker scheduling in v2.
+- No frontend-owned downloader logic.
+
+* Proposed Components
+** Backend worker service
+Owns:
+- request validation
+- single active job state
+- yt-dlp executable invocation
+- yt-dlp configuration selection and runtime argument construction
+- archive behavior
+- output path behavior
+- progress and final result events
+
+Does not own:
+- Discord slash-command parsing
+- chat formatting decisions
+- access control policy
+
+** Frontend adapters
+1. Discord
+2. Zulip
+3. XMPP
+
+Own:
+- command parsing
+- user-facing message formatting
+- transport-specific reply behavior
+- optional frontend-side gating if the deployment wants it
+
+Do not own:
+- yt-dlp option construction
+- archive checks
+- output path logic
+- job lifecycle rules
+
+* Initial Deployment Shape
+
+V2 should start as a single private deployment containing two processes:
+1. backend service process
+2. one Discord adapter process
+
+The backend is intended for private/local deployment. Trust comes from deployment topology and network placement, not backend auth.
+
+** Execution model
+The backend is a thin FastAPI wrapper around the `yt-dlp` executable. It should not become a deep Python integration unless that proves necessary.
+
+Why this is the default:
+- keeps behavior closer to native `yt-dlp`
+- makes container debugging easier because commands can be reproduced manually
+- avoids reimplementing `yt-dlp` option parsing unless we truly need tighter integration
+
+This means the backend is primarily responsible for:
+- deciding when `yt-dlp` may run
+- constructing a safe effective command
+- supervising the subprocess while it is active
+- translating process outcomes into a small API contract
+
+** Backend Contract
+*** Request model
+Minimal fields:
+- `url`
+- `requester_id` optional
+- `requester_name` optional
+- `origin` optional
+- `requested_at` optional
+Only `url` affects downloader behavior in v2. The rest is passthrough metadata for logs and frontend context.
+
+*** Result states
+The backend should keep its external state model deliberately small:
+- `accepted`
+- `busy`
+- `running`
+- `completed`
+- `failed`
+- `cancelled`
+
+These states should be transport-agnostic, human-debuggable, and only distinct when a frontend would behave differently.
+
+Internal phases and details may be richer, but they should usually be carried as metadata rather than promoted to top-level states.
+
+Examples of detail fields that can travel alongside the small state model:
+- `phase=validating`
+- `phase=downloading`
+- `disposition=archive_hit`
+- `message=already in archive`
+- `result_path=/downloads/...`
+
+This keeps the API simple while still leaving room for useful operator-facing detail.
+
+*** Minimal endpoints
+The first backend seam should stay small:
+- `POST /jobs`
+- `GET /jobs/current`
+- `POST /jobs/current/cancel`
+- `GET /health`
+- `GET /version`
+
+Polling is the default integration model for v2. Streaming or push delivery is deferred unless a real frontend need appears.
+
+** Single-Job Semantics
+V2 explicitly supports one active job at a time.
+- If idle, the backend accepts a new job and marks it active.
+- If busy, the backend rejects the new request with a `busy` result.
+- Cancel is coarse and best-effort.
+- The backend clears active state when the job reaches `completed`, `failed`, or `cancelled`.
+
+This is a feature, not a limitation, for the initial service.
+
+* yt-dlp Configuration Ownership
+
+This is the most important v2 cleanup.
+
+The backend owns all yt-dlp invocation behavior. Frontends must not build `yt-dlp` commands or option dictionaries.
+
+** Config split
+
+Static settings belong in `default-yt-dlp.conf`:
+- archive path
+- output template
+- embedding and metadata preferences
+- stable format defaults
+- retry defaults
+
+Runtime settings belong in backend code:
+- target URL
+- config-file path, if explicitly set at launch
+- request-scoped flags or overrides
+- subprocess lifecycle and cancellation behavior
+- any values that must vary per request
+
+*** Merge rule
+The backend should invoke `yt-dlp` with the default config first, then apply a small set of explicit runtime overrides.
+
+If a runtime override conflicts with file-backed config, the override wins and the backend should log the effective value used.
+
+*** Debuggability requirement
+For each job, the backend should make it easy to inspect:
+
+- config file path used
+- effective `yt-dlp` command or key override arguments
+- final normalized job result
+
+This exists specifically because the current config behavior has been difficult to reason about.
+
+*** Config hygiene expectations
+
+The default config should be safe for real downloads in production-like use.
+
+That means test-only settings such as `--simulate` should not remain enabled in the default runtime path unless the backend intentionally supports an explicit dry-run mode.
+
+** Recommended Repo Shape
+One reasonable first cut:
+
+#+begin_quote
+youdis/
+  README.md
+  default-yt-dlp.conf
+  youdis/
+    __init__.py
+    main.py
+    models.py
+    adapters/
+      __init__.py
+      discord.py
+  docs/
+    architecture-v2.org
+#+end_quote
+
+Notes:
+
+- `main.py` owns the FastAPI app, active-job state, and `yt-dlp` subprocess lifecycle.
+- `models.py` owns request and response models.
+- `adapters/discord.py` becomes a thin client of the backend.
+
+** Example response shape
+
+One likely response shape for v2:
+
+#+begin_src json
+{
+  "state": "completed",
+  "disposition": "archive_hit",
+  "message": "already in archive",
+  "job_id": "abc123"
+}
+#+end_src
+
+And for an active job:
+
+#+begin_src json
+{
+  "state": "running",
+  "phase": "downloading",
+  "message": "downloading 3 of 9",
+  "job_id": "abc123"
+}
+#+end_src
+
+** Framework Recommendation
+
+FastAPI is the current recommendation for the backend seam.
+
+Why it fits:
+
+- typed request and response models help define the contract cleanly
+- built-in docs make local inspection easier during early iterations
+- it stays small enough for this service if we keep the surface area disciplined
+
+Why this is not a strong ideological choice:
+
+- Flask would also work if we decide the type/model ergonomics are not worth it
+- the important decision is keeping the seam small, not choosing a fashionable framework
+
+** Immediate Next Task
+
+`2.0.1` should implement the backend skeleton with just enough real behavior to prove the seam:
+
+- package/module layout
+- health and version endpoints
+- single active-job state holder
+- one submit-job path
+- one current-job status path
+- `yt-dlp` subprocess invocation using `default-yt-dlp.conf`
+- explicit runtime overrides for request URL and cancel behavior
+
+The Discord adapter should remain unchanged until that backend skeleton can accept a job and report status coherently.
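The single-job rules in the architecture doc (accept when idle, answer `busy` otherwise, clear on `completed`, `failed`, or `cancelled`) can be sketched with the standard library alone. The class and method names below are hypothetical and are not taken from `youdis/main.py`:

```python
import threading
import uuid
from dataclasses import dataclass, field
from enum import Enum


class State(str, Enum):
    ACCEPTED = "accepted"
    BUSY = "busy"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# States after which the active slot is released.
TERMINAL = {State.COMPLETED, State.FAILED, State.CANCELLED}


@dataclass
class Job:
    url: str
    state: State = State.ACCEPTED
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class ActiveJobHolder:
    """Holds at most one active job (hypothetical class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = None

    def submit(self, url):
        """Return (state, job); state is BUSY when a job is already active."""
        with self._lock:
            if self._active is not None:
                return State.BUSY, self._active
            self._active = Job(url=url)
            return State.ACCEPTED, self._active

    def update(self, job, state):
        """Record a new state and clear the active slot on terminal states."""
        with self._lock:
            job.state = state
            if state in TERMINAL and self._active is job:
                self._active = None

    def current(self):
        """Return the active job, or None when idle."""
        with self._lock:
            return self._active
```

The logged `/jobs/current` response in the task evidence still returns the failed job with `"active":false`, so the real backend evidently also remembers the last finished job; this sketch leaves that out to keep the busy/idle rule visible.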
--- a/pm/task-sample.org
+++ b/pm/task-sample.org
@@ -0,0 +1,81 @@
+#+title: Task Log
+#+updated: [2026-03-31 Tue 16:03]
+
+Use the template below, which should be a top-level org-mode header.
+A sample task follows the template.
+
+* [ ] M.m.m: Task Title (<estimated # of commits>)
+description of the task
+** pm notes: amplifying information
+
+** Acceptance Criteria
+1. Criterion
+   - expanded data
+2. Criterion
+3f. Criterion added after initial task completion   
+
+** evidence
+- commit: abc123, bcd234
+- tests: 
+- datetime: [2026-03-18 Wed 14:15]
+
+** notes    
+- explanation of work done, decisions made, reasoning
+
+  
+* [ ] 2.0.1: build backend yt-dlp worker (3)
+create the minimal backend/service skeleton and establish a working yt-dlp baseline with clean hooks for future frontends
+** pm notes
+- foundation; don't need the full finished service here, just the basic shape plus enough real yt-dlp execution to validate the seam and build on it.
+- keep single-job semantics
+- prioritize inspectable behavior over polish
+
+** Acceptance Criteria
+1. create the backend/service skeleton
+   - app/module layout exists
+   - core request path is stubbed or minimally working
+2. establish a working yt-dlp baseline
+   - archive behavior is preserved
+   - output path behavior is preserved or intentionally updated
+   - use the yt-dlp `.conf` file and set reasonable defaults
+3. expose basic hooks/interfaces for future frontends
+   - submit/request path exists
+   - status/progress hook exists
+   - basic health/version visibility exists
+4f. print env vars to stdout on fastapi launch
+   - repo_root
+   - default_config
+   - ytdlp_executable
+5f. create ytdlp conf to facilitate tests     
+** evidence
+- commit:
+- tests:
+  1. `python3 -m py_compile ./youdis/main.py ./youdis/models.py ./youdis/adapters/__init__.py ./youdis/adapters/discord.py`
+  2. `python3 -m uvicorn youdis.main:app --host 127.0.0.1 --port 8000`
+  3. `sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/download/2026.03.17/yt-dlp -o /usr/local/bin/yt-dlp`
+  4. `sudo chmod +x /usr/local/bin/yt-dlp`
+  5. `yt-dlp --version`
+  6. `curl http://127.0.0.1:8000/health`
+  7. `curl http://127.0.0.1:8000/version`
+  8. `curl -X POST http://127.0.0.1:8000/jobs -H 'content-type: application/json' -d '{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'`
+  9. `curl http://127.0.0.1:8000/jobs/current`
+  10. if testing in container: `docker build -t youdis:v2 .`
+  11. if testing in container: `docker run --rm -p 8000:8000 -v [config]:/config -v [downloads]:/downloads youdis:v2`
+  :OUTPUT_STEP1-9:
+  user@paladin:~/proj/youdis$ curl http://127.0.0.1:8000/health
+{"status":"ok"}user@paladin:~/proj/youdis$
+user@paladin:~/proj/youdis$ curl http://127.0.0.1:8000/version
+{"version":"20250829-ec72c56","active_job":false}user@paladin:~/proj/youdis$
+user@paladin:~/proj/youdis$ curl -X POST http://127.0.0.1:8000/jobs -H 'content-type: application/json' -d '{"url":"https://www.youtube.com/watch?v=3i72yY_LaW4"}'
+{"job_id":"cc85165e-d906-4eee-864f-1a398b6de2e0","state":"accepted","url":"https://www.youtube.com/watch?v=3i72yY_LaW4","message":"accepted","phase":"queued","disposition":null,"requester_id":null,"requester_name":null,"origin":null,"result_path":null,"command":[],"returncode":null,"created_at":"2026-04-01T00:44:35.657196","updated_at":"2026-04-01T00:44:35.657198"}user@paladin:~/proj/youdis$ curl http://127.0.0.1:8000/jobs/current
+{"active":false,"job":{"job_id":"cc85165e-d906-4eee-864f-1a398b6de2e0","state":"failed","url":"https://www.youtube.com/watch?v=3i72yY_LaW4","message":"ERROR: unable to create directory [Errno 13] Permission denied: '/downloads'","phase":null,"disposition":null,"requester_id":null,"requester_name":null,"origin":null,"result_path":null,"command":["yt-dlp","--config-locations","/home/user/proj/youdis/default-yt-dlp.conf","https://www.youtube.com/watch?v=3i72yY_LaW4"],"returncode":1,"created_at":"2026-04-01T00:44:35.657196","updated_at":"2026-03-31T20:44:36.653353"}}user@paladin:~/proj/youdis$
+  :END:
+- datetime: [2026-03-31 Tue 20:45]
+
+** notes
+- validate `yt-dlp` subprocess invocation in-container; not verifiable in the current shell because `yt-dlp` is not installed here
+- confirm `--config-locations` behavior against the installed `yt-dlp` version during integration testing
+- current backend scaffold is not yet wired into `dockerfile` or `run-youdis.sh`
+- archive-hit and result-path parsing currently depend on `yt-dlp` stdout text patterns, so treat them as provisional until integration-tested
+
+  
--- a/pm/tasks-v2.org
+++ b/pm/tasks-v2.org
@@ -0,0 +1,165 @@
+#+title: Task Log
+#+updated: [2026-03-31 Tue 16:03]
+#+startup: overview
+
+* youdis v2 goals:
+1. Separate backend from frontend
+2. Offload auth
+3. Ensure auto nightly builds
+4. Default output format lets Plex browse YouTube channels as "tv shows"
+5. Support multiple chat frontends: Discord, Zulip, XMPP
+
+* [X] 2.0.0: define architecture (2-4)
+define the target architecture for a private backend yt-dlp worker with thin chat frontends
+** pm notes:
+- keep this iterative. the point is to choose the shape and seam, not prematurely implement infra. likely decisions include backend framework, request/status model, and how thin the discord shim should be.
+- goal is simple, private, maintainable deployment
+- avoid auth, queues, or persistence beyond clear & immediate needs
+
+** Acceptance Criteria
+1. document the target architecture at a high level
+   - backend owns yt-dlp execution and job state
+   - frontends own chat-specific UX
+2. identify key decisions still open
+   - backend choice
+   - service seam/endpoints
+   - status/progress model
+3. capture enough structure to begin implementation
+   - repo/component layout is sketched
+   - next implementation task is unblocked
+
+** evidence
+- commit: 0ed16ec
+- tests: n/a
+- datetime: [2026-03-31 Tue 18:48]
+
+** notes
+- first architecture draft captured in `docs/architecture-v2.org`
+
+* [ ] 2.0.1: build backend yt-dlp worker (3)
+create the minimal backend/service skeleton and establish a working yt-dlp baseline with clean hooks for future frontends
+** pm notes
+- foundation; don't need the full finished service here, just the basic shape plus enough real yt-dlp execution to validate the seam and build on it.
+- keep single-job semantics
+- prioritize inspectable behavior over polish
+
+** Acceptance Criteria
+1. create the backend/service skeleton
+   - app/module layout exists
+   - core request path is stubbed or minimally working
+2. establish a working yt-dlp baseline
+   - archive behavior is preserved
+   - output path behavior is preserved or intentionally updated
+   - use the yt-dlp `.conf` file and set reasonable defaults
+3. expose basic hooks/interfaces for future frontends
+   - submit/request path exists
+   - status/progress hook exists
+   - basic health/version visibility exists
+4f.      
+
+** evidence
+- commit:
+- tests:
+  1. `python3 -m py_compile ./youdis/main.py ./youdis/models.py ./youdis/adapters/__init__.py ./youdis/adapters/discord.py`
+  2. `python3 -m uvicorn youdis.main:app --host 127.0.0.1 --port 8000`
+  3. `sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/download/2026.03.17/yt-dlp -o /usr/local/bin/yt-dlp`
+  4. `sudo chmod +x /usr/local/bin/yt-dlp`
+  5. `yt-dlp --version`
+  6. `curl http://127.0.0.1:8000/health`
+  7. `curl http://127.0.0.1:8000/version`
+  8. `curl -X POST http://127.0.0.1:8000/jobs -H 'content-type: application/json' -d '{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'`
+  9. `curl http://127.0.0.1:8000/jobs/current`
+  10. if testing in container: `docker build -t youdis:v2 .`
+  11. if testing in container: `docker run --rm -p 8000:8000 -v [config]:/config -v [downloads]:/downloads youdis:v2`
+  :OUTPUT_STEP1-9:
+  user@paladin:~/proj/youdis$ curl http://127.0.0.1:8000/health
+{"status":"ok"}user@paladin:~/proj/youdis$
+user@paladin:~/proj/youdis$ curl http://127.0.0.1:8000/version
+{"version":"20250829-ec72c56","active_job":false}user@paladin:~/proj/youdis$
+user@paladin:~/proj/youdis$ curl -X POST http://127.0.0.1:8000/jobs -H 'content-type: application/json' -d '{"url":"https://www.youtube.com/watch?v=3i72yY_LaW4"}'
+{"job_id":"cc85165e-d906-4eee-864f-1a398b6de2e0","state":"accepted","url":"https://www.youtube.com/watch?v=3i72yY_LaW4","message":"accepted","phase":"queued","disposition":null,"requester_id":null,"requester_name":null,"origin":null,"result_path":null,"command":[],"returncode":null,"created_at":"2026-04-01T00:44:35.657196","updated_at":"2026-04-01T00:44:35.657198"}user@paladin:~/proj/youdis$ curl http://127.0.0.1:8000/jobs/current
+{"active":false,"job":{"job_id":"cc85165e-d906-4eee-864f-1a398b6de2e0","state":"failed","url":"https://www.youtube.com/watch?v=3i72yY_LaW4","message":"ERROR: unable to create directory [Errno 13] Permission denied: '/downloads'","phase":null,"disposition":null,"requester_id":null,"requester_name":null,"origin":null,"result_path":null,"command":["yt-dlp","--config-locations","/home/user/proj/youdis/default-yt-dlp.conf","https://www.youtube.com/watch?v=3i72yY_LaW4"],"returncode":1,"created_at":"2026-04-01T00:44:35.657196","updated_at":"2026-03-31T20:44:36.653353"}}user@paladin:~/proj/youdis$
+  :END:
+- datetime: [2026-03-31 Tue 20:45]
+
+** notes
+- validate `yt-dlp` subprocess invocation in-container; not verifiable in the current shell because `yt-dlp` is not installed here
+- confirm `--config-locations` behavior against the installed `yt-dlp` version during integration testing
+- current backend scaffold is not yet wired into `dockerfile` or `run-youdis.sh`
+- archive-hit and result-path parsing currently depend on `yt-dlp` stdout text patterns, so treat them as provisional until integration-tested
+
+* [ ] 2.0.2: update discord bot to use new backend (3)
+update the discord bot into a thin frontend that talks to the backend and verify the flow end to end
+** pm notes
+- this is the first real frontend proof. once this works cleanly, zulip/xmpp should mostly be adapter work rather than downloader rewrites.
+- keep discord logic thin; no auth
+- do not duplicate yt-dlp behavior in the bot
+
+** Acceptance Criteria
+1. discord bot submits requests to backend
+   - command/input handling works
+   - acceptance/busy/failure responses are clear
+2. discord bot relays useful backend status
+   - progress reporting works at a basic level
+   - completion/failure/skipped outcomes are surfaced
+3. backend-discord flow is tested end to end
+   - valid request path tested
+   - busy or conflict behavior tested
+   - failure path tested
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+
+* [ ] 2.0.3: remove deprecated discord-bot functionality (2)
+delete or retire legacy bot behaviors that no longer fit once the backend split is in place
+** pm notes
+- only remove this after the new path works. this is cleanup, not pioneering work.
+- favor deletion over compatibility shims
+- keep operator controls only if still useful
+
+** Acceptance Criteria
+1. remove obsolete auth/user-management behavior
+   - old user persistence and commands are removed
+   - backend-facing flow no longer depends on them
+2. remove obsolete downloader/runtime logic from bot
+   - bot no longer owns yt-dlp execution
+   - dead code paths are deleted
+3. leave the bot in a coherent state
+   - remaining commands reflect actual supported behavior
+   - deprecated artifacts are clearly removed or marked
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+
+* [ ] 2.0.5: fix automation and build pipeline (3)
+repair and simplify the build/update/deploy path so it matches the new backend-plus-frontend structure
+** pm notes
+- this should come after architecture and discord integration stabilize. no point polishing the pipeline for the wrong shape.
+- optimize for simple manual ops first
+- stop here after pipeline is sane
+  
+** Acceptance Criteria
+1. align build artifacts with the new structure
+   - docker/build scripts reflect current components
+   - runtime assumptions are consistent
+2. review old automation artifacts
+   - stale runner/update/restart logic is removed or updated
+   - manual update/rebuild flow is clear
+3. confirm deployment path works
+   - local or unraid deployment is validated
+   - pipeline is understandable enough to maintain
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
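Task 2.0.2 asks the Discord bot to surface acceptance, busy, failure, completion, and skipped outcomes without owning downloader logic. A minimal sketch of that translation, assuming the payload fields (`state`, `disposition`, `message`, `result_path`) that appear in the architecture doc and the logged responses. The function name and the message wording are hypothetical:

```python
def format_status(payload):
    """Turn a backend job payload into one chat line (hypothetical helper)."""
    state = payload.get("state", "unknown")
    message = payload.get("message") or ""

    if state == "busy":
        return "a download is already running; try again after it finishes"
    if state == "failed":
        return f"download failed: {message}" if message else "download failed"
    if state == "completed" and payload.get("disposition") == "archive_hit":
        return "skipped: already in archive"
    if state == "completed":
        path = payload.get("result_path")
        return f"done: {path}" if path else "done"
    # accepted, running, cancelled, or anything unexpected: echo the state
    # and any detail the backend supplied.
    return f"{state}: {message}" if message else state
```

Keeping this as a pure function of the payload means Zulip and XMPP adapters could reuse it or swap in their own wording without touching the backend.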
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -0,0 +1,123 @@
+#+title: Youdis Task Log 
+#+updated: [2026-03-31 Tue 08:00]
+
+* [X] 1.1.1: stabilize youdis core bot behavior (estimate 3 commits)
+refactor the current `youdis.py` flow so authorization, download execution, and user feedback are correct and predictable without changing the product shape.  keep this narrowly scoped to correctness and maintainability; do not redesign into a queueing platform yet.  preserve archive-first behavior and dm status updates; do not add new infrastructure dependencies and prefer boring explicit state over clever concurrency.
+
+** acceptance criteria
+1. initialize and load `/config/users.json` safely in all cases
+   - create parent dirs before touch/open
+   - ensure `authorized_users` always has a valid default
+   - normalize stored ids to a single type
+2. fix command-path correctness for `/youtube`, `/adduser`, and `/removeuser`
+   - authorized users can successfully invoke downloads
+   - add/remove user commands persist changes correctly
+   - remove broken/incomplete code paths
+3. duplicate prevention relies on archive.txt
+
+** pm notes
+
+** evidence
+- commit: 033d9dd
+- tests: ~python3 -m py_compile ./youdis.py~
+- datetime: [2026-03-31 Tue 13:28]
+
+** notes
+- store Discord user ids as strings in `users.json`
+- duplicate prevention should continue to rely on `archive.txt`, not inferred hook errors
+
+* [X] 1.1.2: remove global mutable download state and define single-job semantics (estimate 2 commits)
+eliminate shared mutable hook state and make concurrent behavior explicit, even if the initial policy is just "one active job at a time."  don't build a scheduler; ok if simplest outcome is single active job with clear busy message.  cancellation can be coarse if yt-dlp/process boundaries make graceful stop annoying
+
+** acceptance criteria
+1. improve runtime handling for downloads
+   - replace brittle thread/join pattern with a simpler async-safe execution path
+   - catch and report real yt-dlp failures
+   - avoid misleading "already exists" error assumptions
+2. progress reporting is isolated per request
+   - no module-level mutable title state shared across jobs
+   - hooks derive state from request-local context
+3. active-job behavior is explicit
+   - either reject a second request while busy or implement a minimal tracked active job
+   - user-facing response explains current behavior
+4. `/interrupt` is either implemented minimally or downgraded honestly
+   - no fake command implying cancellation works when it does not
+   - command behavior matches implementation
+
+** evidence
+- commit: 667b06f
+- tests: ~python3 -m py_compile /home/user/proj/youdis/youdis.py~
+- datetime: [2026-03-31 Tue 14:00]
+
+** notes
+- verify slash-command response patterns against the `interactions` library while touching runtime flow
+
+* [ ] 1.1.3: move static yt-dlp behavior into config and shrink python surface area (estimate 2 commits)
+shift stable downloader options into `default-yt-dlp.conf` so the bot code only handles dynamic inputs and orchestration.  optimize for inspectability and low-friction manual ops.  keep output naming durable enough for plex/plain-file use.  avoid duplicating config values across code and conf.
+
+** acceptance criteria
+1. separate static vs dynamic yt-dlp options cleanly
+   - stable defaults live in `default-yt-dlp.conf`
+   - python injects only request-specific/runtime values
+2. preserve archive and output behavior
+   - `archive.txt` remains the duplicate-prevention mechanism
+   - output paths remain stable and browseable
+3. document config ownership
+   - clarify which settings belong in config vs code
+   - make future yt-dlp tuning possible without major python edits
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+
+* [ ] 1.1.4: simplify image/build/update workflow around manual ops (estimate 3 commits)
+reduce repo cruft from the gitea-runner/nightly-update experiment and replace it with explicit manual update/rebuild mechanics.
+
+** acceptance criteria
+1. define a manual update path for yt-dlp and app image lifecycle
+   - document or script manual `git pull`, rebuild, and redeploy
+   - remove or quarantine brittle auto-update assumptions
+2. review and simplify `update-ytdlp.sh`, workflow yaml, and weekly restart artifacts
+   - keep only artifacts that serve the current manual-ops model
+   - delete or mark deprecated anything tied to abandoned automation paths
+3. retain unraid deployment viability
+   - container can still be rebuilt and redeployed cleanly on jeeves
+   - resulting flow is understandable without rereading old ci experiments
+
+- pm note: weekly restart is presumed suspect until proven necessary
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+- do not let runner/workflow complexity dominate a small bot
+- prefer explicit version pinning or manual binary refresh over magical nightlies
+
+* [ ] 1.1.5: clean up packaging/deployment artifacts for unraid consumption (estimate 2 commits)
+make the dockerfile, run script, and unraid-ca template consistent with the refactored app so deployment is less of a ritual ordeal.
+
+** acceptance criteria
+1. align docker/runtime assumptions
+   - paths like `/config` and `/downloads` are consistent across code, scripts, and container metadata
+   - env vars are documented and validated
+2. review deployment artifacts for drift
+   - `dockerfile`, `run-youdis.sh`, and `unraid-ca-template.xml` reflect current behavior
+   - remove stale references and dead assumptions
+3. make fresh deployment understandable
+   - a new deploy on unraid is possible without reconstructing tribal knowledge from old files
+
+- pm note: this is packaging polish after core correctness, not before
+
+** evidence
+- commit:
+- tests:
+- datetime:
+
+** notes
+- keep container surface area small
+- optimize for “future me can redeploy this without cursing past me too hard”
--- a/youdis.py
+++ b/youdis.py
@@ -16,43 +16,100 @@ import asyncio
 import threading

 userFile = Path('/config/users.json')
-userFile.touch(exist_ok=True)
+userFile.parent.mkdir(exist_ok=True, parents=True)

 bot = interactions.Client(intents=interactions.Intents.DEFAULT,default_scope=2147491904)

-userFile.parent.mkdir(exist_ok=True, parents=True)
-try:
-    with open(userFile, 'x') as f:
-        print(f'users.json not found; saving to {userFile}')    
-except FileExistsError:
-    with open(userFile, 'r') as f:
-        authorized_users = json.load(f).get('authorized_users')
-    print(f'authorized_users:{authorized_users}')
+def save_authorized_users(authorized_users):
+    with open(userFile, 'w') as f:
+        json.dump({'authorized_users': authorized_users}, f)

-title = ''
+def load_authorized_users():
+    if not userFile.exists():
+        save_authorized_users([])
+        print(f'users.json not found; saving to {userFile}')
+        return []
+
+    try:
+        with open(userFile, 'r') as f:
+            data = json.load(f)
+    except (json.JSONDecodeError, OSError):
+        save_authorized_users([])
+        print(f'users.json invalid; resetting {userFile}')
+        return []
+
+    authorized_users = data.get('authorized_users', [])
+    if not isinstance(authorized_users, list):
+        authorized_users = []
+
+    authorized_users = [str(user_id) for user_id in authorized_users]
+    save_authorized_users(authorized_users)
+    print(f'authorized_users:{authorized_users}')
+    return authorized_users
+
+authorized_users = load_authorized_users()
+
+active_job_lock = threading.Lock()
+active_job = None

 async def send_message(ctx, message):
    await ctx.author.send(message)

+def claim_active_job(job):
+    global active_job
+    with active_job_lock:
+        if active_job is not None:
+            return active_job
+        active_job = job
+        return None
+
+def get_active_job():
+    with active_job_lock:
+        return active_job
+
+def clear_active_job(job):
+    global active_job
+    with active_job_lock:
+        if active_job is job:
+            active_job = None
+
 def download_video(url, options):
    with yt_dlp.YoutubeDL(options) as ydl:
        ydl.download(url)

-def create_hook(ctx,loop):
+def create_hook(ctx, loop, cancel_event):
+    seen_updates = set()
+
    def hook(d):
-        global title
+        if cancel_event.is_set():
+            raise yt_dlp.utils.DownloadCancelled('download canceled by /interrupt')
+
        status = d.get('status')
-        if status == 'error':
-            msg = f'error; video probably already exists, have you checked archive.txt'
-            asyncio.run_coroutine_threadsafe(send_message(ctx,msg),loop)
-        elif d.get('info_dict').get('title') != title:
-            title = d.get('info_dict').get('title')
-            playlist_index = d.get('info_dict').get('playlist_index')
-            playlist_count = d.get('info_dict').get('playlist_count')
-            filename = d.get('filename')
-            url = d.get('info_dict').get('webpage_url')
-            msg = f'{status} {playlist_index} of {playlist_count}: {filename} <{url}>'
-            asyncio.run_coroutine_threadsafe(send_message(ctx,msg),loop)
+        info = d.get('info_dict') or {}
+
+        if status not in {'downloading', 'finished'}:
+            return
+
+        filename = d.get('filename') or info.get('_filename') or info.get('title')
+        update_key = (status, filename)
+        if update_key in seen_updates:
+            return
+
+        seen_updates.add(update_key)
+        playlist_index = info.get('playlist_index')
+        playlist_count = info.get('playlist_count')
+        url = info.get('webpage_url')
+
+        prefix = status
+        if playlist_index and playlist_count:
+            prefix = f'{status} {playlist_index} of {playlist_count}'
+
+        msg = f'{prefix}: {filename}'
+        if url:
+            msg = f'{msg} <{url}>'
+
+        asyncio.run_coroutine_threadsafe(send_message(ctx, msg), loop)
+
    return hook
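
Review note: the new hook does two things at once. It drops repeated `(status, filename)` updates, and it checks `cancel_event` on every call, which is why `/interrupt` only takes effect when yt-dlp next reports progress. A minimal sketch of that behavior, runnable without yt-dlp or Discord, is below. `DownloadCancelled` here is a local stand-in for `yt_dlp.utils.DownloadCancelled`, and `make_hook`, `replay`, and the `sent` list are hypothetical names used only for illustration.

```python
import threading


class DownloadCancelled(Exception):
    """Local stand-in for yt_dlp.utils.DownloadCancelled (assumption for this sketch)."""


def make_hook(cancel_event, sent):
    seen = set()

    def hook(d):
        # Cancellation is only noticed here, when the downloader reports progress.
        if cancel_event.is_set():
            raise DownloadCancelled('download canceled by /interrupt')
        status = d.get('status')
        if status not in {'downloading', 'finished'}:
            return
        key = (status, d.get('filename'))
        if key in seen:
            return  # already reported this status for this file
        seen.add(key)
        sent.append(f'{status}: {d.get("filename")}')

    return hook


def replay(hook, updates):
    """Feed progress dicts to the hook; return True if the hook cancelled."""
    try:
        for update in updates:
            hook(update)
    except DownloadCancelled:
        return True
    return False


cancel_event = threading.Event()
sent = []
hook = make_hook(cancel_event, sent)

cancelled = replay(hook, [
    {'status': 'downloading', 'filename': 'a.mp4'},
    {'status': 'downloading', 'filename': 'a.mp4'},  # duplicate, suppressed
    {'status': 'finished', 'filename': 'a.mp4'},
])

cancel_event.set()
cancelled_after = replay(hook, [{'status': 'downloading', 'filename': 'b.mp4'}])
```

If the downloader stops emitting progress (for example during a long merge), the event stays set but nothing raises until the next hook call, which matches the "cancellation is coarse" wording in the `/interrupt` reply.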

@interactions.slash_command(name="youtube",description="download video from youtube to server")
@@ -64,9 +121,29 @@ def create_hook(ctx,loop):
 )
 async def youtube(ctx: interactions.SlashContext, url:str):
    print(f'{ctx.author.id} requested {url}')
+    # check that user is authorized
+    if str(ctx.author.id) not in authorized_users:
+        if ctx.author.id == 127831327012683776:
+            await ctx.author.send('potato stop')
+        await ctx.author.send('you are not authorized to use this command. message my owner to be added.')
+        return
+
    loop = asyncio.get_running_loop()
-    hook = create_hook(ctx,loop)
-    msg = ''
+    cancel_event = threading.Event()
+    hook = create_hook(ctx, loop, cancel_event)
+    job = {
+        'requester_id': str(ctx.author.id),
+        'request_url': url,
+        'cancel_event': cancel_event,
+    }
+    existing_job = claim_active_job(job)
+    if existing_job:
+        await ctx.author.send(
+            f'already downloading for <@{existing_job["requester_id"]}>. '
+            'single-job mode is enabled right now; try again after it finishes.'
+        )
+        return
+
    # use api_to_cli and paste cli options to get the output you need
    yoptions = {
        'format':'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best',
@@ -82,35 +159,39 @@ async def youtube(ctx: interactions.SlashContext, url:str):
        'outtmpl': '%(uploader)s/%(playlist_title)s/%(playlist_index)s%(playlist_index& - )s%(title)s.%(ext)s',
        'outtmpl_na_placeholder':'',
    }
-    # check that user is authorized
-    if ctx.author.id not in authorized_users:
-        if ctx.author.id == 127831327012683776:
-            await ctx.author.send('potato stop')
-        await ctx.author.send('you are not authorized to use this command. message my owner to be added.')
-        return
+    try:
+        # inside the try so a failed send still releases the job slot in finally
+        await ctx.channel.send(f'Downloading from <{url}>. Status updates via DM. Single-job mode is enabled.')
+        await asyncio.to_thread(download_video, url, yoptions)
+    except yt_dlp.utils.DownloadCancelled as exc:
+        print(f'download canceled: {exc}')
+        await ctx.author.send(f'download canceled: {exc}')
+    except yt_dlp.utils.DownloadError as exc:
+        print(f'download failed: {exc}')
+        await ctx.author.send(f'download failed: {exc}')
+    except Exception as exc:
+        print(f'unexpected download failure: {exc}')
+        await ctx.author.send(f'unexpected download failure: {exc}')
    else:
-        await ctx.channel.send(f'Downloading from <{url}>. Status updates via DM.')
-        #await ctx.defer()  #if you need up to 15m to respond
-
-        # 1/2 - download in separate thread, else progress_hook blocks downstream async ctx.send
-        download_thread = threading.Thread(target=download_video, args=(url,yoptions))
-        download_thread.start()
-        await asyncio.to_thread(download_thread.join)
-
-        # 2/2 - replace the above with this next try:
-        #try:
-        #    await asyncio.to_thread(download_video, url, yoptions)
-        #except Exception as e:
-        #    print(f"download failed: {e}")
-        #    await ctx.author.send(f"download failed: {str(e)}")
+        await ctx.author.send(f'download complete for <{url}>')
+    finally:
+        clear_active_job(job)


@interactions.slash_command(name="interrupt",description="cancel current job")
@interactions.check(interactions.is_owner())
 async def _interrupt(ctx):
-    # interrupt here
-    print('interrupting current job - not implemented')
-    await ctx.author.send('interrupting current job - not implemented')
+    job = get_active_job()
+    if not job:
+        await ctx.author.send('no active download to interrupt')
+        return
+
+    job['cancel_event'].set()
+    print(f'interrupt requested for {job["request_url"]}')
+    await ctx.author.send(
+        f'interrupt requested for <{job["request_url"]}>; '
+        'cancellation is coarse and will stop on the next yt-dlp progress update'
+    )
    
@interactions.slash_command(name="adduser",description="authorize target user")
@interactions.slash_option(
@@ -121,12 +202,14 @@ async def _interrupt(ctx):
 )
@interactions.check(interactions.is_owner())
 async def _adduser(ctx: interactions.SlashContext, user:interactions.OptionType.USER):
-    if str(user.id) not in authorized_users:
-        authorized_users.append(str(user.id))
-        with open(userFile,'w') as f:  #overwrite file - fix later if other params come up
-            json.dump({'authorized_users':authorized_users})
-        print('react:checkmark')
-        await ctx.message.add_reaction('✅')
+    user_id = str(user.id)
+    if user_id not in authorized_users:
+        authorized_users.append(user_id)
+        save_authorized_users(authorized_users)
+        print(f'authorized {user_id}')
+        await ctx.author.send(f'authorized {user.mention}')
+    else:
+        await ctx.author.send(f'{user.mention} is already authorized')

@interactions.slash_command(name="removeuser",description="deauthorize target user")
@interactions.slash_option(
@@ -137,19 +220,14 @@ async def _adduser(ctx: interactions.SlashContext, user:interactions.OptionType.
 )
@interactions.check(interactions.is_owner())
 async def _removeuser(ctx: interactions.SlashContext, user:interactions.OptionType.USER):
-    if str(user.id) in authorized_users:
-    # ? ? ? fix pls
-        i = index(authorized_users(str(user.id)))
-
-        # update list, rewrite json
-            
-        print('react:checkmark')
-        await ctx.message.add_reaction('✅')
-
-async def dl_hook(d):
-    msg = f'{d["status"]} {d["filename"]}'
-    print(msg)
-    await ctx.author.send(msg)
+    user_id = str(user.id)
+    if user_id in authorized_users:
+        authorized_users.remove(user_id)
+        save_authorized_users(authorized_users)
+        print(f'deauthorized {user_id}')
+        await ctx.author.send(f'deauthorized {user.mention}')
+    else:
+        await ctx.author.send(f'{user.mention} is not currently authorized')

 api_token = getenv('api_token')
 if not api_token:
--- a/youdis/__init__.py
+++ b/youdis/__init__.py
@@ -0,0 +1 @@
+"""Youdis v2 backend package."""
--- a/youdis/adapters/__init__.py
+++ b/youdis/adapters/__init__.py
@@ -0,0 +1 @@
+"""Frontend adapters for the youdis backend."""
--- a/youdis/adapters/discord.py
+++ b/youdis/adapters/discord.py
@@ -0,0 +1 @@
+"""Discord adapter placeholder for the v2 backend."""
--- a/youdis/main.py
+++ b/youdis/main.py
@@ -0,0 +1,255 @@
+import asyncio
+from asyncio.subprocess import PIPE, STDOUT
+from collections import deque
+from dataclasses import dataclass, field
+from datetime import datetime
+from os import getenv
+from pathlib import Path
+from uuid import uuid4
+
+from fastapi import FastAPI, HTTPException
+
+from .models import CurrentJobResponse, HealthResponse, JobRequest, JobStatus, VersionResponse
+
+
+REPO_ROOT = Path(__file__).resolve().parent.parent
+DEFAULT_CONFIG = REPO_ROOT / "default-yt-dlp.conf"
+VERSION_FILE = REPO_ROOT / "youdis-version.txt"
+YTDLP_EXECUTABLE = getenv("YOUDIS_YTDLP_EXECUTABLE", "yt-dlp")
+
+
+@dataclass
+class ManagedJob:
+    status: JobStatus
+    process: asyncio.subprocess.Process | None = None
+    task: asyncio.Task | None = None
+    cancel_requested: bool = False
+    output_lines: deque[str] = field(default_factory=lambda: deque(maxlen=25))
+
+
+app = FastAPI(title="youdis", version="2")
+job_lock = asyncio.Lock()
+active_job: ManagedJob | None = None
+last_job: JobStatus | None = None
+
+
+def now_utc() -> datetime:
+    # naive UTC, matching the datetime.utcnow defaults in models.py
+    return datetime.utcnow()
+
+
+def read_version() -> str:
+    if VERSION_FILE.exists():
+        return VERSION_FILE.read_text().strip()
+    return "unknown"
+
+
+def build_ytdlp_command(request: JobRequest) -> list[str]:
+    return [
+        YTDLP_EXECUTABLE,
+        "--config-locations",
+        str(DEFAULT_CONFIG),
+        request.url,
+    ]
+
+
+def clone_status(status: JobStatus) -> JobStatus:
+    return JobStatus(**status.model_dump())
+
+
+def update_status(job: ManagedJob, **changes: object) -> None:
+    for key, value in changes.items():
+        setattr(job.status, key, value)
+    job.status.updated_at = now_utc()
+
+
+def classify_output_line(job: ManagedJob, line: str) -> None:
+    if not line:
+        return
+
+    job.output_lines.append(line)
+    message = line.strip()
+    if not message:
+        return
+
+    lowered = message.lower()
+    if "has already been recorded in the archive" in lowered:
+        update_status(
+            job,
+            disposition="archive_hit",
+            phase="downloading",
+            message="already in archive",
+        )
+        return
+
+    if "[download]" in lowered:
+        update_status(job, phase="downloading", message=message)
+        if "destination:" in lowered:
+            result_path = message.split(":", 1)[1].strip()
+            update_status(job, result_path=result_path)
+        return
+
+    if "merging formats into" in lowered:
+        result_path = message.split("into", 1)[1].strip().strip('"')
+        update_status(job, phase="postprocessing", message=message, result_path=result_path)
+        return
+
+    update_status(job, message=message)
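
Review note: `classify_output_line` is substring matching on yt-dlp's console output, so its behavior is easiest to check with a few sample lines. The sketch below restates the same branching as a pure function that returns `(phase, disposition, result_path)` instead of mutating a job. The function name `classify` and the sample lines are assumptions: the lines are hand-written to contain the substrings the parser looks for and are not captured yt-dlp output.

```python
def classify(line):
    """Return (phase, disposition, result_path) for one output line, or None if blank."""
    message = line.strip()
    if not message:
        return None

    lowered = message.lower()
    if 'has already been recorded in the archive' in lowered:
        return ('downloading', 'archive_hit', None)

    if '[download]' in lowered:
        result_path = None
        if 'destination:' in lowered:
            # everything after the first colon is treated as the path
            result_path = message.split(':', 1)[1].strip()
        return ('downloading', None, result_path)

    if 'merging formats into' in lowered:
        result_path = message.split('into', 1)[1].strip().strip('"')
        return ('postprocessing', None, result_path)

    # unrecognized lines only update the message, not the phase
    return (None, None, None)


samples = [
    '[download] Destination: Uploader/clip.mp4',
    '[Merger] Merging formats into "Uploader/clip.mp4"',
    '[download] abc123: has already been recorded in the archive',
    '   ',
]
results = [classify(s) for s in samples]
```

Because the archive check runs before the `[download]` check, an archive hit is never misread as an ordinary progress line. The parser depends on yt-dlp's message wording, so a change in that wording would silently fall through to the last branch.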
+
+
+async def finalize_job(job: ManagedJob, returncode: int) -> None:
+    disposition = job.status.disposition
+    if job.cancel_requested:
+        state = "cancelled"
+        message = "cancelled"
+    elif returncode == 0 and disposition == "archive_hit":
+        state = "completed"
+        message = "already in archive"
+    elif returncode == 0:
+        state = "completed"
+        message = job.status.message or "completed"
+    else:
+        state = "failed"
+        message = job.status.message or f"yt-dlp exited with code {returncode}"
+
+    update_status(job, state=state, phase=None, returncode=returncode, message=message)
+
+    global active_job, last_job
+    async with job_lock:
+        if active_job is job:
+            active_job = None
+        last_job = clone_status(job.status)
+
+
+async def run_job(job: ManagedJob, request: JobRequest) -> None:
+    command = build_ytdlp_command(request)
+    update_status(
+        job,
+        state="running",
+        phase="starting",
+        message="starting yt-dlp",
+        command=command,
+    )
+
+    try:
+        if not DEFAULT_CONFIG.exists():
+            update_status(
+                job,
+                state="failed",
+                phase=None,
+                message=f"default config not found: {DEFAULT_CONFIG}",
+                returncode=78,
+            )
+            await finalize_job(job, 78)
+            return
+
+        process = await asyncio.create_subprocess_exec(
+            *command,
+            stdout=PIPE,
+            stderr=STDOUT,
+        )
+    except FileNotFoundError:
+        update_status(
+            job,
+            state="failed",
+            phase=None,
+            message="yt-dlp executable not found",
+            returncode=127,
+        )
+        await finalize_job(job, 127)
+        return
+
+    try:
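+        job.process = process
+        if job.cancel_requested:
+            # a cancel arrived before the process existed; honor it now
+            process.terminate()
+        update_status(job, phase="running", message="yt-dlp running")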
+
+        assert process.stdout is not None
+        while True:
+            line = await process.stdout.readline()
+            if not line:
+                break
+            classify_output_line(job, line.decode(errors="replace").strip())
+
+        returncode = await process.wait()
+        await finalize_job(job, returncode)
+    except Exception as exc:
+        update_status(
+            job,
+            state="failed",
+            phase=None,
+            message=f"backend runner error: {exc}",
+            returncode=1,
+        )
+        await finalize_job(job, 1)
+
+
+async def create_job(request: JobRequest) -> JobStatus:
+    global active_job
+    async with job_lock:
+        if active_job is not None:
+            busy_job = active_job.status
+            return JobStatus(
+                job_id=busy_job.job_id,
+                state="busy",
+                url=request.url,
+                message=f"busy with {busy_job.url}",
+                requester_id=request.requester_id,
+                requester_name=request.requester_name,
+                origin=request.origin,
+            )
+
+        status = JobStatus(
+            job_id=str(uuid4()),
+            state="accepted",
+            url=request.url,
+            message="accepted",
+            phase="queued",
+            requester_id=request.requester_id,
+            requester_name=request.requester_name,
+            origin=request.origin,
+        )
+        managed_job = ManagedJob(status=status)
+        managed_job.task = asyncio.create_task(run_job(managed_job, request))
+        active_job = managed_job
+        return clone_status(status)
+
+
+@app.get("/health", response_model=HealthResponse)
+async def health() -> HealthResponse:
+    return HealthResponse(status="ok")
+
+
+@app.get("/version", response_model=VersionResponse)
+async def version() -> VersionResponse:
+    return VersionResponse(version=read_version(), active_job=active_job is not None)
+
+
+@app.get("/jobs/current", response_model=CurrentJobResponse)
+async def current_job() -> CurrentJobResponse:
+    async with job_lock:
+        if active_job is not None:
+            return CurrentJobResponse(active=True, job=clone_status(active_job.status))
+        if last_job is not None:
+            return CurrentJobResponse(active=False, job=clone_status(last_job))
+    return CurrentJobResponse(active=False, job=None)
+
+
+@app.post("/jobs", response_model=JobStatus)
+async def submit_job(request: JobRequest) -> JobStatus:
+    return await create_job(request)
+
+
+@app.post("/jobs/current/cancel", response_model=JobStatus)
+async def cancel_current_job() -> JobStatus:
+    async with job_lock:
+        if active_job is None:
+            raise HTTPException(status_code=404, detail="no active job")
+        job = active_job
+        if job.process is None:
+            update_status(job, message="cancel requested before yt-dlp started")
+            job.cancel_requested = True
+            return clone_status(job.status)
+
+        job.cancel_requested = True
+        update_status(job, message="cancel requested", phase="running")
+        job.process.terminate()
+        return clone_status(job.status)
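
Review note: the runner and the cancel endpoint together amount to "read merged stdout/stderr line by line, call `terminate()` on request, then `wait()` for the exit code". A minimal sketch of that flow with only the standard library follows. The child process is a stand-in Python one-liner rather than yt-dlp, and `run_and_cancel` is a hypothetical name.

```python
import asyncio
import sys
from asyncio.subprocess import PIPE, STDOUT


async def run_and_cancel():
    # Stand-in child: prints one line, then sleeps long enough to be cancelled.
    child = [sys.executable, '-u', '-c', "import time; print('started'); time.sleep(30)"]
    process = await asyncio.create_subprocess_exec(*child, stdout=PIPE, stderr=STDOUT)
    assert process.stdout is not None

    lines = []
    first = await process.stdout.readline()
    lines.append(first.decode(errors='replace').strip())

    # Equivalent of the cancel endpoint: ask the child to stop.
    process.terminate()

    # Drain whatever is left until EOF, as the runner loop does.
    while True:
        line = await process.stdout.readline()
        if not line:
            break
        lines.append(line.decode(errors='replace').strip())

    returncode = await process.wait()
    return lines, returncode


lines, returncode = asyncio.run(run_and_cancel())
```

A terminated child exits with a nonzero code, so the runner cannot tell a cancel from a failure by exit code alone. That is why `finalize_job` checks `cancel_requested` before looking at `returncode`.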
--- a/youdis/models.py
+++ b/youdis/models.py
@@ -0,0 +1,46 @@
+from datetime import datetime
+from typing import Literal
+
+from pydantic import BaseModel, Field
+
+
+JobState = Literal["accepted", "busy", "running", "completed", "failed", "cancelled"]
+
+
+class JobRequest(BaseModel):
+    url: str
+    requester_id: str | None = None
+    requester_name: str | None = None
+    origin: str | None = None
+    requested_at: datetime | None = None
+
+
+class JobStatus(BaseModel):
+    job_id: str
+    state: JobState
+    url: str
+    message: str | None = None
+    phase: str | None = None
+    disposition: str | None = None
+    requester_id: str | None = None
+    requester_name: str | None = None
+    origin: str | None = None
+    result_path: str | None = None
+    command: list[str] = Field(default_factory=list)
+    returncode: int | None = None
+    created_at: datetime = Field(default_factory=datetime.utcnow)
+    updated_at: datetime = Field(default_factory=datetime.utcnow)
+
+
+class CurrentJobResponse(BaseModel):
+    active: bool
+    job: JobStatus | None = None
+
+
+class HealthResponse(BaseModel):
+    status: Literal["ok"]
+
+
+class VersionResponse(BaseModel):
+    version: str
+    active_job: bool
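
Review note: `created_at` and `updated_at` default to `datetime.utcnow`, which returns a naive datetime (no `tzinfo`), so serialized timestamps carry no offset. If timezone-aware timestamps are wanted later, one possible shape is sketched below. It uses a stdlib dataclass in place of the pydantic models so it runs without extra dependencies. `aware_utc_now` and `Stamp` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


def aware_utc_now() -> datetime:
    # current time with tzinfo set to UTC
    return datetime.now(timezone.utc)


@dataclass
class Stamp:
    created_at: datetime = field(default_factory=aware_utc_now)
    updated_at: datetime = field(default_factory=aware_utc_now)


stamp = Stamp()

# Same wall-clock value with the offset stripped: the naive form utcnow produces.
naive = stamp.created_at.replace(tzinfo=None)

aware_text = stamp.created_at.isoformat()
naive_text = naive.isoformat()
```

Naive and aware datetimes cannot be compared or subtracted from each other, so whichever form is chosen should be used in both the model defaults and `now_utc()`.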
Author	SHA1	Message	Date
ben	2a5648506e	initial ytdlp worker	2026-03-31 20:54:56 -04:00
ben	7926534e54	initial ytdlp worker	2026-03-31 20:54:23 -04:00
ben	0ed16eca62	finalized v2 architecture for 2.0.1	2026-03-31 18:48:10 -04:00
ben	35a2592dce	draft arch-v2	2026-03-31 16:53:28 -04:00
eulaly	5f78ac6a1a	added v2 tasks	2026-03-31 16:23:37 -04:00
eulaly	2beb54f219	udpated pm note format	2026-03-31 16:22:46 -04:00
eulaly	f546c9fbb7	update 1.1.2	2026-03-31 14:01:25 -04:00
eulaly	667b06fe4a	enforce single-job blocking	2026-03-31 13:45:17 -04:00
eulaly	033d9dd167	load authorized_users.json robustly	2026-03-31 13:24:25 -04:00
eulaly	7a97e1f23d	updated gitignore with template	2026-03-31 10:44:06 -04:00