diff --git a/README.md b/README.md index d830954..28a6542 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -v2 architecture draft: see `docs/architecture-v2.md` +v2 architecture draft: see `docs/architecture-v2.org` build and run the docker container ``` diff --git a/docs/architecture-v2.org b/docs/architecture-v2.org index c1aa4bf..671dec5 100644 --- a/docs/architecture-v2.org +++ b/docs/architecture-v2.org @@ -1,6 +1,6 @@ #+TITLE: youdis v2 architecture -#+VERSION: 1 #+DATE: [2026-03-31 Tue] +#+VERSION: 3 * Purpose Youdis v2 splits the current monolithic Discord bot into: @@ -27,8 +27,8 @@ The goal is to make downloader behavior inspectable, reusable, and easier to deb Owns: - request validation - single active job state -- yt-dlp configuration loading and merge rules -- download execution +- yt-dlp executable invocation +- yt-dlp configuration selection and runtime argument construction - archive behavior - output path behavior - progress and final result events @@ -63,6 +63,20 @@ V2 should start as a single private process boundary with two layers: The backend is intended for private/local deployment. Trust comes from deployment topology and network placement, not backend auth. +** Execution model +The backend is a thin FastAPI wrapper around the `yt-dlp` executable, not a deep Python integration, unless proven necessary. + +Why this is the default: +- keeps behavior closer to native `yt-dlp` +- makes container debugging easier because commands can be reproduced manually +- avoids reimplementing `yt-dlp` option parsing unless we truly need tighter integration + +This means the backend is primarily responsible for: +- deciding when `yt-dlp` may run +- constructing a safe effective command +- supervising the subprocess while it is active +- translating process outcomes into a small API contract + ** Backend Contract *** Request model Minimal fields: @@ -74,17 +88,26 @@ Minimal fields: Only `url` affects downloader behavior in v2. 
The rest is passthrough metadata for logs and frontend context. *** Result states -The backend should normalize job outcomes into a small stable vocabulary: +The backend should keep its external state model deliberately small: - `accepted` - `busy` -- `validating` -- `downloading` -- `skipped` +- `running` - `completed` - `failed` - `cancelled` -These states should be transport-agnostic and human-debuggable. +These states should be transport-agnostic, human-debuggable, and only distinct when a frontend would behave differently. + +Internal phases and details may be richer, but they should usually be carried as metadata rather than promoted to top-level states. + +Examples of detail fields that can travel alongside the small state model: +- `phase=validating` +- `phase=downloading` +- `disposition=archive_hit` +- `message=already in archive` +- `result_path=/downloads/...` + +This keeps the API simple while still leaving room for useful operator-facing detail. *** Minimal endpoints The first backend seam should stay small: @@ -101,7 +124,7 @@ V2 explicitly supports one active job at a time. - If idle, the backend accepts a new job and marks it active. - If busy, the backend rejects the new request with a `busy` result. - Cancel is coarse and best-effort. -- The backend clears active state when the job reaches `skipped`, `completed`, `failed`, or `cancelled`. +- The backend clears active state when the job reaches `completed`, `failed`, or `cancelled`. This is a feature, not a limitation, for the initial service. @@ -109,7 +132,7 @@ This is a feature, not a limitation, for the initial service. This is the most important v2 cleanup. -The backend owns all yt-dlp configuration behavior. Frontends must not build yt-dlp option dictionaries. +The backend owns all yt-dlp invocation behavior. Frontends must not build `yt-dlp` commands or option dictionaries. 
** Config split @@ -122,13 +145,13 @@ Static settings belong in `default-yt-dlp.conf`: Runtime settings belong in backend code: - target URL -- progress hooks -- cancellation hook -- request-scoped metadata +- config-file path, if explicitly set at launch +- request-scoped flags or overrides +- subprocess lifecycle and cancellation behavior - any values that must vary per request *** Merge rule -The backend should load the default config first, then apply a small set of explicit runtime overrides. +The backend should invoke `yt-dlp` with the default config first, then apply a small set of explicit runtime overrides. If a runtime override conflicts with file-backed config, the override wins and the backend should log the effective value used. @@ -136,8 +159,7 @@ If a runtime override conflicts with file-backed config, the override wins and t For each job, the backend should make it easy to inspect: - config file path used -- whether config loading succeeded -- effective key yt-dlp options after merge +- effective `yt-dlp` command or key override arguments - final normalized job result This exists specifically because the current config behavior has been difficult to reason about. @@ -157,25 +179,45 @@ youdis/ default-yt-dlp.conf youdis/ __init__.py - api.py - config.py + main.py models.py - worker.py - ytdlp_backend.py adapters/ __init__.py discord.py docs/ - architecture-v2.md + architecture-v2.org #+end_quote Notes: -- `ytdlp_backend.py` owns yt-dlp integration and result normalization. -- `worker.py` owns active-job state and lifecycle. -- `api.py` exposes the minimal HTTP seam. +- `main.py` owns the FastAPI app, active-job state, and `yt-dlp` subprocess lifecycle. +- `models.py` owns request and response models. - `adapters/discord.py` becomes a thin client of the backend. 
+** Example response shape + +One likely response shape for v1: + +#+begin_src json +{ + "state": "completed", + "disposition": "archive_hit", + "message": "already in archive", + "job_id": "abc123" +} +#+end_src + +And for an active job: + +#+begin_src json +{ + "state": "running", + "phase": "downloading", + "message": "downloading 3 of 9", + "job_id": "abc123" +} +#+end_src + ** Framework Recommendation FastAPI is the current recommendation for the backend seam. @@ -200,7 +242,7 @@ Why this is not a strong ideological choice: - single active-job state holder - one submit-job path - one current-job status path -- yt-dlp config loading from `default-yt-dlp.conf` -- explicit runtime overrides for request URL, progress hooks, and cancel behavior +- `yt-dlp` subprocess invocation using `default-yt-dlp.conf` +- explicit runtime overrides for request URL and cancel behavior The Discord adapter should remain unchanged until that backend skeleton can accept a job and report status coherently. diff --git a/pm/arch-v2.org b/pm/arch-v2.org deleted file mode 100644 index c1aa4bf..0000000 --- a/pm/arch-v2.org +++ /dev/null @@ -1,206 +0,0 @@ -#+TITLE: youdis v2 architecture -#+VERSION: 1 -#+DATE: [2026-03-31 Tue] - -* Purpose -Youdis v2 splits the current monolithic Discord bot into: -- a private backend worker service that owns yt-dlp execution -- thin chat adapters that translate user actions into backend requests - -The goal is to make downloader behavior inspectable, reusable, and easier to debug across multiple frontends without introducing queueing infrastructure or complex deployment. - -** Goals -- Keep the backend transport-agnostic. -- Preserve archive-first duplicate prevention via `archive.txt`. -- Keep single-job semantics explicit. -- Treat requester and origin metadata as informational, not authorization. -- Make yt-dlp configuration understandable and debuggable. - -** Non-Goals -- No backend auth or user membership logic. 
-- No database, Redis, or external queue. -- No multi-worker scheduling in v2. -- No frontend-owned downloader logic. - -* Proposed Components -** Backend worker service -Owns: -- request validation -- single active job state -- yt-dlp configuration loading and merge rules -- download execution -- archive behavior -- output path behavior -- progress and final result events - -Does not own: -- Discord slash-command parsing -- chat formatting decisions -- access control policy - -** Frontend adapters -1. Discord -2. Zulip -3. XMPP - -Own: -- command parsing -- user-facing message formatting -- transport-specific reply behavior -- optional frontend-side gating if the deployment wants it - -Do not own: -- yt-dlp option construction -- archive checks -- output path logic -- job lifecycle rules - -* Initial Deployment Shape - -V2 should start as a single private process boundary with two layers: -1. backend service process -2. one Discord adapter process - -The backend is intended for private/local deployment. Trust comes from deployment topology and network placement, not backend auth. - -** Backend Contract -*** Request model -Minimal fields: -- `url` -- `requester_id` optional -- `requester_name` optional -- `origin` optional -- `requested_at` optional -Only `url` affects downloader behavior in v2. The rest is passthrough metadata for logs and frontend context. - -*** Result states -The backend should normalize job outcomes into a small stable vocabulary: -- `accepted` -- `busy` -- `validating` -- `downloading` -- `skipped` -- `completed` -- `failed` -- `cancelled` - -These states should be transport-agnostic and human-debuggable. - -*** Minimal endpoints -The first backend seam should stay small: -- `POST /jobs` -- `GET /jobs/current` -- `POST /jobs/current/cancel` -- `GET /health` -- `GET /version` - -Polling is the default integration model for v2. Streaming or push delivery is deferred unless a real frontend need appears. 
- -** Single-Job Semantics -V2 explicitly supports one active job at a time. -- If idle, the backend accepts a new job and marks it active. -- If busy, the backend rejects the new request with a `busy` result. -- Cancel is coarse and best-effort. -- The backend clears active state when the job reaches `skipped`, `completed`, `failed`, or `cancelled`. - -This is a feature, not a limitation, for the initial service. - -* yt-dlp Configuration Ownership - -This is the most important v2 cleanup. - -The backend owns all yt-dlp configuration behavior. Frontends must not build yt-dlp option dictionaries. - -** Config split - -Static settings belong in `default-yt-dlp.conf`: -- archive path -- output template -- embedding and metadata preferences -- stable format defaults -- retry defaults - -Runtime settings belong in backend code: -- target URL -- progress hooks -- cancellation hook -- request-scoped metadata -- any values that must vary per request - -*** Merge rule -The backend should load the default config first, then apply a small set of explicit runtime overrides. - -If a runtime override conflicts with file-backed config, the override wins and the backend should log the effective value used. - -*** Debuggability requirement -For each job, the backend should make it easy to inspect: - -- config file path used -- whether config loading succeeded -- effective key yt-dlp options after merge -- final normalized job result - -This exists specifically because the current config behavior has been difficult to reason about. - -*** Config hygiene expectations - -The default config should be safe for real downloads in production-like use. - -That means test-only settings such as `--simulate` should not remain enabled in the default runtime path unless the backend intentionally supports an explicit dry-run mode. 
- -** Recommended Repo Shape -One reasonable first cut: - -#+begin_quote -youdis/ - README.md - default-yt-dlp.conf - youdis/ - __init__.py - api.py - config.py - models.py - worker.py - ytdlp_backend.py - adapters/ - __init__.py - discord.py - docs/ - architecture-v2.md -#+end_quote - -Notes: - -- `ytdlp_backend.py` owns yt-dlp integration and result normalization. -- `worker.py` owns active-job state and lifecycle. -- `api.py` exposes the minimal HTTP seam. -- `adapters/discord.py` becomes a thin client of the backend. - -** Framework Recommendation - -FastAPI is the current recommendation for the backend seam. - -Why it fits: - -- typed request and response models help define the contract cleanly -- built-in docs make local inspection easier during early iterations -- it stays small enough for this service if we keep the surface area disciplined - -Why this is not a strong ideological choice: - -- Flask would also work if we decide the type/model ergonomics are not worth it -- the important decision is keeping the seam small, not choosing a fashionable framework - -** Immediate Next Task - -`2.0.1` should implement the backend skeleton with just enough real behavior to prove the seam: - -- package/module layout -- health and version endpoint -- single active-job state holder -- one submit-job path -- one current-job status path -- yt-dlp config loading from `default-yt-dlp.conf` -- explicit runtime overrides for request URL, progress hooks, and cancel behavior - -The Discord adapter should remain unchanged until that backend skeleton can accept a job and report status coherently. diff --git a/pm/tasks-v2.org b/pm/tasks-v2.org index 436ee49..378ffe6 100644 --- a/pm/tasks-v2.org +++ b/pm/tasks-v2.org @@ -1,139 +1,139 @@ -#+title: Task Log -#+updated: [2026-03-31 Tue 16:03] - -* youdis v2 goals: -1. Separate backend from frontend -2. Offload auth -3. Ensure auto nightly builds -4. Default output format supports plex browsing youtube channels as "tv shows" -5. 
Facilitate multiple GUI inbounds: Discord, Zulip, XMPP - -* [ ] 2.0.0: define architecture (2-4) -define the target architecture for a private backend yt-dlp worker with thin chat frontends -** pm notes: -- keep this iterative. the point is to choose the shape and seam, not prematurely implement infra. likely decisions include backend framework, request/status model, and how thin the discord shim should be. -- goal is simple, private, maintainable deployment -- avoid auth, queues, or persistence beyond clear & immediate needs - -** Acceptance Criteria -1. document the target architecture at a high level - - backend owns yt-dlp execution and job state - - frontends own chat-specific UX -2. identify key decisions still open - - backend choice - - service seam/endpoints - - status/progress model -3. capture enough structure to begin implementation - - repo/component layout is sketched - - next implementation task is unblocked - -** evidence -- commit: -- tests: -- datetime: - +#+title: Task Log +#+updated: [2026-03-31 Tue 16:03] + +* youdis v2 goals: +1. Separate backend from frontend +2. Offload auth +3. Ensure auto nightly builds +4. Default output format supports plex browsing youtube channels as "tv shows" +5. Facilitate multiple GUI inbounds: Discord, Zulip, XMPP + +* [ ] 2.0.0: define architecture (2-4) +define the target architecture for a private backend yt-dlp worker with thin chat frontends +** pm notes: +- keep this iterative. the point is to choose the shape and seam, not prematurely implement infra. likely decisions include backend framework, request/status model, and how thin the discord shim should be. +- goal is simple, private, maintainable deployment +- avoid auth, queues, or persistence beyond clear & immediate needs + +** Acceptance Criteria +1. document the target architecture at a high level + - backend owns yt-dlp execution and job state + - frontends own chat-specific UX +2. 
identify key decisions still open + - backend choice + - service seam/endpoints + - status/progress model +3. capture enough structure to begin implementation + - repo/component layout is sketched + - next implementation task is unblocked + +** evidence +- commit: +- tests: +- datetime: + +** notes +- first architecture draft captured in `docs/architecture-v2.org` + +* [ ] 2.0.1: build backend yt-dlp worker (3) +create the minimal backend/service skeleton and establish a working yt-dlp baseline with clean hooks for future frontends +** pm notes +- foundation; don't need the full finished service here, just the basic shape plus enough real yt-dlp execution to validate the seam and build on it. +- keep single-job semantics +- prioritize inspectable behavior over polish + +** Acceptance Criteria +1. create the backend/service skeleton + - app/module layout exists + - core request path is stubbed or minimally working +2. establish a working yt-dlp baseline + - archive behavior is preserved + - output path behavior is preserved or intentionally updated + - use yt-dlp .conf and set reasonable default +3. expose basic hooks/interfaces for future frontends + - submit/request path exists + - status/progress hook exists + - basic health/version visibility exists + +** evidence +- commit: +- tests: +- datetime: + +** notes + +* [ ] 2.0.2: update discord bot to use new backend (3) +update the discord bot into a thin frontend that talks to the backend and verify the flow end to end +** pm notes +- this is the first real frontend proof. once this works cleanly, zulip/xmpp should mostly be adapter work rather than downloader rewrites. +- keep discord logic thin; no auth +- do not duplicate yt-dlp behavior in the bot + +** Acceptance Criteria +1. discord bot submits requests to backend + - command/input handling works + - acceptance/busy/failure responses are clear +2. 
discord bot relays useful backend status + - progress reporting works at a basic level + - completion/failure/skipped outcomes are surfaced +3. backend-discord flow is tested end to end + - valid request path tested + - busy or conflict behavior tested + - failure path tested + +** evidence +- commit: +- tests: +- datetime: + +** notes + +* [ ] 2.0.3: remove deprecated discord-bot functionality (2) +delete or retire legacy bot behaviors that no longer fit once the backend split is in place +** pm notes +- only remove this after the new path works. this is cleanup, not pioneering work. +- favor deletion over compatibility shims +- keep operator controls only if still useful + +** Acceptance Criteria +1. remove obsolete auth/user-management behavior + - old user persistence and commands are removed + - backend-facing flow no longer depends on them +2. remove obsolete downloader/runtime logic from bot + - bot no longer owns yt-dlp execution + - dead code paths are deleted +3. leave the bot in a coherent state + - remaining commands reflect actual supported behavior + - deprecated artifacts are clearly removed or marked + +** evidence +- commit: +- tests: +- datetime: + +** notes + +* [ ] 2.0.5: fix automation and build pipeline (3) +repair and simplify the build/update/deploy path so it matches the new backend-plus-frontend structure +** pm notes +- this should come after architecture and discord integration stabilize. no point polishing the pipeline for the wrong shape. +- optimize for simple manual ops first +- stop here after pipeline is sane + +** Acceptance Criteria +1. align build artifacts with the new structure + - docker/build scripts reflect current components + - runtime assumptions are consistent +2. review old automation artifacts + - stale runner/update/restart logic is removed or updated + - manual update/rebuild flow is clear +3. 
confirm deployment path works + - local or unraid deployment is validated + - pipeline is understandable enough to maintain + +** evidence +- commit: +- tests: +- datetime: + ** notes -- first architecture draft captured in `docs/architecture-v2.md` - -* [ ] 2.0.1: build backend yt-dlp worker (3) -create the minimal backend/service skeleton and establish a working yt-dlp baseline with clean hooks for future frontends -** pm notes -- foundation; don't need the full finished service here, just the basic shape plus enough real yt-dlp execution to validate the seam and build on it. -- keep single-job semantics -- prioritize inspectable behavior over polish - -** Acceptance Criteria -1. create the backend/service skeleton - - app/module layout exists - - core request path is stubbed or minimally working -2. establish a working yt-dlp baseline - - archive behavior is preserved - - output path behavior is preserved or intentionally updated - - use yt-dlp .conf and set reasonable default -3. expose basic hooks/interfaces for future frontends - - submit/request path exists - - status/progress hook exists - - basic health/version visibility exists - -** evidence -- commit: -- tests: -- datetime: - -** notes - -* [ ] 2.0.2: update discord bot to use new backend (3) -update the discord bot into a thin frontend that talks to the backend and verify the flow end to end -** pm notes -- this is the first real frontend proof. once this works cleanly, zulip/xmpp should mostly be adapter work rather than downloader rewrites. -- keep discord logic thin; no auth -- do not duplicate yt-dlp behavior in the bot - -** Acceptance Criteria -1. discord bot submits requests to backend - - command/input handling works - - acceptance/busy/failure responses are clear -2. discord bot relays useful backend status - - progress reporting works at a basic level - - completion/failure/skipped outcomes are surfaced -3. 
backend-discord flow is tested end to end - - valid request path tested - - busy or conflict behavior tested - - failure path tested - -** evidence -- commit: -- tests: -- datetime: - -** notes - -* [ ] 2.0.3: remove deprecated discord-bot functionality (2) -delete or retire legacy bot behaviors that no longer fit once the backend split is in place -** pm notes -- only remove this after the new path works. this is cleanup, not pioneering work. -- favor deletion over compatibility shims -- keep operator controls only if still useful - -** Acceptance Criteria -1. remove obsolete auth/user-management behavior - - old user persistence and commands are removed - - backend-facing flow no longer depends on them -2. remove obsolete downloader/runtime logic from bot - - bot no longer owns yt-dlp execution - - dead code paths are deleted -3. leave the bot in a coherent state - - remaining commands reflect actual supported behavior - - deprecated artifacts are clearly removed or marked - -** evidence -- commit: -- tests: -- datetime: - -** notes - -* [ ] 2.0.5: fix automation and build pipeline (3) -repair and simplify the build/update/deploy path so it matches the new backend-plus-frontend structure -** pm notes -- this should come after architecture and discord integration stabilize. no point polishing the pipeline for the wrong shape. -- optimize for simple manual ops first -- stop here after pipeline is sane - -** Acceptance Criteria -1. align build artifacts with the new structure - - docker/build scripts reflect current components - - runtime assumptions are consistent -2. review old automation artifacts - - stale runner/update/restart logic is removed or updated - - manual update/rebuild flow is clear -3. confirm deployment path works - - local or unraid deployment is validated - - pipeline is understandable enough to maintain - -** evidence -- commit: -- tests: -- datetime: - -** notes
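Reviewer sketch, not part of the patch: the revised architecture doc describes a single active-job slot, a small external state vocabulary (`accepted`, `busy`, `running`, `completed`, `failed`, `cancelled`), and a merge rule where `yt-dlp` is invoked with the default config first and explicit overrides after. A minimal, stdlib-only illustration of how those three pieces might fit together is below. The names `build_command`, `JobStatus`, and `SingleJobSlot` are hypothetical and do not come from the diff. `--config-locations` is a real `yt-dlp` option, but the assumption that later command-line arguments take precedence over the config file should be verified against the `yt-dlp` version in use.

```python
from dataclasses import dataclass, field
from threading import Lock
from typing import Optional, Sequence

# Terminal states taken from the revised doc; reaching one clears the active job.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def build_command(
    url: str,
    config_path: str = "default-yt-dlp.conf",
    overrides: Sequence[str] = (),
) -> list:
    """Build a yt-dlp argv: config file first, runtime overrides after, URL last.

    Assumption: arguments placed after the config file win on conflict.
    """
    if not url or url.startswith("-"):
        # Refuse anything that could be parsed as an option instead of a URL.
        raise ValueError("url must be non-empty and must not start with '-'")
    # The bare "--" ends option parsing so the URL is always treated as positional.
    return ["yt-dlp", "--config-locations", config_path, *overrides, "--", url]


@dataclass
class JobStatus:
    job_id: str
    state: str
    # Richer detail (phase, disposition, message) travels as metadata, not as states.
    details: dict = field(default_factory=dict)


class SingleJobSlot:
    """Hypothetical holder for the one active job allowed at a time."""

    def __init__(self) -> None:
        self._lock = Lock()
        self._active: Optional[JobStatus] = None

    def submit(self, job_id: str) -> JobStatus:
        with self._lock:
            if self._active is not None:
                return JobStatus(job_id, "busy")
            self._active = JobStatus(job_id, "running", {"phase": "validating"})
            return JobStatus(job_id, "accepted")

    def current(self) -> Optional[JobStatus]:
        with self._lock:
            return self._active

    def finish(self, state: str, **details: str) -> JobStatus:
        if state not in TERMINAL_STATES:
            raise ValueError(f"not a terminal state: {state}")
        with self._lock:
            if self._active is None:
                raise RuntimeError("no active job to finish")
            done = JobStatus(self._active.job_id, state, dict(details))
            self._active = None  # slot is free for the next request
            return done
```

In a sketch like this, an archive hit would be reported as `finish("completed", disposition="archive_hit")` rather than as a separate `skipped` state, which matches the doc's decision to carry such detail as metadata. Logging the list returned by `build_command` would also cover the "effective `yt-dlp` command" item in the debuggability requirement.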