youdis/pm/arch-v2.org
2026-03-31 16:53:28 -04:00


youdis v2 architecture

Purpose

Youdis v2 splits the current monolithic Discord bot into:

  • a private backend worker service that owns yt-dlp execution
  • thin chat adapters that translate user actions into backend requests

The goal is to make downloader behavior inspectable, reusable, and easier to debug across multiple frontends without introducing queueing infrastructure or complex deployment.

Goals

  • Keep the backend transport-agnostic.
  • Preserve archive-first duplicate prevention via `archive.txt`.
  • Keep single-job semantics explicit.
  • Treat requester and origin metadata as informational, not authorization.
  • Make yt-dlp configuration understandable and debuggable.

Non-Goals

  • No backend auth or user membership logic.
  • No database, Redis, or external queue.
  • No multi-worker scheduling in v2.
  • No frontend-owned downloader logic.

Proposed Components

Backend worker service

Owns:

  • request validation
  • single active job state
  • yt-dlp configuration loading and merge rules
  • download execution
  • archive behavior
  • output path behavior
  • progress and final result events

Does not own:

  • Discord slash-command parsing
  • chat formatting decisions
  • access control policy

Frontend adapters

  1. Discord
  2. Zulip
  3. XMPP

Own:

  • command parsing
  • user-facing message formatting
  • transport-specific reply behavior
  • optional frontend-side gating if the deployment wants it

Do not own:

  • yt-dlp option construction
  • archive checks
  • output path logic
  • job lifecycle rules

Initial Deployment Shape

V2 should start as a single private deployment with one process boundary between two processes:

  1. backend service process
  2. one Discord adapter process

The backend is intended for private/local deployment. Trust comes from deployment topology and network placement, not backend auth.

Backend Contract

Request model

Minimal fields:

  • `url`
  • `requester_id` optional
  • `requester_name` optional
  • `origin` optional
  • `requested_at` optional

Only `url` affects downloader behavior in v2. The rest is passthrough metadata for logs and frontend context.
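
As a rough illustration, the request model could be a plain dataclass like the sketch below. `JobRequest` and `from_payload` are hypothetical names. Two details are assumptions rather than part of this design: typing `requested_at` as an ISO 8601 string, and restricting `url` to absolute http(s) URLs. A FastAPI implementation would more likely express this as a Pydantic model.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional
from urllib.parse import urlparse

METADATA_FIELDS = ("requester_id", "requester_name", "origin", "requested_at")


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical request model; only `url` affects downloader behavior."""

    url: str
    requester_id: Optional[str] = None
    requester_name: Optional[str] = None
    origin: Optional[str] = None
    requested_at: Optional[str] = None  # assumed to be an ISO 8601 string

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "JobRequest":
        url = payload.get("url")
        if not isinstance(url, str):
            raise ValueError("url is required")
        parsed = urlparse(url)
        # Assumption: only absolute http(s) URLs are accepted.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("url must be an absolute http(s) URL")
        # Metadata is passed through for logs; it is never used for authorization.
        meta = {name: payload.get(name) for name in METADATA_FIELDS}
        return cls(url=url, **meta)
```

Keeping the metadata optional means an adapter that knows nothing about the requester can still submit a job with just `url`.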

Result states

The backend should normalize job outcomes into a small stable vocabulary:

  • `accepted`
  • `busy`
  • `validating`
  • `downloading`
  • `skipped`
  • `completed`
  • `failed`
  • `cancelled`

These states should be transport-agnostic and human-debuggable.
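
One way to pin this vocabulary down in code is a string-valued enum, sketched here. `JobState`, `TERMINAL_STATES`, and `is_terminal` are hypothetical names. The terminal set mirrors the four states after which the backend clears its active job.

```python
from enum import Enum


class JobState(str, Enum):
    """Hypothetical enum for the normalized result vocabulary."""

    ACCEPTED = "accepted"
    BUSY = "busy"
    VALIDATING = "validating"
    DOWNLOADING = "downloading"
    SKIPPED = "skipped"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# States after which the backend clears the active job.
TERMINAL_STATES = frozenset(
    {JobState.SKIPPED, JobState.COMPLETED, JobState.FAILED, JobState.CANCELLED}
)


def is_terminal(state: JobState) -> bool:
    """Return True when no further state changes are expected for the job."""
    return state in TERMINAL_STATES
```

Because the enum mixes in `str`, its members compare equal to their plain string values, which keeps payloads readable for every frontend.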

Minimal endpoints

The first backend seam should stay small:

  • `POST /jobs`
  • `GET /jobs/current`
  • `POST /jobs/current/cancel`
  • `GET /health`
  • `GET /version`

Polling is the default integration model for v2. Streaming or push delivery is deferred unless a real frontend need appears.
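
A frontend-side polling loop might look like the sketch below. It assumes `GET /jobs/current` returns a JSON object with a `state` key, and that the last result stays readable long enough for the adapter to see it; neither is settled by this document. `poll_until_done` is a hypothetical helper. The HTTP call is injected as `fetch_status` so the loop itself stays independent of any HTTP client.

```python
import time
from typing import Callable, Mapping, Optional

TERMINAL = {"skipped", "completed", "failed", "cancelled"}


def poll_until_done(
    fetch_status: Callable[[], Mapping[str, str]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll until a terminal state is seen; return it, or None on timeout."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        # fetch_status is assumed to wrap GET /jobs/current.
        state = fetch_status().get("state")
        if state in TERMINAL:
            return state
        sleep(interval_s)
    return None
```

An adapter would pass a small function that performs the HTTP request and decodes the JSON body, then format the returned state for its own chat transport.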

Single-Job Semantics

V2 explicitly supports one active job at a time.

  • If idle, the backend accepts a new job and marks it active.
  • If busy, the backend rejects the new request with a `busy` result.
  • Cancel is coarse and best-effort.
  • The backend clears active state when the job reaches `skipped`, `completed`, `failed`, or `cancelled`.

For the initial service this is a deliberate constraint, not a limitation: it keeps job state easy to inspect and removes the need for a queue.
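
The rules above can be sketched as a small lock-protected holder. `ActiveJobSlot` and its method names are hypothetical, and the returned strings reuse the result vocabulary from the backend contract.

```python
import threading
from typing import Dict, Optional

TERMINAL = {"skipped", "completed", "failed", "cancelled"}


class ActiveJobSlot:
    """Hypothetical holder for the single active job."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._url: Optional[str] = None
        self._state: Optional[str] = None

    def submit(self, url: str) -> str:
        # Check-and-set under one lock so two concurrent submits cannot both win.
        with self._lock:
            if self._url is not None:
                return "busy"
            self._url, self._state = url, "accepted"
            return "accepted"

    def update(self, state: str) -> None:
        with self._lock:
            if self._url is None:
                return  # nothing is active, so there is nothing to update
            if state in TERMINAL:
                # A terminal state clears the slot so the next job can start.
                self._url = None
                self._state = None
            else:
                self._state = state

    def current(self) -> Optional[Dict[str, Optional[str]]]:
        with self._lock:
            if self._url is None:
                return None
            return {"url": self._url, "state": self._state}
```

A second `submit` while a job is active returns `busy` without touching the running job, which is the whole scheduling policy for v2.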

yt-dlp Configuration Ownership

This is the most important v2 cleanup.

The backend owns all yt-dlp configuration behavior. Frontends must not build yt-dlp option dictionaries.

Config split

Static settings belong in `default-yt-dlp.conf`:

  • archive path
  • output template
  • embedding and metadata preferences
  • stable format defaults
  • retry defaults

Runtime settings belong in backend code:

  • target URL
  • progress hooks
  • cancellation hook
  • request-scoped metadata
  • any values that must vary per request
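
As an illustration of the runtime side, the sketch below builds a progress hook that also acts as a coarse cancellation point. It relies on yt-dlp calling progress hooks with a dict that carries keys such as `status`, `downloaded_bytes`, `total_bytes`, and `total_bytes_estimate`. `JobCancelled` and `make_progress_hook` are hypothetical names. Raising from the hook is only one possible way to stop a download, and how yt-dlp surfaces that exception has not been verified here.

```python
import threading
from typing import Any, Callable, Dict


class JobCancelled(Exception):
    """Hypothetical exception raised at the next progress callback after a cancel."""


def make_progress_hook(
    cancel_event: threading.Event,
    report: Callable[[Dict[str, Any]], None],
) -> Callable[[Dict[str, Any]], None]:
    """Build a hook that reports normalized progress and honors cancellation."""

    def hook(d: Dict[str, Any]) -> None:
        if cancel_event.is_set():
            raise JobCancelled("cancel requested")
        total = d.get("total_bytes") or d.get("total_bytes_estimate")
        done = d.get("downloaded_bytes")
        # Percent is only reported when both byte counts are known.
        percent = round(100.0 * done / total, 1) if done is not None and total else None
        report({"status": d.get("status"), "percent": percent})

    return hook
```

The backend would attach the hook through the `progress_hooks` option as one of its explicit runtime overrides, so no frontend ever sees or builds it.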

Merge rule

The backend should load the default config first, then apply a small set of explicit runtime overrides.

If a runtime override conflicts with file-backed config, the override wins and the backend should log the effective value used.
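
A minimal sketch of that merge rule, assuming both sides have already been reduced to plain option dictionaries. `merge_options` and the logger name are hypothetical.

```python
import logging
from typing import Any, Dict, Mapping

log = logging.getLogger("youdis.config")  # hypothetical logger name


def merge_options(
    file_opts: Mapping[str, Any], overrides: Mapping[str, Any]
) -> Dict[str, Any]:
    """Apply runtime overrides on top of file-backed options; overrides win."""
    merged = dict(file_opts)  # copy so the loaded defaults stay untouched
    for key, value in overrides.items():
        if key in merged and merged[key] != value:
            # Log conflicts so the effective value is visible when debugging.
            log.info("override %s: file=%r effective=%r", key, merged[key], value)
        merged[key] = value
    return merged
```

Keeping the override set small and explicit is what makes the logged conflicts useful; a long or dynamic override list would bury them.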

Debuggability requirement

For each job, the backend should make it easy to inspect:

  • config file path used
  • whether config loading succeeded
  • the key effective yt-dlp options after the merge
  • final normalized job result

This exists specifically because the current config behavior has been difficult to reason about.
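
One possible shape for that per-job inspection data is a small record that serializes to a single JSON log line. `JobDebugRecord` and its field names are hypothetical and simply mirror the four items listed above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class JobDebugRecord:
    """Hypothetical per-job record of what the backend actually used."""

    config_path: str
    config_loaded: bool
    effective_options: Dict[str, Any] = field(default_factory=dict)
    result: Optional[str] = None

    def to_json(self) -> str:
        # default=repr keeps values that JSON cannot encode visible in the log.
        return json.dumps(asdict(self), sort_keys=True, default=repr)
```

Emitting one such line per job gives a single place to look when a download behaves differently from what the config file suggests.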

Config hygiene expectations

The default config should be safe for real downloads in production-like use.

That means test-only settings such as `simulate` should not remain enabled in the default runtime path unless the backend intentionally supports an explicit dry-run mode.
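
A startup check along these lines could catch a leftover test flag before any real job runs. This is a sketch. It assumes the config file holds command-line style options with `#` comments, and it only recognizes the standalone `-s` and `--simulate` spellings, not combined short flags. `find_test_only_flags` is a hypothetical helper.

```python
import shlex
from typing import List

# yt-dlp's simulate flag in its short and long form.
TEST_ONLY_FLAGS = {"-s", "--simulate"}


def find_test_only_flags(config_text: str) -> List[str]:
    """Return any test-only flags present in the config text."""
    # comments=True drops everything after an unquoted '#'.
    tokens = shlex.split(config_text, comments=True)
    return [token for token in tokens if token in TEST_ONLY_FLAGS]
```

The backend could refuse to start, or at least log a warning, when this returns a non-empty list and no explicit dry-run mode was requested.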

Recommended Repo Shape

One reasonable first cut:

youdis/
  README.md
  default-yt-dlp.conf
  youdis/
    __init__.py
    api.py
    config.py
    models.py
    worker.py
    ytdlp_backend.py
    adapters/
      __init__.py
      discord.py
  docs/
    architecture-v2.md

Notes:

  • `ytdlp_backend.py` owns yt-dlp integration and result normalization.
  • `worker.py` owns active-job state and lifecycle.
  • `api.py` exposes the minimal HTTP seam.
  • `adapters/discord.py` becomes a thin client of the backend.

Framework Recommendation

FastAPI is the current recommendation for the backend seam.

Why it fits:

  • typed request and response models help define the contract cleanly
  • built-in docs make local inspection easier during early iterations
  • it stays small enough for this service if we keep the surface area disciplined

Why this is not a strong ideological choice:

  • Flask would also work if we decide the type/model ergonomics are not worth it
  • the important decision is keeping the seam small, not choosing a fashionable framework

Immediate Next Task

`2.0.1` should implement the backend skeleton with just enough real behavior to prove the seam:

  • package/module layout
  • health and version endpoints
  • single active-job state holder
  • one submit-job path
  • one current-job status path
  • yt-dlp config loading from `default-yt-dlp.conf`
  • explicit runtime overrides for request URL, progress hooks, and cancel behavior

The Discord adapter should remain unchanged until that backend skeleton can accept a job and report status coherently.