enforce single-job blocking

2026-03-31 13:45:17 -04:00
parent 033d9dd167
commit 667b06fe4a
3 changed files with 278 additions and 76 deletions
--- a/pm/task-sample.org
+++ b/pm/task-sample.org
@@ -0,0 +1,22 @@
 #+title: Task Log
 #+updated: [2026-03-18 Wed 14:19]
 Use the template below, which should be a top-level org-mode header.
 * [ ] M.m.m: Task Title (estimate # commits)
 replace the old observed/canonical workflow with a review-first pipeline that groups normalized rows only during review/combine and links them to catalog items
 ** Acceptance Criteria
 1. Criterion
   - expanded data
 2. Criterion 
 - pm note: amplifying information
 ** evidence
 - commit: abc123, bcd234
 - tests: 
 - datetime: [2026-03-18 Wed 14:15]
 ** notes    
 - explanation of work done, decisions made, reasoning
--- a/pm/tasks.org
+++ b/pm/tasks.org
@@ -0,0 +1,123 @@
 #+title: Youdis Task Log 
 #+updated: [2026-03-31 Tue 08:00]
 * [X] 1.1.1: stabilize youdis core bot behavior (estimate 3 commits)
 refactor the current `youdis.py` flow so authorization, download execution, and user feedback are correct and predictable without changing the product shape.  keep this narrowly scoped to correctness and maintainability; do not redesign into a queueing platform yet.  preserve archive-first behavior and dm status updates; do not add new infrastructure dependencies and prefer boring explicit state over clever concurrency.
 ** acceptance criteria
 1. initialize and load `/config/users.json` safely in all cases
   - create parent dirs before touch/open
   - ensure `authorized_users` always has a valid default
   - normalize stored ids to a single type
 2. fix command-path correctness for `/youtube`, `/adduser`, and `/removeuser`
   - authorized users can successfully invoke downloads
   - add/remove user commands persist changes correctly
   - remove broken/incomplete code paths
 3. duplicate prevention relies on archive.txt
 ** pm notes
 ** evidence
 - commit: 033d9dd
 - tests: ~python3 -m py_compile ./youdis.py~
 - datetime: [2026-03-31 Tue 13:28]
 ** notes
 - store Discord user ids as strings in `users.json`
 - duplicate prevention should continue to rely on `archive.txt`, not inferred hook errors
 * [ ] 1.1.2: remove global mutable download state and define single-job semantics (estimate 2 commits)
 eliminate shared mutable hook state and make concurrent behavior explicit, even if the initial policy is just "one active job at a time."  don't build a scheduler; ok if simplest outcome is single active job with clear busy message.  cancellation can be coarse if yt-dlp/process boundaries make graceful stop annoying
 ** acceptance criteria
 1. improve runtime handling for downloads
   - replace brittle thread/join pattern with a simpler async-safe execution path
   - catch and report real yt-dlp failures
   - avoid misleading "already exists" error assumptions
 2. progress reporting is isolated per request
   - no module-level mutable title state shared across jobs
   - hooks derive state from request-local context
 3. active-job behavior is explicit
   - either reject a second request while busy or implement a minimal tracked active job
   - user-facing response explains current behavior
 4. `/interrupt` is either implemented minimally or downgraded honestly
   - no fake command implying cancellation works when it does not
   - command behavior matches implementation
 ** evidence
 - commit:
 - tests:
 - datetime:
 ** notes
 - verify slash-command response patterns against the `interactions` library while touching runtime flow
 * [ ] 1.1.3: move static yt-dlp behavior into config and shrink python surface area (estimate 2 commits)
 shift stable downloader options into `default-yt-dlp.conf` so the bot code only handles dynamic inputs and orchestration.  optimize for inspectability and low-friction manual ops.  keep output naming durable enough for plex/plain-file use.  avoid duplicating config values across code and conf.
 ** acceptance criteria
 1. separate static vs dynamic yt-dlp options cleanly
   - stable defaults live in `default-yt-dlp.conf`
   - python injects only request-specific/runtime values
 2. preserve archive and output behavior
   - `archive.txt` remains the duplicate-prevention mechanism
   - output paths remain stable and browseable
 3. document config ownership
   - clarify which settings belong in config vs code
   - make future yt-dlp tuning possible without major python edits
 ** evidence
 - commit:
 - tests:
 - datetime:
 ** notes
 * [ ] 1.1.4: simplify image/build/update workflow around manual ops (estimate 3 commits)
 reduce repo cruft from the gitea-runner/nightly-update experiment and replace it with explicit manual update/rebuild mechanics.
 ** acceptance criteria
 1. define a manual update path for yt-dlp and app image lifecycle
   - document or script manual `git pull`, rebuild, and redeploy
   - remove or quarantine brittle auto-update assumptions
 2. review and simplify `update-ytdlp.sh`, workflow yaml, and weekly restart artifacts
   - keep only artifacts that serve the current manual-ops model
   - delete or mark deprecated anything tied to abandoned automation paths
 3. retain unraid deployment viability
   - container can still be rebuilt and redeployed cleanly on jeeves
   - resulting flow is understandable without rereading old ci experiments
 - pm note: weekly restart is presumed suspect until proven necessary
 ** evidence
 - commit:
 - tests:
 - datetime:
 ** notes
 - do not let runner/workflow complexity dominate a small bot
 - prefer explicit version pinning or manual binary refresh over magical nightlies
 * [ ] 1.1.5: clean up packaging/deployment artifacts for unraid consumption (estimate 2 commits)
 make the dockerfile, run script, and unraid-ca template consistent with the refactored app so deployment is less of a ritual ordeal.
 ** acceptance criteria
 1. align docker/runtime assumptions
   - paths like `/config` and `/downloads` are consistent across code, scripts, and container metadata
   - env vars are documented and validated
 2. review deployment artifacts for drift
   - `dockerfile`, `run-youdis.sh`, and `unraid-ca-template.xml` reflect current behavior
   - remove stale references and dead assumptions
 3. make fresh deployment understandable
   - a new deploy on unraid is possible without reconstructing tribal knowledge from old files
 - pm note: this is packaging polish after core correctness, not before
 ** evidence
 - commit:
 - tests:
 - datetime:
 ** notes
 - keep container surface area small
 - optimize for “future me can redeploy this without cursing past me too hard”
--- a/youdis.py
+++ b/youdis.py
@@ -48,32 +48,69 @@ def load_authorized_users():
    return authorized_users
 authorized_users = load_authorized_users()
-    
+
-title = ''
+active_job_lock = threading.Lock()
-
+active_job = None
-async def send_message(ctx, message):
+
-    await ctx.author.send(message)
+async def send_message(ctx, message):
-
+    await ctx.author.send(message)
-def download_video(url, options):
+
-    with yt_dlp.YoutubeDL(options) as ydl:
+def claim_active_job(job):
-        ydl.download(url)
+    global active_job
-        
+    with active_job_lock:
-def create_hook(ctx,loop):
+        if active_job is not None:
-    def hook(d):
+            return active_job
-        global title
+        active_job = job
-        status = d.get('status')
+        return None
-        if status == 'error':
+
-            msg = f'error; video probably already exists, have you checked archive.txt'
+def get_active_job():
-            asyncio.run_coroutine_threadsafe(send_message(ctx,msg),loop)
+    with active_job_lock:
-        elif d.get('info_dict').get('title') != title:
+        return active_job
-            title = d.get('info_dict').get('title')
+
-            playlist_index = d.get('info_dict').get('playlist_index')
+def clear_active_job(job):
-            playlist_count = d.get('info_dict').get('playlist_count')
+    global active_job
-            filename = d.get('filename')
+    with active_job_lock:
-            url = d.get('info_dict').get('webpage_url')
+        if active_job is job:
-            msg = f'{status} {playlist_index} of {playlist_count}: {filename} <{url}>'
+            active_job = None
-            asyncio.run_coroutine_threadsafe(send_message(ctx,msg),loop)
+
-    return hook
+def download_video(url, options):
    with yt_dlp.YoutubeDL(options) as ydl:
        ydl.download(url)
 def create_hook(ctx, loop, cancel_event):
    seen_updates = set()
    def hook(d):
        if cancel_event.is_set():
            raise yt_dlp.utils.DownloadCancelled('download canceled by /interrupt')
        status = d.get('status')
        info = d.get('info_dict') or {}
        if status not in {'downloading', 'finished'}:
            return
        filename = d.get('filename') or info.get('_filename') or info.get('title')
        update_key = (status, filename)
        if update_key in seen_updates:
            return
        seen_updates.add(update_key)
        playlist_index = info.get('playlist_index')
        playlist_count = info.get('playlist_count')
        url = info.get('webpage_url')
        prefix = status
        if playlist_index and playlist_count:
            prefix = f'{status} {playlist_index} of {playlist_count}'
        msg = f'{prefix}: {filename}'
        if url:
            msg = f'{msg} <{url}>'
        asyncio.run_coroutine_threadsafe(send_message(ctx, msg), loop)
    return hook
@interactions.slash_command(name="youtube",description="download video from youtube to server")
@interactions.slash_option(
@@ -84,52 +121,77 @@ def create_hook(ctx,loop):
 )
 async def youtube(ctx: interactions.SlashContext, url:str):
    print(f'{ctx.author.id} requested {url}')
    loop = asyncio.get_running_loop()
    hook = create_hook(ctx,loop)
    # use api_to_cli and paste cli options to get the output you need
    yoptions = {
        'format':'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best',
        'fragment_tries': 10,
        'restrictfilenames':True,
        'paths': {'home':'/downloads'},
        'retries':10,
        'writeinfojson':False,
        'allow_playlist_files':True,
        'noplaylist':True,
        'download_archive':'/config/archive.txt',
        'progress_hooks':[hook],
        'outtmpl': '%(uploader)s/%(playlist_title)s/%(playlist_index)s%(playlist_index& - )s%(title)s.%(ext)s',
        'outtmpl_na_placeholder':'',
    }
    # check that user is authorized
    if str(ctx.author.id) not in authorized_users:
        if ctx.author.id == 127831327012683776:
            await ctx.author.send('potato stop')
        await ctx.author.send('you are not authorized to use this command. message my owner to be added.')
-        return
+        return
-    else:
+
-        await ctx.channel.send(f'Downloading from <{url}>. Status updates via DM.')
+    loop = asyncio.get_running_loop()
-        #await ctx.defer()  #if you need up to 15m to respond
+    cancel_event = threading.Event()
-
+    hook = create_hook(ctx, loop, cancel_event)
-        # 1/2 - download in separate thread, else progress_hook blocks downstream async ctx.send
+    job = {
-        download_thread = threading.Thread(target=download_video, args=(url,yoptions))
+        'requester_id': str(ctx.author.id),
-        download_thread.start()
+        'request_url': url,
-        await asyncio.to_thread(download_thread.join)
+        'cancel_event': cancel_event,
-
+    }
-        # 2/2 - replace the above with this next try:
+    existing_job = claim_active_job(job)
-        #try:
+    if existing_job:
-        #    await asyncio.to_thread(download_video, url, yoptions)
+        await ctx.author.send(
-        #except Exception as e:
+            f'already downloading for <@{existing_job["requester_id"]}>. '
-        #    print(f"download failed: {e}")
+            'single-job mode is enabled right now; try again after it finishes.'
-        #    await ctx.author.send(f"download failed: {str(e)}")
+        )
-
+        return
-
+
-@interactions.slash_command(name="interrupt",description="cancel current job")
+    # use api_to_cli and paste cli options to get the output you need
-@interactions.check(interactions.is_owner())
+    yoptions = {
-async def _interrupt(ctx):
+        'format':'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best',
-    # interrupt here
+        'fragment_tries': 10,
-    print('interrupting current job - not implemented')
+        'restrictfilenames':True,
-    await ctx.author.send('interrupting current job - not implemented')
+        'paths': {'home':'/downloads'},
        'retries':10,
        'writeinfojson':False,
        'allow_playlist_files':True,
        'noplaylist':True,
        'download_archive':'/config/archive.txt',
        'progress_hooks':[hook],
        'outtmpl': '%(uploader)s/%(playlist_title)s/%(playlist_index)s%(playlist_index& - )s%(title)s.%(ext)s',
        'outtmpl_na_placeholder':'',
    }
    await ctx.channel.send(f'Downloading from <{url}>. Status updates via DM. Single-job mode is enabled.')
    try:
        await asyncio.to_thread(download_video, url, yoptions)
    except yt_dlp.utils.DownloadCancelled as exc:
        print(f'download canceled: {exc}')
        await ctx.author.send(f'download canceled: {exc}')
    except yt_dlp.utils.DownloadError as exc:
        print(f'download failed: {exc}')
        await ctx.author.send(f'download failed: {exc}')
    except Exception as exc:
        print(f'unexpected download failure: {exc}')
        await ctx.author.send(f'unexpected download failure: {exc}')
    else:
        await ctx.author.send(f'download complete for <{url}>')
    finally:
        clear_active_job(job)
@interactions.slash_command(name="interrupt",description="cancel current job")
@interactions.check(interactions.is_owner())
 async def _interrupt(ctx):
    job = get_active_job()
    if not job:
        await ctx.author.send('no active download to interrupt')
        return
    job['cancel_event'].set()
    print(f'interrupt requested for {job["request_url"]}')
    await ctx.author.send(
        f'interrupt requested for <{job["request_url"]}>; '
        'cancellation is coarse and will stop on the next yt-dlp progress update'
    )
@interactions.slash_command(name="adduser",description="authorize target user")
@interactions.slash_option(
@@ -166,13 +228,8 @@ async def _removeuser(ctx: interactions.SlashContext, user:interactions.OptionTy
        await ctx.author.send(f'deauthorized {user.mention}')
    else:
        await ctx.author.send(f'{user.mention} is not currently authorized')
-
+
-async def dl_hook(d):
+api_token = getenv('api_token')
-    msg = f'{d["status"]} {d["filename"]}'
+if not api_token:
-    print(msg)
+    raise ValueError('API token not set. Retrieve from your Discord bot.')
    await ctx.author.send(msg)
 api_token = getenv('api_token')
 if not api_token:
    raise ValueError('API token not set. Retrieve from your Discord bot.')
 bot.start(api_token)