add usajobs.py cli with full api, filter, display, and export pipeline
milestones 1-6 complete: fetch/cache from data.usajobs.gov, local filters for pay plan/grade/salary/location, rich table output, questionary selection prompt, and org-mode export. key field mappings resolved from live api inspection (JobGrade[0].Code for pay plan, UserArea.Details for grades and clearance, city-part location matching due to api returning full state names).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2
.env.example
Normal file
@@ -0,0 +1,2 @@
USAJOBS_EMAIL=your@email.gov
USAJOBS_KEY=your-api-key-here
31
.gitignore
vendored
Normal file
@@ -0,0 +1,31 @@
.env
.cache/
exports/
*.sqlite
*.sqlite3
__pycache__/
.pytest_cache/
.venv/
venv/
dist/
build/
*.pyc

# android
android/.gradle/
android/local.properties
android/**/build/
android/captures/
*.apk
*.aab
*.hprof

# ide
.idea/
.vscode/
*.iml

# os and emacs
.DS_Store
Thumbs.db
/archive
5
requirements.txt
Normal file
@@ -0,0 +1,5 @@
click>=8.1
requests>=2.31
rich>=13.0
questionary>=2.0
python-dotenv>=1.0
321
tasks.org
Normal file
@@ -0,0 +1,321 @@
#+title: USAJobs Tasks
#+startup: overview
#+date: [2026-05-18 Mon 14:36]

* Template
create new tasks in this format:
title number is milestone.task, with estimated commits in parens
#+begin_src org
* [] 1.1: task title (1)
instructions

** acceptance criteria
1.
   1.
   2.
2.

** notes
- document what you did
- include decisions and instructions
- when done,

** evidence
- commit: like so: beb5cf4 (AC1-2), e7df0b2 (AC3-6)
- tests: describe tests here so another user can run and validate
- datetime: include timestamp eg [2026-05-18 Mon 14:37]
#+end_src

* open questions
** clearance param shape
- api param name guess: ~SecurityClearances~ (unconfirmed, passed through but not tested)
- response field: ~UserArea.Details.SecurityClearance~ is a plain text string e.g. "Sensitive Compartmented Information"
- numeric values (3, 4) mapping to api codes is still unknown — no local filtering yet
- action: test a live call with ~--clearance~ values set to confirm param name and accepted values

** DONE series api param name
- param name is ~JobCategoryCode~, semicolon-delimited values confirmed working
- e.g. ~JobCategoryCode=2210;0340~
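A minimal sketch of how repeated ~--series~ values could be folded into the single semicolon-delimited ~JobCategoryCode~ value noted above. The helper name ~join_codes~ is hypothetical and is not part of usajobs.py.

```python
def join_codes(values):
    """Join repeated CLI values into one semicolon-delimited string (hypothetical helper)."""
    # Skip empty entries so a stray "" never produces ";;" in the query string.
    return ";".join(v.strip() for v in values if v and v.strip())


params = {}
series = ("2210", "0340")
if series:
    # Only add the key when at least one series code was supplied.
    params["JobCategoryCode"] = join_codes(series)
print(params)
```

Leaving the key out when no series is given keeps the request identical to an unfiltered search, which matches the "None/empty args are omitted" criterion in task 2.1.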

** DONE location filtering — decided
- api returns full state names e.g. "Washington, District of Columbia", not abbreviations
- filter matches on city part only: split user input on "," and check first token
- "Washington, DC" → "washington" in "Washington, District of Columbia" ✓
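The city-part decision above can be sketched as a small predicate. This is an illustration under the stated assumption that the API location is a plain string; ~city_matches~ is a hypothetical name.

```python
def city_matches(user_location, api_location):
    """Return True when the city part of the user's input appears in the API location."""
    # Keep only the text before the first comma: "Washington, DC" -> "washington".
    city = user_location.split(",")[0].strip().lower()
    # Substring match, case-insensitive, against the API's full location name.
    return city in api_location.lower()


print(city_matches("Washington, DC", "Washington, District of Columbia"))
print(city_matches("Denver, CO", "Washington, District of Columbia"))
```

Because this is a substring test, a short city name can match a longer one that contains it (for example "york" inside "New York"), so the filter is deliberately loose.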

* milestone 1 — setup and scaffolding

* [x] 1.1: project scaffold (1)
create usajobs.py, requirements.txt, .env.example, .gitignore

** acceptance criteria
1. running ~python usajobs.py --help~ prints top-level help without error
2. requirements.txt installs cleanly with ~pip install -r requirements.txt~
3. .env.example documents USAJOBS_EMAIL and USAJOBS_KEY
4. .gitignore covers .cache/, exports/, .env

** notes
- entrypoint: usajobs.py with a click group ~cli~ and subcommand ~search~
- all functions implemented (not stubbed) — milestones 1-6 done in one pass

** evidence
- commit: see initial usajobs commit
- tests: ~python usajobs.py --help~ and ~python usajobs.py search --help~ both pass
- datetime: [2026-05-18 Mon 15:00]

* [x] 1.2: env validation (1)
implement get_credentials() and wire startup check into search command

** acceptance criteria
1. running ~search~ without USAJOBS_EMAIL set prints a clear error and exits nonzero
2. running ~search~ without USAJOBS_KEY set prints a clear error and exits nonzero
3. both vars present → no error, continues to api call

** notes
- get_credentials() -> tuple[str, str]
- uses click.echo to stderr + sys.exit(1)
- load_dotenv() called at module level

** evidence
- commit: see initial usajobs commit
- datetime: [2026-05-18 Mon 15:00]

* milestone 2 — api and data layer

* [x] 2.1: build_params() (1)
construct api query dict from cli args

** acceptance criteria
1. --series 2210 --series 0340 produces correct semicolon param
2. --clearance 3 --clearance 4 produces correct semicolon param (placeholder value ok)
3. --pay-plan gs --pay-plan gg produces correct semicolon param
4. None/empty args are omitted from returned dict
5. always includes: fields=full, resultsperpage=500, sortfield=opendate, sortdirection=desc

** notes
- series param confirmed as ~JobCategoryCode~ (verified via live call)
- pay plan param: ~PayPlanCode~ (best guess, not confirmed by api — filtering is local anyway)
- clearance param: ~SecurityClearances~ (best guess, unconfirmed — see open questions)

** evidence
- commit: see initial usajobs commit
- tests: ~--debug~ flag prints params; verified ~JobCategoryCode=2210~ returns correct results
- datetime: [2026-05-18 Mon 15:00]

* [x] 2.2: fetch_page() with caching (1)
fetch one page from the api with disk cache

** acceptance criteria
1. first call hits network, writes json to .cache/usajobs/<hash>_p<n>.json
2. second call with same params reads from cache, does not hit network
3. --offline with no cache raises a clear error
4. --offline with cache returns cached data
5. response is returned as parsed dict, cache file is never mutated

** notes
- cache key: sha256 of sorted(params.items()) + page string, first 16 hex chars
- cache file written with json.dumps then read back with json.loads — never mutated
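The cache-key rule in the notes above can be sketched as follows. It follows the same recipe as ~_cache_path~ in usajobs.py; the name ~cache_path~ and the sample parameters are for illustration only.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, params, page):
    """Build a deterministic cache file path from query params and page number."""
    # Sorting the items makes the key independent of dict insertion order.
    key_src = str(sorted(params.items())) + f"|p{page}"
    # Keep the first 16 hex chars of the sha256 digest for a short filename.
    digest = hashlib.sha256(key_src.encode()).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}_p{page}.json"


a = cache_path(".cache/usajobs", {"Fields": "Full", "ResultsPerPage": 500}, 1)
b = cache_path(".cache/usajobs", {"ResultsPerPage": 500, "Fields": "Full"}, 1)
print(a.name, a == b)
```

Since the key is derived only from params and page, there is no expiry: a stale page stays cached until the file is deleted by hand.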

** evidence
- commit: see initial usajobs commit
- tests: ran twice; second run served from cache (no network calls). ~--offline~ with cache returns data.
- datetime: [2026-05-18 Mon 15:00]

* [x] 2.3: fetch_all() (1)
page through api results up to --limit

** acceptance criteria
1. stops fetching when total collected >= limit
2. stops fetching when api returns no more results
3. returns flat list of raw job dicts
4. --debug prints total fetched before returning

** notes
- totalcount field confirmed: ~SearchResult.SearchResultCountAll~
- debug output shows per-page count, running total, and api-reported total
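A sketch of the paging loop and its three stop conditions (limit reached, empty page, api-reported total reached). The page fetcher is passed in as a callable so the loop can be exercised without network access; ~collect_pages~ and ~fake_fetch_page~ are hypothetical names, and the sample items are invented.

```python
def collect_pages(fetch_page, limit):
    """Collect items page by page until the limit, an empty page, or the reported total."""
    collected = []
    page = 1
    while len(collected) < limit:
        result = fetch_page(page).get("SearchResult", {})
        items = result.get("SearchResultItems", [])
        if not items:
            break  # api returned no more results
        collected.extend(items)
        total = int(result.get("SearchResultCountAll", 0))
        if len(collected) >= total:
            break  # everything the api reports has been fetched
        page += 1
    # The last page may overshoot the limit, so trim before returning.
    return collected[:limit]


def fake_fetch_page(page):
    # Invented two-page response with three items in total.
    pages = {1: ["a", "b"], 2: ["c"]}
    return {
        "SearchResult": {
            "SearchResultItems": pages.get(page, []),
            "SearchResultCountAll": 3,
        }
    }


print(collect_pages(fake_fetch_page, limit=10))
print(collect_pages(fake_fetch_page, limit=2))
```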

** evidence
- commit: see initial usajobs commit
- tests: ~--debug~ shows "page 1: got 132, running total 132, api reports 132 total"
- datetime: [2026-05-18 Mon 15:00]

* [x] 2.4: normalize_job() (1)
flatten raw api shape into a stable dict

** acceptance criteria
1. all required fields present in output (None if absent): document_id, title, agency, department, pay_plan, low_grade, high_grade, salary_min, salary_max, location, close_date, travel, clearance, clearance_text_match, url, raw_posting_text
2. handles MatchedObjectDescriptor wrapper correctly
3. handles UserArea.Details for extended fields
4. raw_posting_text concatenates: Summary, Duties, Requirements, Qualifications, Evaluations, Other Information, Key Requirements
5. strips html tags from raw_posting_text

** notes
- pay_plan from ~JobGrade[0].Code~ (e.g. "GS") — NOT PositionSchedule (that's work schedule)
- grades from ~UserArea.Details.LowGrade~ / ~HighGrade~
- salary from ~PositionRemuneration[0].MinimumRange~ / ~MaximumRange~ (strings, cast to int)
- clearance from ~UserArea.Details.SecurityClearance~ (plain text string)
- url from ~ApplyURI[0]~ with ~PositionURI~ as fallback
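The field mappings listed above can be illustrated with a reduced extractor. This is a sketch, not the full ~normalize_job~: ~extract_core_fields~ is a hypothetical name, and the sample dict is invented to match the field names in the notes, not copied from a real posting.

```python
def extract_core_fields(mod):
    """Pull a few normalized fields out of one MatchedObjectDescriptor-style dict."""
    details = (mod.get("UserArea") or {}).get("Details") or {}
    job_grade = (mod.get("JobGrade") or [{}])[0]
    pay = (mod.get("PositionRemuneration") or [{}])[0]
    apply_uris = mod.get("ApplyURI") or []

    def to_int(value):
        # Grades and salary bounds arrive as strings; tolerate missing or malformed values.
        try:
            return int(float(value))
        except (TypeError, ValueError):
            return None

    return {
        "pay_plan": (job_grade.get("Code") or "").upper() or None,
        "low_grade": to_int(details.get("LowGrade")),
        "high_grade": to_int(details.get("HighGrade")),
        "salary_min": to_int(pay.get("MinimumRange")),
        "salary_max": to_int(pay.get("MaximumRange")),
        "clearance": details.get("SecurityClearance") or "",
        # Prefer the first apply link, fall back to the posting link.
        "url": apply_uris[0] if apply_uris else mod.get("PositionURI", ""),
    }


# Invented sample shaped like the fields named in the notes above.
sample = {
    "JobGrade": [{"Code": "gs"}],
    "PositionRemuneration": [{"MinimumRange": "143736.0", "MaximumRange": "187093.0"}],
    "PositionURI": "https://example.invalid/job/1",
    "UserArea": {"Details": {"LowGrade": "14", "HighGrade": "15", "SecurityClearance": "Secret"}},
}
fields = extract_core_fields(sample)
print(fields)
```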

** evidence
- commit: see initial usajobs commit
- tests: org output shows correct grade, salary, clearance, posting text for real jobs
- datetime: [2026-05-18 Mon 15:00]

* milestone 3 — filtering

* [x] 3.1: passes_filters() (1)
local filter predicate applied after api fetch

** acceptance criteria
1. job with pay_plan not in allowed list → excluded
2. job with low_grade < grade_min → excluded
3. job with high_grade > grade_max → excluded
4. job with salary_max < the --salary-min threshold (when --salary-min is set) → excluded
5. job with salary_max absent but salary_min >= the --salary-min threshold → included
6. job whose location does not contain the --location substring (case-insensitive) → excluded
7. --debug prints count before and after filtering

** notes
- clearance filter skipped (open question)
- salary_min_k * 1000 before comparison
- location: match city part only (before first comma) due to api returning full state names
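The salary rule from acceptance criteria 4 and 5 can be sketched in isolation. ~salary_ok~ is a hypothetical name; the branch order mirrors the salary check in ~passes_filters~, where the posting's maximum is compared first and its minimum is used only when the maximum is absent.

```python
def salary_ok(salary_min, salary_max, salary_min_k):
    """Return True when a job's salary range satisfies a threshold given in thousands."""
    if salary_min_k is None:
        return True  # no salary filter requested
    threshold = salary_min_k * 1000
    if salary_max is not None:
        # A job qualifies if the top of its range reaches the threshold.
        return salary_max >= threshold
    if salary_min is not None:
        # No maximum published: fall back to the bottom of the range.
        return salary_min >= threshold
    return True  # no salary data at all: keep the job rather than hide it


print(salary_ok(120000, 160000, 150))
print(salary_ok(100000, 140000, 150))
```

Comparing against the maximum keeps wide ranges visible even when their starting pay is below the threshold.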

** evidence
- commit: see initial usajobs commit
- tests: --grade-min 15 --grade-max 15 → only GS-15 results; --salary-min 150 → all jobs >= $150k
- datetime: [2026-05-18 Mon 15:13]

* milestone 4 — display

* [x] 4.1: render_table() (1)
print filtered results as a rich table

** acceptance criteria
1. columns: idx, title, agency, grade, salary, location, close date, clearance, url
2. title truncated to ~50 chars in table
3. salary formatted as "$Xk–$Yk" (or "$Xk" if max absent)
4. grade formatted as "GS-15" or "GG-14/15" if low != high
5. empty results prints a message and exits cleanly

** notes
- using ASCII dash (-) not en-dash for salary range (Windows cp1252 compat)
- ellipsis uses "..." not unicode "…" for same reason
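A sketch of the ASCII-only formatting described in the notes above. The function names are hypothetical stand-ins for the ~_fmt_salary~, ~_fmt_grade~ and ~_trunc~ helpers in usajobs.py, and the sample figures are invented.

```python
def fmt_salary(sal_min, sal_max):
    """Format a salary range in thousands using an ASCII dash."""
    if sal_min is None:
        return "n/a"
    lo = f"${sal_min // 1000}k"
    return f"{lo}-${sal_max // 1000}k" if sal_max else lo


def fmt_grade(pay_plan, low, high):
    """Format pay plan and grade, e.g. GS-15 or GG-14/15."""
    pp = (pay_plan or "").upper()
    if low is None:
        return pp or "n/a"
    if high is not None and high != low:
        return f"{pp}-{low}/{high}"
    return f"{pp}-{low}"


def trunc(text, width):
    """Truncate to width characters, ending in three ASCII dots when cut."""
    text = text or ""
    return text if len(text) <= width else text[: width - 3] + "..."


row = [fmt_salary(143736, 187093), fmt_grade("gg", 14, 15), trunc("abcdefghij", 8)]
print(row)
```

Every string produced here encodes cleanly as ASCII, which is the property the cp1252 note is after.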

** evidence
- commit: see initial usajobs commit
- tests: table renders correctly in Windows terminal; "No jobs matched" shown when filters exclude all
- datetime: [2026-05-18 Mon 15:00]

* [x] 4.2: compact_job_label() (1)
one-line label for questionary checkbox rows

** acceptance criteria
1. format: "[{idx:>3}] {agency:<20} | {grade:<8} | {salary:<14} | {location:<18} | {title}"
2. title truncated to ~55 chars
3. total width fits within 120 cols on typical input

** notes
- no url in label; url stays in rich table and org output

** evidence
- commit: see initial usajobs commit
- datetime: [2026-05-18 Mon 15:00]

* milestone 5 — selection and export

* [x] 5.1: choose_jobs() (1)
questionary checkbox prompt for export selection

** acceptance criteria
1. each checkbox row uses compact_job_label(), value is document_id
2. arrows and j/k navigate, space toggles, enter confirms, ctrl-c cancels
3. --select-all preselects all rows
4. empty selection or ctrl-c returns [] without writing
5. instruction text reads: "space=mark/unmark, enter=export, ctrl-c=cancel"

** notes
- questionary.checkbox(...).ask() returns None on ctrl-c; treated as empty → no write
- use_jk_keys=True, use_emacs_keys=True per readme

** evidence
- commit: see initial usajobs commit
- tests: interactive flow needs manual terminal test — covered in 6.2
- datetime: [2026-05-18 Mon 15:00]

* [x] 5.2: make_output_path() (1)
generate timestamped export filename or use --out

** acceptance criteria
1. --out set → use that path directly
2. --out absent → exports/usajobs_<location-slug>_<filters-slug>_<yyyymmdd-hhmm>.org
3. location slug: lowercase, spaces→hyphens, punctuation stripped
4. filters slug includes: series, pay_plan, grade, salary (only what is set)
5. exports/ dir created if it does not exist

** notes
- example output: ~usajobs_washington-dc_2210_gsgg15_salary150_20260518-1513.org~
- exports/ created via mkdir(parents=True, exist_ok=True)
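The filename scheme can be sketched with the timestamp passed in, so the result is reproducible. ~location_slug~ follows the same two regex steps as ~_location_slug~ in usajobs.py; ~export_name~ is a hypothetical helper that takes an already-built filters slug.

```python
import re
from datetime import datetime


def location_slug(location):
    """Lowercase, strip punctuation, and turn whitespace runs into hyphens."""
    s = re.sub(r"[^\w\s-]", "", location.lower())
    return re.sub(r"\s+", "-", s.strip()) or "unknown"


def export_name(location, filters_slug, now):
    """Assemble the export filename from its slugs and a minute-resolution timestamp."""
    ts = now.strftime("%Y%m%d-%H%M")
    return f"usajobs_{location_slug(location)}_{filters_slug}_{ts}.org"


name = export_name("Washington, DC", "2210_gsgg15_salary150", datetime(2026, 5, 18, 15, 13))
print(name)
```

With minute resolution, two exports with the same filters inside one minute would share a filename, so the second write would replace the first.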

** evidence
- commit: see initial usajobs commit
- tests: verified filename format from two live runs with different filter combos
- datetime: [2026-05-18 Mon 15:13]

* [x] 5.3: export_org() (1)
write selected jobs to org-mode file

** acceptance criteria
1. each job entry matches the org format in readme exactly
2. shortened title strips all-caps runs where reasonable, max 80 chars
3. properties drawer contains agency, grade, close_date
4. body contains salary, location, travel, clearance (each "unknown" if absent)
5. posting block contains raw_posting_text
6. blank line between job entries
7. --dry-run prints would-export list, does not write

** notes
- _shorten_title: regex title-cases runs of 3+ consecutive all-caps words (each word is passed through str.capitalize)
- org link: [[url][link]]
- travel and clearance fall back to "unknown" if empty
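A sketch of the title shortening and link format from the notes above. The regex is the one used by ~_shorten_title~ in usajobs.py; ~shorten_title~ and ~org_link~ are hypothetical names, and the sample titles and url are invented.

```python
import re

# Three or more consecutive all-caps words (each at least two letters).
CAPS_RUN = re.compile(r"(?:[A-Z]{2,}\s+){2,}[A-Z]{2,}")


def shorten_title(title, max_len=80):
    """Soften long all-caps runs to capitalized words, then cap the length."""
    def soften(match):
        return " ".join(word.capitalize() for word in match.group(0).split())

    return CAPS_RUN.sub(soften, title)[:max_len].strip()


def org_link(url, label="link"):
    """Build an org-mode link of the form [[url][label]]."""
    return f"[[{url}][{label}]]"


print(shorten_title("SUPERVISORY IT SPECIALIST (INFOSEC)"))
print(shorten_title("IT Specialist"))
print(org_link("https://example.invalid/job/1"))
```

One visible cost of this approach is that acronyms inside a matched run lose their casing ("IT" becomes "It"), while short titles and isolated acronyms are left untouched.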

** evidence
- commit: see initial usajobs commit
- tests: spot-checked org output for first job; format matches readme spec exactly
- datetime: [2026-05-18 Mon 15:00]

* milestone 6 — cli wiring and polish

* [x] 6.1: wire search command (1)
connect all options to all functions end-to-end

** acceptance criteria
1. full example command from readme runs without error
2. --no-interactive exports all filtered jobs without questionary prompt
3. --dry-run shows selection output, writes nothing
4. --debug prints params dict + before/after filter counts
5. --offline works with populated cache

** notes
- --series / --clearance / --pay-plan all use multiple=True
- --salary-min is int in thousands; multiplied by 1000 inside passes_filters

** evidence
- commit: see initial usajobs commit
- tests: all five ACs verified via live runs with --debug and --no-interactive
- datetime: [2026-05-18 Mon 15:13]

* [] 6.2: acceptance tests (manual) (1)
validate filter correctness and edge cases

** acceptance criteria
1. --grade-min 15 --grade-max 15 → no GS/GG-13 or GS/GG-14 jobs in output
2. --salary-min 150 → all displayed jobs have max salary >= $150,000
3. ctrl-c or empty selection → no file written, clean exit
4. --offline with cache → same results as online run, no network

** notes
- run against real api with valid credentials
- document results in evidence below

** evidence
- commit:
- datetime:
542
usajobs.py
Normal file
@@ -0,0 +1,542 @@
#!/usr/bin/env python3
import hashlib
import json
import os
import re
import sys
from datetime import datetime
from pathlib import Path

import click
import questionary
import requests
from dotenv import load_dotenv
from questionary import Choice
from rich.console import Console
from rich.table import Table

load_dotenv()

console = Console()
API_URL = "https://data.usajobs.gov/api/search"


# ---------------------------------------------------------------------------
# credentials
# ---------------------------------------------------------------------------

def get_credentials() -> tuple[str, str]:
    email = os.environ.get("USAJOBS_EMAIL")
    key = os.environ.get("USAJOBS_KEY")
    missing = [v for v, val in [("USAJOBS_EMAIL", email), ("USAJOBS_KEY", key)] if not val]
    if missing:
        click.echo(f"Error: missing environment variable(s): {', '.join(missing)}", err=True)
        click.echo("Add them to your .env file or export them before running.", err=True)
        sys.exit(1)
    return email, key


# ---------------------------------------------------------------------------
# api layer
# ---------------------------------------------------------------------------

def build_params(
    location: str | None,
    radius: int | None,
    series: tuple[str, ...],
    clearance: tuple[str, ...],
    pay_plans: tuple[str, ...],
) -> dict:
    # NOTE: JobCategoryCode is confirmed against the live api (see tasks.org).
    # SecurityClearances and PayPlanCode are still best guesses pending verification.
    params: dict = {
        "Fields": "Full",
        "ResultsPerPage": 500,
        "SortField": "OpenDate",
        "SortDirection": "Desc",
    }
    if location:
        params["LocationName"] = location
    if radius is not None:
        params["Radius"] = radius
    if series:
        params["JobCategoryCode"] = ";".join(series)
    if clearance:
        params["SecurityClearances"] = ";".join(str(c) for c in clearance)
    if pay_plans:
        params["PayPlanCode"] = ";".join(p.upper() for p in pay_plans)
    return params


def _cache_path(cache_dir: Path, params: dict, page: int) -> Path:
    key_src = str(sorted(params.items())) + f"|p{page}"
    digest = hashlib.sha256(key_src.encode()).hexdigest()[:16]
    return cache_dir / f"{digest}_p{page}.json"


def fetch_page(
    params: dict,
    page: int,
    credentials: tuple[str, str],
    cache_dir: Path,
    offline: bool,
) -> dict:
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = _cache_path(cache_dir, params, page)

    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))

    if offline:
        raise click.ClickException(f"Offline mode: no cache found for page {page} ({path.name})")

    email, key = credentials
    resp = requests.get(
        API_URL,
        params={**params, "Page": page},
        headers={
            "Host": "data.usajobs.gov",
            "User-Agent": email,
            "Authorization-Key": key,
        },
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return data


def fetch_all(
    params: dict,
    limit: int,
    credentials: tuple[str, str],
    cache_dir: Path,
    offline: bool,
    debug: bool,
) -> list[dict]:
    collected: list[dict] = []
    page = 1
    while len(collected) < limit:
        data = fetch_page(params, page, credentials, cache_dir, offline)
        result = data.get("SearchResult", {})
        items = result.get("SearchResultItems", [])
        if not items:
            break
        collected.extend(items)
        total_available = int(result.get("SearchResultCountAll", 0))
        if debug:
            click.echo(
                f"[debug] page {page}: got {len(items)}, running total {len(collected)}, "
                f"api reports {total_available} total"
            )
        if len(collected) >= total_available:
            break
        page += 1
    if debug:
        click.echo(f"[debug] fetch complete: {len(collected)} raw jobs")
    return collected[:limit]


# ---------------------------------------------------------------------------
# normalization
# ---------------------------------------------------------------------------

def _strip_html(text: str) -> str:
    return re.sub(r"<[^>]+>", "", text or "").strip()


def _to_int(val) -> int | None:
    try:
        result = int(float(val))
        return result if result else None
    except (TypeError, ValueError):
        return None


def normalize_job(raw: dict) -> dict:
    mod = raw.get("MatchedObjectDescriptor", raw)
    details = mod.get("UserArea", {}).get("Details", {})

    # pay plan — lives in JobGrade[0].Code (e.g. "GS", "GG")
    job_grade = (mod.get("JobGrade") or [{}])[0]
    pay_plan: str | None = job_grade.get("Code") or None
    if pay_plan:
        pay_plan = pay_plan.upper()

    # grades
    low_grade = _to_int(details.get("LowGrade") or mod.get("JobGradeLow"))
    high_grade = _to_int(details.get("HighGrade") or mod.get("JobGradeHigh"))

    # salary
    salary_min = salary_max = None
    remuneration = mod.get("PositionRemuneration") or []
    if remuneration:
        r = remuneration[0]
        salary_min = _to_int(r.get("MinimumRange"))
        salary_max = _to_int(r.get("MaximumRange"))

    # location: only the first PositionLocation entry is used; any others are ignored
    locations = mod.get("PositionLocation") or []
    if locations:
        location = locations[0].get("LocationName", "")
    else:
        location = ""

    # url
    apply_uris = mod.get("ApplyURI") or []
    url = apply_uris[0] if apply_uris else mod.get("PositionURI", "")

    # clearance — shape TBD; store raw text for now
    clearance_raw = details.get("SecurityClearance") or details.get("Clearances") or ""
    if isinstance(clearance_raw, list):
        clearance_raw = "; ".join(str(x) for x in clearance_raw)

    # close date — trim to YYYY-MM-DD
    close_date = (mod.get("ApplicationCloseDate") or "")[:10]

    # raw posting text
    section_keys = [
        ("Summary", ["JobSummary"]),
        ("Duties", ["MajorDuties", "Duties"]),
        ("Requirements", ["Requirements"]),
        ("Qualifications", ["Qualifications"]),
        ("Evaluations", ["Evaluations"]),
        ("Other Information", ["OtherInformation", "OtherInfo"]),
        ("Key Requirements", ["KeyRequirements"]),
    ]
    parts: list[str] = []
    for heading, keys in section_keys:
        for k in keys:
            content = details.get(k)
            if content:
                if isinstance(content, list):
                    content = "\n".join(str(x) for x in content)
                parts.append(f"{heading}\n{_strip_html(content)}")
                break

    return {
        "document_id": raw.get("MatchedObjectId") or mod.get("MatchedObjectId", ""),
        "title": mod.get("PositionTitle", ""),
        "agency": mod.get("OrganizationName", ""),
        "department": mod.get("DepartmentName", ""),
        "pay_plan": pay_plan,
        "low_grade": low_grade,
        "high_grade": high_grade,
        "salary_min": salary_min,
        "salary_max": salary_max,
        "location": location,
        "close_date": close_date,
        "travel": details.get("TravelPercentage") or details.get("Travel") or "",
        "clearance": clearance_raw,
        "clearance_text_match": clearance_raw,
        "url": url,
        "raw_posting_text": "\n\n".join(parts),
    }


# ---------------------------------------------------------------------------
# filtering
# ---------------------------------------------------------------------------

def passes_filters(
    job: dict,
    pay_plans: tuple[str, ...],
    grade_min: int | None,
    grade_max: int | None,
    salary_min_k: int | None,
    location: str | None,
) -> bool:
    if pay_plans and job["pay_plan"] is not None:
        if job["pay_plan"].upper() not in {p.upper() for p in pay_plans}:
            return False

    if grade_min is not None and job["low_grade"] is not None:
        if job["low_grade"] < grade_min:
            return False

    if grade_max is not None and job["high_grade"] is not None:
        if job["high_grade"] > grade_max:
            return False

    if salary_min_k is not None:
        threshold = salary_min_k * 1000
        if job["salary_max"] is not None:
            if job["salary_max"] < threshold:
                return False
        elif job["salary_min"] is not None:
            if job["salary_min"] < threshold:
                return False

    if location and job["location"]:
        # match on the city part only ("Washington, DC" → "washington")
        # because the API returns full names like "Washington, District of Columbia"
        city = location.split(",")[0].strip().lower()
        if city not in job["location"].lower():
            return False

    return True


# ---------------------------------------------------------------------------
# display
# ---------------------------------------------------------------------------

def _fmt_salary(sal_min: int | None, sal_max: int | None) -> str:
    if sal_min is None:
        return "n/a"
    lo = f"${sal_min // 1000}k"
    if sal_max:
        return f"{lo}-${sal_max // 1000}k"
    return lo


def _fmt_grade(pay_plan: str | None, low: int | None, high: int | None) -> str:
    pp = (pay_plan or "").upper()
    if low is None:
        return pp or "n/a"
    if high is not None and high != low:
        return f"{pp}-{low}/{high}"
    return f"{pp}-{low}"


def _trunc(s: str, n: int) -> str:
    s = s or ""
    return s if len(s) <= n else s[: n - 3] + "..."


def render_table(jobs: list[dict]) -> None:
    if not jobs:
        console.print("[yellow]No jobs matched your filters.[/yellow]")
        return

    table = Table(show_header=True, header_style="bold cyan", box=None, pad_edge=False)
    table.add_column("#", style="dim", width=4)
    table.add_column("Title", min_width=28)
    table.add_column("Agency", min_width=16)
    table.add_column("Grade", width=9)
    table.add_column("Salary", width=14)
    table.add_column("Location", min_width=16)
    table.add_column("Closes", width=11)
    table.add_column("Clearance", min_width=12)
    table.add_column("URL")

    for idx, job in enumerate(jobs, start=1):
        table.add_row(
            str(idx),
            _trunc(job["title"], 50),
            _trunc(job["agency"], 22),
            _fmt_grade(job["pay_plan"], job["low_grade"], job["high_grade"]),
            _fmt_salary(job["salary_min"], job["salary_max"]),
            _trunc(job["location"], 20),
            job["close_date"] or "",
            _trunc(job["clearance"] or "", 16),
            job["url"] or "",
        )

    console.print(table)


def compact_job_label(job: dict, idx: int) -> str:
    grade = _fmt_grade(job["pay_plan"], job["low_grade"], job["high_grade"])
    salary = _fmt_salary(job["salary_min"], job["salary_max"])
    return (
        f"[{idx:>3}] {_trunc(job['agency'], 20):<20} | "
        f"{grade:<8} | {salary:<14} | "
        f"{_trunc(job['location'], 18):<18} | "
        f"{_trunc(job['title'], 55)}"
    )


# ---------------------------------------------------------------------------
# selection
# ---------------------------------------------------------------------------

def choose_jobs(jobs: list[dict], select_all: bool = False) -> list[dict]:
    by_id = {job["document_id"]: job for job in jobs}
    choices = [
        Choice(
            title=compact_job_label(job, idx),
            value=job["document_id"],
            checked=select_all,
        )
        for idx, job in enumerate(jobs, start=1)
    ]
    selected_ids = questionary.checkbox(
        "mark jobs to export",
        choices=choices,
        instruction="space=mark/unmark, enter=export, ctrl-c=cancel",
        use_jk_keys=True,
        use_emacs_keys=True,
    ).ask()
    if not selected_ids:
        return []
    return [by_id[job_id] for job_id in selected_ids]


# ---------------------------------------------------------------------------
# export
# ---------------------------------------------------------------------------

def _shorten_title(title: str) -> str:
    def _lower_long_caps(m: re.Match) -> str:
        words = m.group(0).split()
        return " ".join(w.capitalize() for w in words) if len(words) >= 3 else m.group(0)
    shortened = re.sub(r"(?:[A-Z]{2,}\s+){2,}[A-Z]{2,}", _lower_long_caps, title)
    return shortened[:80].strip()


def _location_slug(location: str) -> str:
|
||||||
|
s = re.sub(r"[^\w\s-]", "", location.lower())
|
||||||
|
return re.sub(r"\s+", "-", s.strip()) or "unknown"
|
||||||
|
|
||||||
|
|
||||||
|
def _filters_slug(
    series: tuple,
    pay_plans: tuple,
    grade_min: int | None,
    grade_max: int | None,
    salary_min_k: int | None,
) -> str:
    parts: list[str] = []
    if series:
        parts.append("-".join(series))
    if pay_plans:
        pp = "".join(p.lower() for p in pay_plans)
        lo, hi = grade_min, grade_max
        if lo is not None or hi is not None:
            suffix = str(lo or "") if lo == hi else f"{lo or ''}-{hi or ''}"
            parts.append(f"{pp}{suffix}")
        else:
            parts.append(pp)
    if salary_min_k:
        parts.append(f"salary{salary_min_k}")
    return "_".join(parts) or "all"


def make_output_path(
    out: str | None,
    out_dir: str,
    location: str | None,
    series: tuple,
    pay_plans: tuple,
    grade_min: int | None,
    grade_max: int | None,
    salary_min_k: int | None,
) -> Path:
    if out:
        path = Path(out)
        # create the parent directory so an explicit --out path cannot fail on a missing folder
        path.parent.mkdir(parents=True, exist_ok=True)
        return path
    exports = Path(out_dir)
    exports.mkdir(parents=True, exist_ok=True)
    loc_slug = _location_slug(location or "")
    filt_slug = _filters_slug(series, pay_plans, grade_min, grade_max, salary_min_k)
    ts = datetime.now().strftime("%Y%m%d-%H%M")
    return exports / f"usajobs_{loc_slug}_{filt_slug}_{ts}.org"


def export_org(jobs: list[dict], path: Path) -> None:
    lines: list[str] = []
    for job in jobs:
        title = _shorten_title(job["title"])
        url = job["url"] or ""
        grade = _fmt_grade(job["pay_plan"], job["low_grade"], job["high_grade"])
        salary = _fmt_salary(job["salary_min"], job["salary_max"])

        lines += [
            f"** {title} [[{url}][link]]",
            ":properties:",
            f":agency: {job['agency'] or 'unknown'}",
            f":grade: {grade}",
            f":close_date: {job['close_date'] or 'unknown'}",
            ":end:",
            "",
            f"salary: {salary}",
            f"location: {job['location'] or 'unknown'}",
            f"travel: {job['travel'] or 'unknown'}",
            f"clearance: {job['clearance'] or 'unknown'}",
            "",
            "*** posting",
            job["raw_posting_text"] or "",
            "",
        ]
    path.write_text("\n".join(lines), encoding="utf-8")


# ---------------------------------------------------------------------------
# cli
# ---------------------------------------------------------------------------

@click.group()
def cli() -> None:
    pass


@cli.command()
@click.option("--location", default=None, help="Location name (e.g. 'Washington, DC')")
@click.option("--radius", default=None, type=int, help="Search radius in miles")
@click.option("--series", multiple=True, help="Occupational series code, repeatable")
@click.option("--clearance", multiple=True, help="Clearance level code, repeatable")
@click.option("--pay-plan", "pay_plans", multiple=True, default=("GS", "GG"), show_default=True)
@click.option("--grade-min", default=None, type=int, help="Min grade (local filter)")
@click.option("--grade-max", default=None, type=int, help="Max grade (local filter)")
@click.option("--salary-min", "salary_min_k", default=None, type=int,
              help="Min salary in thousands, e.g. 150 = $150,000 (local filter)")
@click.option("--limit", default=100, show_default=True, help="Max jobs to fetch")
@click.option("--out-dir", default="exports", show_default=True)
@click.option("--out", default=None, help="Explicit output path (overrides --out-dir)")
@click.option("--cache-dir", default=".cache/usajobs", show_default=True)
@click.option("--interactive/--no-interactive", default=True, show_default=True)
@click.option("--select-all", is_flag=True, help="Preselect all jobs in picker")
@click.option("--dry-run", is_flag=True, help="Show export list without writing")
@click.option("--offline", is_flag=True, help="Read from cache only, no network")
@click.option("--debug", is_flag=True, help="Print params and filter counts")
def search(
    location, radius, series, clearance, pay_plans,
    grade_min, grade_max, salary_min_k,
    limit, out_dir, out, cache_dir,
    interactive, select_all, dry_run, offline, debug,
) -> None:
    credentials = get_credentials()
    params = build_params(location, radius, series, clearance, pay_plans)

    if debug:
        click.echo(f"[debug] api params: {json.dumps(params, indent=2)}")

    raw_jobs = fetch_all(params, limit, credentials, Path(cache_dir), offline, debug)
    jobs = [normalize_job(r) for r in raw_jobs]

    if debug:
        click.echo(f"[debug] before local filter: {len(jobs)}")

    jobs = [j for j in jobs if passes_filters(j, pay_plans, grade_min, grade_max, salary_min_k, location)]

    if debug:
        click.echo(f"[debug] after local filter: {len(jobs)}")

    render_table(jobs)

    if not jobs:
        return

    if not interactive:
        selected = jobs
    else:
        selected = choose_jobs(jobs, select_all=select_all)

    if not selected:
        click.echo("Nothing selected. Exiting without writing.")
        return

    if dry_run:
        click.echo(f"[dry-run] would export {len(selected)} job(s):")
        for j in selected:
            click.echo(f" {_trunc(j['title'], 70)} — {j['agency']}")
        return

    path = make_output_path(out, out_dir, location, series, pay_plans, grade_min, grade_max, salary_min_k)
    export_org(selected, path)
    click.echo(f"Exported {len(selected)} job(s) -> {path}")


if __name__ == "__main__":
    cli()