`usajobs.py` is a python tui for exploring data.usajobs.gov

## goal:

- query the official usajobs api
- apply strict local filters because usajobs search facets are unreliable
- show results in a readable terminal table
- let me interactively mark/unmark jobs for export
- export selected jobs to org-mode

## stack:

- python 3.11+
- click for cli args
- requests for api
- rich for table output
- questionary for v1 interactive row marking/export
- pathlib/json/csv/stdlib otherwise
- do not use typer or pick for v1
- stretch/v2: textual tui after the simple questionary flow works

## env vars:

- USAJOBS_EMAIL
- USAJOBS_KEY
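
a minimal sketch of the env var check, assuming a helper named `build_headers()` (hypothetical, it is not in the function list) that fails early with a readable message and otherwise assembles the three request headers this spec lists under search behavior. the error wording is illustrative only.

```python
import os

REQUIRED_ENV = ("USAJOBS_EMAIL", "USAJOBS_KEY")


def build_headers(env=None) -> dict[str, str]:
    """Return the request headers, or exit with a helpful message."""
    # accept an explicit mapping so the function is easy to test
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        raise SystemExit(
            "missing environment variable(s): " + ", ".join(missing)
            + "\nexport them before running usajobs.py"
        )
    return {
        "Host": "data.usajobs.gov",
        "User-Agent": env["USAJOBS_EMAIL"],
        "Authorization-Key": env["USAJOBS_KEY"],
    }
```

passing a plain dict instead of reading `os.environ` directly keeps the check testable without touching the real environment.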

### run:

`python usajobs.py search --location "Washington, DC" --radius 25 --salary-min 150 --grade-min 15 --grade-max 15 --series 2210 --series 0340 --clearance 3 --clearance 4`

### option behavior:

- `--radius 25` means 25 miles
- `--salary-min 150` means $150,000
- `--grade-min`/`--grade-max` filter locally against low/high grade
- `--series` may repeat; passed to the api as a semicolon-separated list
- `--clearance` may repeat; passed to the api as a semicolon-separated list
- `--pay-plan` may repeat; defaults to gs and gg
- `--limit` defaults to 100
- `--out-dir` defaults to exports
- `--out` is an optional explicit org output path
- `--cache-dir` defaults to .cache/usajobs
- `--interactive` / `--no-interactive`, default true
- `--select-all` preselects every row in the export picker
- `--dry-run` shows what would be exported without writing
- `--offline` reads cached json only and does not call the api
- `--debug` prints api params and counts before/after filtering
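
a rough sketch of how `build_params()` might turn the repeatable options into query parameters. the parameter names (`LocationName`, `Radius`, `RemunerationMinimumAmount`, `JobCategoryCode`, `SecurityClearanceRequired`, `Page`, and so on) are my reading of the usajobs search api and should be checked against the official docs. sending the salary floor to the api at all is an assumption; the local filter stays the strict check either way.

```python
def build_params(location, radius, salary_min, series, clearance, page=1) -> dict:
    """Map cli options to api query parameters (names are assumptions)."""
    params = {
        "LocationName": location,
        "Radius": radius,  # miles
        # --salary-min is given in thousands: 150 -> 150000
        "RemunerationMinimumAmount": None if salary_min is None else salary_min * 1000,
        # repeatable options become semicolon-separated lists
        "JobCategoryCode": ";".join(series),
        "SecurityClearanceRequired": ";".join(str(c) for c in clearance),
        # fixed request settings from this spec
        "Fields": "full",
        "ResultsPerPage": 500,
        "SortField": "opendate",
        "SortDirection": "desc",
        "Page": page,
    }
    # drop options the user did not set
    return {k: v for k, v in params.items() if v not in (None, "")}
```

for the example command in this spec, `build_params("Washington, DC", 25, 150, ["2210", "0340"], [3, 4])` would carry `JobCategoryCode="2210;0340"` and `SecurityClearanceRequired="3;4"`.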

### search behavior:

1. call https://data.usajobs.gov/api/search using official headers:
   - host: data.usajobs.gov
   - user-agent: $USAJOBS_EMAIL
   - authorization-key: $USAJOBS_KEY
2. request:
   - fields=full
   - resultsperpage=500
   - sortfield=opendate
   - sortdirection=desc
3. cache raw json response per query/page under .cache/usajobs
4. apply local filters after fetching:
   - pay plan in allowed pay plans
   - low_grade >= grade_min
   - high_grade <= grade_max
   - salary max >= salary_min, or salary min >= salary_min if max absent
   - location string contains/near requested location as available
5. output rich table with columns:
   - idx
   - title
   - agency
   - grade
   - salary
   - location
   - close date
   - clearance match
   - url
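
the local filter step could be sketched as below, operating on a normalized job dict. assumptions: grades are ints, salaries are whole dollars, `salary_min` arrives in thousands as on the command line, and a job with a missing value fails the corresponding check (the strict reading). the location check is left out because its matching rule is not pinned down here.

```python
def passes_filters(job: dict, *, pay_plans=("GS", "GG"),
                   grade_min=None, grade_max=None, salary_min=None) -> bool:
    """Return True only if the job satisfies every requested filter."""
    allowed = {p.upper() for p in pay_plans}
    if (job.get("pay_plan") or "").upper() not in allowed:
        return False

    low, high = job.get("low_grade"), job.get("high_grade")
    if grade_min is not None and (low is None or low < grade_min):
        return False
    if grade_max is not None and (high is None or high > grade_max):
        return False

    if salary_min is not None:
        floor = salary_min * 1000  # 150 -> $150,000
        # compare against salary max, falling back to salary min if max is absent
        pay = job.get("salary_max")
        if pay is None:
            pay = job.get("salary_min")
        if pay is None or pay < floor:
            return False

    return True
```

because the function only reads from `job`, it is safe to run over cached results without copying them first.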

### v1 selection/export behavior:

1. render the rich table of filtered jobs
2. below the table, open a questionary checkbox prompt:
   "mark jobs to export"
3. each checkbox choice should be a compact one-line label:
   "[12] dia | gg-15 | $167k-$191k | washington dc | information technology..."
4. value should be stable job id / document id, not table index
5. preselect nothing by default unless --select-all is passed
6. user toggles rows with space, navigates with arrows or j/k, confirms with enter
7. after confirmation, export checked jobs to org
8. selecting nothing exits without writing
9. ctrl-c cancels cleanly
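
one way the one-line label from step 3 might be built. `format_salary()` and `format_grade()` are hypothetical helpers, the `low/high` form for a grade range is a guess, and the sketch keeps the original casing instead of lowercasing. it truncates the title to 55 chars but does not enforce a total width.

```python
def _k(amount) -> str:
    """Render a dollar amount in thousands, e.g. 167000 -> $167k."""
    return f"${round(amount / 1000)}k"


def format_salary(lo, hi) -> str:
    if lo is None and hi is None:
        return "salary n/a"
    if lo is None or hi is None:
        return _k(lo if hi is None else hi)
    return f"{_k(lo)}-{_k(hi)}"


def format_grade(job: dict) -> str:
    low, high = job["low_grade"], job["high_grade"]
    span = f"{low}" if low == high else f"{low}/{high}"
    return f"{job['pay_plan']}-{span}"


def compact_job_label(job: dict, idx: int, title_width: int = 55) -> str:
    title = job["title"]
    if len(title) > title_width:
        # reserve three chars for the ellipsis
        title = title[: title_width - 3].rstrip() + "..."
    return " | ".join([
        f"[{idx}] {job['agency']}",
        format_grade(job),
        format_salary(job.get("salary_min"), job.get("salary_max")),
        job["location"],
        title,
    ])
```

the url is deliberately absent from the label, so it only appears in the rich table and the org export.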

### questionary defaults/keys:

- arrows navigate
- j/k navigate
- ctrl-n/ctrl-p navigate
- space toggles mark/unmark
- enter confirms/export
- ctrl-c cancels
- prompt instruction should say:
  "space=mark/unmark, enter=export, ctrl-c=cancel"

### export naming:

- if `--out` is absent, create a new timestamped file:
  `exports/usajobs_<location-slug>_<filters-slug>_<yyyymmdd-hhmm>.org`
- example:
  `exports/usajobs_washington-dc_2210-0340_gs15_salary150_20260518-1412.org`
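
a possible `make_output_path()` that reproduces the example filename. the makeup of the filters slug (series joined by `-`, then `gs<grade>`, then `salary<n>`) is inferred from that single example, and `slugify()` is a hypothetical helper, so treat both as assumptions.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase and collapse anything non-alphanumeric into single dashes."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_output_path(location, series, grade_min, grade_max, salary_min,
                     out_dir="exports", now=None) -> Path:
    now = now or datetime.now()
    # "gs" prefix copied from the example name; a grade range becomes gs14-15
    grade = f"gs{grade_min}" if grade_min == grade_max else f"gs{grade_min}-{grade_max}"
    parts = [
        "usajobs",
        slugify(location),
        "-".join(series),
        grade,
        f"salary{salary_min}",
        now.strftime("%Y%m%d-%H%M"),
    ]
    return Path(out_dir) / ("_".join(parts) + ".org")
```

taking `now` as an argument keeps the timestamp deterministic in tests.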

### org output format:

```
** <shortened job title> [[url][link]]
:properties:
:agency: <agency>
:grade: <pay plan> <low>-<high>
:close_date: <date>
:end:

salary: <salary range>
location: <location>
travel: <travel percentage or unknown>
clearance: <clearance/security text or unknown>

*** posting
<raw posting text>
```
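
a sketch of rendering one job into the format above. `format_org_entry()` and `shorten()` are hypothetical helpers that `export_org()` could call per job; the salary formatting with thousands separators is an assumption, and the sketch expects both salary bounds to be ints.

```python
def shorten(text: str, width: int = 80) -> str:
    """Cap text at `width` chars, ending in an ellipsis when cut."""
    if len(text) <= width:
        return text
    return text[: width - 3].rstrip() + "..."


def format_org_entry(job: dict) -> str:
    lines = [
        f"** {shorten(job['title'])} [[{job['url']}][link]]",
        ":properties:",
        f":agency: {job['agency']}",
        f":grade: {job['pay_plan']} {job['low_grade']}-{job['high_grade']}",
        f":close_date: {job['close_date']}",
        ":end:",
        "",
        f"salary: ${job['salary_min']:,}-${job['salary_max']:,}",
        f"location: {job['location']}",
        # fall back to "unknown" when the field is missing or empty
        f"travel: {job.get('travel') or 'unknown'}",
        f"clearance: {job.get('clearance') or 'unknown'}",
        "",
        "*** posting",
        job["raw_posting_text"],
        "",
    ]
    return "\n".join(lines)
```

posting text that contains lines starting with `*` would be read by org as headings, so a real exporter may want to escape or indent those lines.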

## implementation notes:

- write clean functional code:
  - `build_params()`
  - `fetch_page()`
  - `fetch_all()`
  - `normalize_job()`
  - `passes_filters()`
  - `render_table()`
  - `compact_job_label()`
  - `choose_jobs()`
  - `export_org()`
  - `make_output_path()`
- normalize both official api shape and frontend-ish shape if present:
  - api jobs may use `MatchedObjectDescriptor`
  - details may be under `UserArea.Details`
- normalized job dict should include:
  - `document_id`
  - `title`
  - `agency`
  - `department`
  - `pay_plan`
  - `low_grade`
  - `high_grade`
  - `salary_min`
  - `salary_max`
  - `location`
  - `close_date`
  - `travel`
  - `clearance`
  - `clearance_text_match`
  - `url`
  - `raw_posting_text`
- raw posting text should combine:
  - title
  - summary
  - duties
  - requirements
  - qualifications
  - evaluations
  - other information
  - key requirements
- shortened job title should be max 80 chars
- compact selection row title should fit reasonably in 120 cols
- compact row should include idx, agency, grade, salary, location, title
- truncate title to about 55 chars in the picker
- avoid putting full url in questionary prompt; keep url in rich table and org output
- include helpful errors if env vars are missing
- never mutate cached raw results
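
a partial sketch of `normalize_job()` covering a subset of the keys. only `MatchedObjectDescriptor` and `UserArea.Details` come from these notes; every other field name (`MatchedObjectId`, `PositionTitle`, `PositionRemuneration`, `JobGrade`, `LowGrade`, and so on) is my recollection of the api response and should be verified against a cached response before relying on it.

```python
def _to_int(value):
    """Parse values like "167000.0" or "15"; return None if unparseable."""
    try:
        return int(float(value))
    except (TypeError, ValueError):
        return None


def normalize_job(item: dict) -> dict:
    # accept both the wrapped api item and a bare descriptor
    desc = item.get("MatchedObjectDescriptor", item)
    details = (desc.get("UserArea") or {}).get("Details") or {}
    pay = (desc.get("PositionRemuneration") or [{}])[0]
    grade = (desc.get("JobGrade") or [{}])[0]
    # build a fresh dict so the cached raw result is never mutated
    return {
        "document_id": item.get("MatchedObjectId") or desc.get("PositionID"),
        "title": desc.get("PositionTitle"),
        "agency": desc.get("OrganizationName"),
        "department": desc.get("DepartmentName"),
        "pay_plan": grade.get("Code"),
        "low_grade": _to_int(details.get("LowGrade")),
        "high_grade": _to_int(details.get("HighGrade")),
        "salary_min": _to_int(pay.get("MinimumRange")),
        "salary_max": _to_int(pay.get("MaximumRange")),
        "location": desc.get("PositionLocationDisplay"),
        "close_date": desc.get("ApplicationCloseDate"),
        "url": desc.get("PositionURI"),
    }
```

missing fields come back as `None` rather than raising, which leaves the decision about incomplete jobs to `passes_filters()`.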

### selection implementation hint:

```python
import questionary
from questionary import Choice


def choose_jobs(jobs: list[dict], select_all: bool = False) -> list[dict]:
    # key jobs by stable document id so selection never depends on table order
    by_id = {job["document_id"]: job for job in jobs}

    choices = [
        Choice(
            title=compact_job_label(job, idx),
            value=job["document_id"],
            checked=select_all,
        )
        for idx, job in enumerate(jobs, start=1)
    ]

    selected_ids = questionary.checkbox(
        "mark jobs to export",
        choices=choices,
        instruction="space=mark/unmark, enter=export, ctrl-c=cancel",
        use_jk_keys=True,
        use_emacs_keys=True,
    ).ask()

    # .ask() returns None on ctrl-c and [] when nothing is checked
    if not selected_ids:
        return []

    return [by_id[job_id] for job_id in selected_ids]
```