JobWorkFlow is a local-first, self-hosted job search operations system built around an MCP server. It keeps your job pipeline in your own SQLite database and local files, and exposes deterministic tools for agents.
- Ingest jobs from JobSpy into SQLite (`data/capture/jobs.db`)
- Read and triage new jobs in batches
- Persist status transitions atomically
- Generate tracker notes for shortlisted jobs
- Enforce resume artifact guardrails before marking completion
- Finalize completion with DB audit fields and tracker sync
- SSOT: database status in SQLite
- Projection: tracker markdown files for Obsidian workflows
- Execution: MCP tools
- Policy: agent prompts/skills
This means decisions and automation should be driven by DB status; trackers are synchronized views.
- Ingest jobs (MCP):
  - Use `scrape_jobs` (scrapes + normalizes + inserts with idempotent dedupe)
  - Result: jobs inserted into `data/capture/jobs.db` with `status='new'`
- Read the queue via `bulk_read_new_jobs`.
- Triage and write status via `bulk_update_job_status`.
- Create tracker/workspace scaffolding via `initialize_shortlist_trackers` for `status=shortlist`.
- Run `career_tailor` to batch-generate tailoring artifacts (`ai_context.md`, `resume.tex`, `resume.pdf`) for shortlist trackers.
- Commit completion via `finalize_resume_batch`:
  - DB -> `resume_written` + audit fields
  - tracker frontmatter -> `Resume Written`
  - on sync failure -> fallback DB status `reviewed` with `last_error`
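A minimal sketch of one end-to-end pass over these tools, assuming a generic `call_tool(name, payload)` MCP client helper; payload keys follow the tool arguments documented below, but the intermediate response keys (`jobs`, `id`, `items`) are illustrative only. See `mcp-server-python/README.md` for the actual contracts.

```python
def call_tool(name: str, payload: dict) -> dict:
    """Placeholder for an MCP tool call; wire this to your agent or MCP client."""
    raise NotImplementedError


def run_pipeline_pass() -> None:
    # 1. Ingest: scrape + normalize + insert; new rows land with status='new'
    call_tool("scrape_jobs", {"terms": ["backend engineer"], "results_wanted": 20})

    # 2. Read a deterministic batch from the 'new' queue
    batch = call_tool("bulk_read_new_jobs", {"limit": 25})

    # 3. Triage each job and persist statuses atomically
    updates = [{"job_id": job["id"], "status": "shortlist"} for job in batch["jobs"]]
    call_tool("bulk_update_job_status", {"updates": updates})

    # 4. Scaffold trackers/workspaces for the shortlist
    trackers = call_tool("initialize_shortlist_trackers", {"limit": 25})

    # 5. Generate ai_context.md / resume.tex / resume.pdf per tracker
    tailored = call_tool("career_tailor", {"items": trackers["items"]})

    # 6. Commit completion: DB -> resume_written, tracker -> Resume Written
    call_tool("finalize_resume_batch", {"items": tailored["successful_items"]})
```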
- Implemented:
  - `scrape_jobs` (ingestion: scrape + normalize + insert with dedupe)
  - `bulk_read_new_jobs`
  - `bulk_update_job_status`
  - `initialize_shortlist_trackers`
  - `career_tailor`
  - `update_tracker_status`
  - `finalize_resume_batch`
DB statuses: `new`, `shortlist`, `reviewed`, `reject`, `resume_written`, `applied`
Typical transitions:
- `new -> shortlist | reviewed | reject`
- `shortlist -> resume_written`
- `shortlist -> reviewed` (failure/retry path)
- `resume_written -> applied`
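These transitions read as a small state machine; the sketch below restates them as data for quick agent-side validation. The server enforces its own policy, so treat this as illustrative.

```python
# Allowed DB status transitions, mirroring the list above.
ALLOWED_TRANSITIONS = {
    "new": {"shortlist", "reviewed", "reject"},
    "shortlist": {"resume_written", "reviewed"},  # reviewed = failure/retry path
    "resume_written": {"applied"},
}


def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED_TRANSITIONS.get(current, set())


assert can_transition("new", "shortlist")
assert not can_transition("new", "applied")
```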
Tracker statuses (frontmatter):
- Active flow: `Reviewed -> Resume Written -> Applied`
- Terminal outcomes include `Rejected`, `Ghosted`, `Interview`, `Offer`
```
JobWorkFlow/
├── mcp-server-python/           # MCP server implementation
│   ├── server.py                # FastMCP entrypoint
│   ├── tools/                   # Tool handlers
│   ├── db/                      # SQLite read/write layers
│   ├── utils/                   # Validation, parser, sync, file ops
│   ├── models/                  # Error and schema models
│   └── tests/                   # Test suite
├── skills/                      # Project-owned Codex skills
│   ├── job-pipeline-intake/     # scrape/read/triage/status/tracker init policy
│   └── career-tailor-finalize/  # tailoring + artifact guardrails + finalize policy
├── scripts/                     # Operational helper scripts
├── data/                        # Local data (DB, templates, artifacts)
├── trackers/                    # Tracker markdown notes
├── .kiro/specs/                 # Feature specs and task breakdowns
└── README.md
```
- Python 3.11+
- uv for dependency and task execution
- SQLite
- Optional: LaTeX toolchain (`pdflatex`) for resume compilation
Install dependencies:

```
uv sync --all-groups
```

Use the `scrape_jobs` MCP tool for integrated scrape + ingest:
```
# Example MCP call (via agent or client)
scrape_jobs({
  "terms": ["backend engineer", "machine learning engineer"],
  "location": "Ontario, Canada",
  "results_wanted": 20,
  "hours_old": 2,
  "save_capture_json": true
})
```

From repo root:

```
./scripts/run_mcp_server.sh
```

Or directly:

```
cd mcp-server-python
./start_server.sh
```

Environment variables:
- `JOBWORKFLOW_ROOT`: base root for data path resolution
- `JOBWORKFLOW_DB`: explicit DB path override
- `JOBWORKFLOW_LOG_LEVEL`: `DEBUG|INFO|WARNING|ERROR`
- `JOBWORKFLOW_LOG_FILE`: optional file log path
- `JOBWORKFLOW_SERVER_NAME`: MCP server name (default `jobworkflow-mcp-server`)
Example:
```
export JOBWORKFLOW_DB=data/capture/jobs.db
export JOBWORKFLOW_LOG_LEVEL=DEBUG
export JOBWORKFLOW_LOG_FILE=logs/mcp-server.log
./scripts/run_mcp_server.sh
```

See `.env.example` and `mcp-server-python/mcp-config-example.json` for templates.
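As a rough sketch of how these overrides might be consumed (the server's actual resolution order and defaults may differ), the variables map to configuration along these lines:

```python
import os
from pathlib import Path

# Defaults below are assumptions, except the server name documented above.
root = Path(os.environ.get("JOBWORKFLOW_ROOT", "."))
db_path = Path(os.environ.get("JOBWORKFLOW_DB", root / "data" / "capture" / "jobs.db"))
log_level = os.environ.get("JOBWORKFLOW_LOG_LEVEL", "INFO")
log_file = os.environ.get("JOBWORKFLOW_LOG_FILE")  # optional file sink
server_name = os.environ.get("JOBWORKFLOW_SERVER_NAME", "jobworkflow-mcp-server")
```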
**`scrape_jobs`**
- Purpose: ingest jobs from external sources (JobSpy-backed) into SQLite
- Key args: `terms`, `location`, `sites`, `results_wanted`, `hours_old`, `db_path`, `dry_run`
- Behavior: scrapes sources, normalizes records, inserts with idempotent dedupe by URL, returns structured run metrics
- Boundary: ingestion only (inserts `status='new'`; no triage/tracker/finalize side effects)
**`bulk_read_new_jobs`**
- Purpose: read `status='new'` jobs in deterministic batches
- Key args: `limit`, `cursor`, `db_path`
- Behavior: read-only, cursor pagination
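A hypothetical way to drain the whole `new` queue using the documented `limit`/`cursor` args; the response keys (`jobs`, `next_cursor`) are assumptions, so adjust them to the server's actual schema.

```python
def read_all_new_jobs(call_tool, batch_size: int = 25) -> list[dict]:
    """Page through status='new' jobs until the cursor is exhausted."""
    jobs: list[dict] = []
    cursor = None
    while True:
        payload: dict = {"limit": batch_size}
        if cursor is not None:
            payload["cursor"] = cursor
        result = call_tool("bulk_read_new_jobs", payload)
        jobs.extend(result.get("jobs", []))
        cursor = result.get("next_cursor")
        if not cursor:
            break
    return jobs
```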
**`bulk_update_job_status`**
- Purpose: atomic batch status updates
- Key args: `updates[]`, `db_path`
- Behavior: validates IDs/statuses, all-or-nothing write
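An illustrative payload for one atomic triage write; the per-update field names beyond the documented `updates[]` are assumptions, so check the server README for the exact schema.

```python
# One all-or-nothing batch: either every row is updated or none are.
triage_payload = {
    "updates": [
        {"job_id": 101, "status": "shortlist"},
        {"job_id": 102, "status": "reject"},
        {"job_id": 103, "status": "reviewed"},
    ],
}
```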
**`initialize_shortlist_trackers`**
- Purpose: create trackers from shortlisted jobs
- Key args: `limit`, `db_path`, `trackers_dir`, `force`, `dry_run`
- Behavior: idempotent by default, deterministic filenames, atomic file writes, compatibility dedupe by `reference_link` to avoid legacy duplicate trackers
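A typical call pair under the documented args: preview with `dry_run` first, then write for real (idempotent unless `force` is set). The values here are illustrative.

```python
# Preview which trackers would be created, then perform the real write.
preview_payload = {"limit": 10, "trackers_dir": "trackers", "dry_run": True}
write_payload = {"limit": 10, "trackers_dir": "trackers", "force": False}
```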
**`career_tailor`**
- Purpose: batch full-tailor tracker items into resume artifacts
- Key args: `items[]`, `force`, `full_resume_path`, `resume_template_path`, `applications_dir`, `pdflatex_cmd`
- Behavior: per item, runs tracker parse + workspace bootstrap + `ai_context.md` regeneration + LaTeX compile; returns `successful_items` for downstream `finalize_resume_batch`
- Boundary: artifact-focused only; no DB status writes and no tracker status writes
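A sketch of calling `career_tailor` for a set of trackers, assuming a generic `call_tool` MCP client helper and `tracker_path` as the item key (the real item schema may differ):

```python
def tailor_trackers(call_tool, tracker_paths: list[str]) -> list[dict]:
    """Run the batch tailor step and return only the items that compiled cleanly."""
    result = call_tool("career_tailor", {
        "items": [{"tracker_path": path} for path in tracker_paths],
        "force": False,
    })
    # successful_items is the documented handoff to finalize_resume_batch.
    return result["successful_items"]
```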
**`update_tracker_status`**
- Purpose: update tracker frontmatter status safely
- Key args: `tracker_path`, `target_status`, `dry_run`, `force`
- Behavior: transition policy + Resume Written guardrails
**`finalize_resume_batch`**
- Purpose: commit completion state after resume compile succeeds
- Key args: `items[]`, `run_id`, `db_path`, `dry_run`
- Behavior:
  - validates tracker/artifacts/placeholders per item
  - writes DB completion fields
  - syncs tracker status
  - falls back to `reviewed` with `last_error` on sync failure
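Completing the handoff from `career_tailor`, again with a hypothetical `call_tool` helper: a `dry_run` pass first, then the real commit. Treating `run_id` as an arbitrary correlation label is an assumption.

```python
def finalize(call_tool, successful_items: list[dict], run_id: str) -> dict:
    """Validate via dry_run, then commit completion state for the batch."""
    call_tool("finalize_resume_batch",
              {"items": successful_items, "run_id": run_id, "dry_run": True})
    return call_tool("finalize_resume_batch",
                     {"items": successful_items, "run_id": run_id})
```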
For full contracts and examples, see `mcp-server-python/README.md`.
Use a single end-to-end execution prompt from:
- `docs/pipeline-prompt.md` (versioned, copy-paste ready full workflow prompt)
Project skills live in this repo under:
- `skills/job-pipeline-intake/SKILL.md`
- `skills/career-tailor-finalize/SKILL.md`
Recommended runtime setup is to expose these repo skills through your Codex skills directory:
```
ln -s /Users/nd/Developer/JobWorkFlow/skills/job-pipeline-intake /Users/nd/.codex/skills/job-pipeline-intake
ln -s /Users/nd/Developer/JobWorkFlow/skills/career-tailor-finalize /Users/nd/.codex/skills/career-tailor-finalize
```

This keeps skills versioned in Git while making them available as first-class skills in Codex.
```
uv run pytest -q
uv run ruff check .
uv run ruff format . --check
uv run pre-commit run --all-files
```

CI workflow mirrors these checks: `.github/workflows/ci.yml`.
- Server docs: `mcp-server-python/README.md`
- Deployment: `mcp-server-python/DEPLOYMENT.md`
- Quickstart: `mcp-server-python/QUICKSTART.md`
- Pipeline prompt: `docs/pipeline-prompt.md`
- Testing notes: `TESTING.md`
- Specs: `.kiro/specs/`
- DB not found:
  - verify `JOBWORKFLOW_DB` and file existence
  - ensure `scrape_jobs` ingestion ran successfully
- Tool returns validation errors:
- check request payload shape and allowed status values
- Resume Written blocked:
  - confirm `resume.pdf` exists and is non-zero
  - confirm `resume.tex` exists and placeholder tokens are fully replaced
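If the guardrail keeps rejecting a workspace, a local pre-check along these lines can narrow it down before retrying `finalize_resume_batch`; the placeholder token names are hypothetical and should match whatever tokens your resume template actually uses.

```python
from pathlib import Path

PLACEHOLDER_TOKENS = ("<<COMPANY>>", "<<ROLE>>")  # hypothetical template tokens


def resume_artifacts_ready(workspace: Path) -> bool:
    """Mirror the guardrails above: non-empty resume.pdf, resume.tex with no leftover placeholders."""
    pdf = workspace / "resume.pdf"
    tex = workspace / "resume.tex"
    if not pdf.is_file() or pdf.stat().st_size == 0:
        return False
    if not tex.is_file():
        return False
    text = tex.read_text(encoding="utf-8", errors="ignore")
    return not any(token in text for token in PLACEHOLDER_TOKENS)
```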