diff --git a/.agent/workflows/README.md b/.agent/workflows/README.md new file mode 100644 index 0000000..9b18279 --- /dev/null +++ b/.agent/workflows/README.md @@ -0,0 +1,34 @@ +# Job Workflow Orchestration + +This directory contains workflow definitions for the job application pipeline. + +## Entrypoint + +Use the merged workflow as the primary entrypoint: + +- `intake-pipeline.md` (end-to-end: intake + tailor + finalize) + +Legacy file: + +- `tailor-finalize.md` (deprecated shim; points to `intake-pipeline.md`) + +## Skills Used + +- `job-matching-expertise` (classification stage) +- `resume-crafting-expertise` (resume bullet stage) + +## Tools Used + +- `scrape_jobs` +- `bulk_read_new_jobs` +- `bulk_update_job_status` +- `initialize_shortlist_trackers` +- `career_tailor` +- `finalize_resume_batch` + +## Quick Start + +```bash +# Open and execute the merged workflow +/intake-pipeline +``` diff --git a/.agent/workflows/intake-pipeline.md b/.agent/workflows/intake-pipeline.md new file mode 100644 index 0000000..02393a7 --- /dev/null +++ b/.agent/workflows/intake-pipeline.md @@ -0,0 +1,203 @@ +--- +description: "End-to-end job pipeline: intake + tailor + finalize" +--- + +# Workflow: End-to-End Job Pipeline + +## Goal + +Run the full pipeline in one document: + +1. Scrape jobs +2. Read new jobs +3. Classify jobs +4. Update job statuses +5. Initialize shortlist trackers +6. Collect shortlist trackers +7. Bootstrap resume workspace +8. Fill resume bullets +9. Compile PDFs +10. 
Finalize DB + tracker status + +--- + +## Prerequisites + +- MCP server running (`JOBWORKFLOW_DB`/`JOBWORKFLOW_ROOT` configured) +- Full resume exists (`JOBWORKFLOW_FULL_RESUME_PATH`) +- Resume template exists (`JOBWORKFLOW_RESUME_TEMPLATE_PATH`) +- Trackers dir configured (`JOBWORKFLOW_TRACKERS_DIR`, default `trackers/`) +- Skills available: + - `job-matching-expertise` (Step 3) + - `resume-crafting-expertise` (Step 8) + +--- + +## Stage A: Intake + +### Step 1: Scrape Jobs + +**MCP Tool**: `scrape_jobs` + +```python +scrape_jobs(dry_run=False) +``` + +**SUCCESS CRITERIA**: `inserted > 0` OR `duplicate > 0` + +### Step 2: Read New Jobs Queue + +**MCP Tool**: `bulk_read_new_jobs` + +```python +result_read = bulk_read_new_jobs() +jobs = result_read["jobs"] +``` + +**SUCCESS CRITERIA**: `len(jobs) > 0` + +### Step 3: Classify Jobs + +**Reference Skill**: `job-matching-expertise` +**Mandatory**: load and apply this skill before classification. + +For each job, produce `classification` in `shortlist | reviewed | reject`. 
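The classification pass above can be sketched as follows. The `classify` helper is hypothetical: the real rubric comes from the `job-matching-expertise` skill, and the actual decision is made by the agent, not by keyword matching.

```python
# Hypothetical sketch of the Step 3 classification pass. classify() is a
# stand-in for the skill's rubric, not a real tool or API.
def classify(job):
    """Return a (classification, reason) pair for one job record."""
    title = (job.get("title") or "").lower()
    description = (job.get("description") or "").lower()
    if "machine learning" in description:
        return ("shortlist", "matches ML focus")
    if "engineer" in title:
        return ("reviewed", "possible fit, needs closer review")
    return ("reject", "no overlap with target roles")

jobs = [
    {"id": 1, "title": "ML Engineer", "description": "machine learning platform work"},
    {"id": 2, "title": "Accountant", "description": "ledger reconciliation"},
]
# classifications lines up index-by-index with jobs, as Step 4's zip() expects
classifications = [classify(job) for job in jobs]
```

The output shape matters more than the heuristic: Step 4 consumes `classifications` positionally alongside `jobs`.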
+ +### Step 4: Update Job Statuses + +**MCP Tool**: `bulk_update_job_status` + +```python +updates = [ + {"id": job["id"], "status": classification} + for job, (classification, reason) in zip(jobs, classifications) +] + +bulk_update_job_status(updates=updates) +``` + +**SUCCESS CRITERIA**: `updated_count > 0` + +### Step 5: Initialize Shortlist Trackers + +**MCP Tool**: `initialize_shortlist_trackers` + +```python +initialize_shortlist_trackers(force=False, dry_run=False) +``` + +**SUCCESS CRITERIA**: `created_count > 0` OR `skipped_count > 0` + +--- + +## Stage B: Tailor and Finalize + +```python +# Important: these two passes must use different force settings +bootstrap_force = True +compile_force = False +``` + +### Step 6: Collect Shortlist Trackers + +```bash +trackers_dir="${JOBWORKFLOW_TRACKERS_DIR:-trackers}" +trackers=$(find "$trackers_dir" -name "*.md" -type f -print0 | \ + xargs -0 grep -Eil "status:[[:space:]]*(shortlist|reviewed)" | \ + head -10) + +items=$(echo "$trackers" | jq -R -s -c 'split("\n")[:-1] | map({tracker_path: .})') +``` + +**SUCCESS CRITERIA**: `len(items) > 0` + +### Step 7: Bootstrap Resume Workspace + +**MCP Tool**: `career_tailor` + +```python +result_bootstrap = career_tailor(items=items, force=bootstrap_force) +``` + +**SUCCESS CRITERIA**: each item has `resume_tex_path` and `ai_context_path` for Step 8 editing. + +### Step 8: Fill Resume Bullets + +**Reference Skill**: `resume-crafting-expertise` +**Mandatory**: load and apply this skill before editing any `resume.tex`. + +For each successful item from Step 7: + +1. Open `resume_tex_path` +2. Use `ai_context_path` + full resume facts +3. Replace tokens matching `*-BULLET-POINT-*` +4. Keep LaTeX structure unchanged + +Hard guardrails (must pass): + +1. Section-scoped grounding (no cross-section fact drift): + - `Project Experience` bullets: only from project facts in `full_resume.md` / `ai_context.md`. + - `Qishu Data ... 
Machine Learning Engineer Intern` bullets: only from internship facts. + - `University of Waterloo ... Researcher (...)` bullets: only from Waterloo research facts. +2. No fabricated claims: every bullet must be traceable to `ai_context.md`. +3. No duplicate bullets in one resume: exact duplicate `\resumeItem{...}` text is forbidden. +4. If any guardrail fails for a resume, do not advance that resume to Step 9. + +Validation: + +```bash +find data/applications -name "resume.tex" | xargs grep -n "BULLET-POINT" +``` + +Duplicate check: + +```bash +for f in data/applications/*/resume/resume.tex; do + dups=$(rg -o '\\resumeItem\{[^}]*\}' "$f" | sort | uniq -d) + if [ -n "$dups" ]; then + echo "Duplicate bullets in $f" + echo "$dups" + fi +done +``` + +**SUCCESS CRITERIA**: + +- no `BULLET-POINT` matches +- duplicate check returns empty +- all edited bullets are section-consistent with `ai_context.md` + +### Step 9: Compile PDFs + +**MCP Tool**: `career_tailor` (second pass) + +```python +result_compile = career_tailor(items=items, force=compile_force) +``` + +Use only `result_compile["successful_items"]` in the next step.
+ +**SUCCESS CRITERIA**: `result_compile["success_count"] > 0` + +### Step 10: Finalize Database and Trackers + +**MCP Tool**: `finalize_resume_batch` + +```python +result_finalize = finalize_resume_batch( + items=result_compile["successful_items"], + dry_run=False, +) +``` + +**SUCCESS CRITERIA**: `result_finalize["success_count"] > 0` + +--- + +## Workflow Completion Checklist + +- [ ] Step 1-5 intake completed +- [ ] Step 6-10 tailor/finalize completed +- [ ] All placeholders removed +- [ ] PDFs compiled successfully +- [ ] Finalization committed to DB + tracker diff --git a/.gitignore b/.gitignore index 98b5726..ced9f03 100644 --- a/.gitignore +++ b/.gitignore @@ -35,6 +35,9 @@ htmlcov/ .obsidian/workspace.json .obsidian/cache +# Local editor settings +.vscode/ + # macOS clutter .DS_Store @@ -66,4 +69,4 @@ data/templates/full_resume.md data/templates/resume_skeleton.tex trackers/*.md !trackers/template.md -!trackers/Job Application.md +!trackers/job-application.md diff --git a/.kiro/specs/refactor-job-status-enum/design.md b/.kiro/specs/refactor-job-status-enum/design.md new file mode 100644 index 0000000..68056a0 --- /dev/null +++ b/.kiro/specs/refactor-job-status-enum/design.md @@ -0,0 +1,114 @@ +# Design Document: Refactor Job Status to Enum + +## Overview + +This design outlines the migration from using hardcoded 'magic strings' for job statuses to centralized, type-safe Enums. + +This refactoring will significantly improve code clarity, reduce the risk of typo-related bugs, and enhance maintainability by creating a single source of truth for status definitions. + +## Scope + +**In scope:** + +* Refactoring all Python code (`.py` files) within the `mcp-server-python` directory to replace hardcoded status strings with Enum members. +* Defining two distinct Enums for the two status systems (database vs. tracker). +* Updating all associated tests to use the new Enums. + +**Out of scope:** + +* Changes to the database schema itself. 
+* Changes to the semantic meaning or lifecycle of existing statuses. +* Modifying any frontend or external client-side logic that consumes statuses. The API contract will continue to accept and return plain strings. + +## Current State Summary + +The codebase currently contains two separate and inconsistent systems for managing job statuses, both relying on hardcoded strings: + +1. **Database Statuses:** A set of lowercase strings (`new`, `shortlist`, `reviewed`, etc.) used in the `jobs` database and related data access logic. + +2. **Tracker Statuses:** A set of capitalized strings (`Reviewed`, `Resume Written`, etc.) used in the frontmatter of Markdown tracker files and the business logic that governs them. + +This duplication and inconsistency are spread across numerous files, including `utils/validation.py`, `db/*.py`, `utils/tracker_policy.py`, and the entire `tests/` suite, making the code brittle and difficult to maintain. + +## Target Architecture + +### 1) Centralized Enum Definitions + +A new file will be introduced to act as the single source of truth for all status definitions: + +* **File:** `mcp-server-python/models/status.py` + +This file will contain two distinct Enum classes: + +```python +from enum import Enum + +class JobDbStatus(str, Enum): + """Enum for statuses used in the 'jobs' database table.""" + NEW = "new" + SHORTLIST = "shortlist" + REVIEWED = "reviewed" + REJECT = "reject" + RESUME_WRITTEN = "resume_written" + APPLIED = "applied" + +class JobTrackerStatus(str, Enum): + """Enum for statuses used in the frontmatter of Markdown tracker files.""" + REVIEWED = "Reviewed" + RESUME_WRITTEN = "Resume Written" + APPLIED = "Applied" + INTERVIEW = "Interview" + OFFER = "Offer" + REJECTED = "Rejected" + GHOSTED = "Ghosted" +```
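A quick sanity check of the `(str, Enum)` mixin behavior (the class is redefined in trimmed form here purely for illustration; the real definition lives in `models/status.py`):

```python
import json
from enum import Enum

# Trimmed-down redefinition for illustration only.
class JobDbStatus(str, Enum):
    NEW = "new"
    SHORTLIST = "shortlist"

# The str mixin makes members compare equal to their raw string values...
assert JobDbStatus.NEW == "new"
# ...and lets json.dumps emit them as plain strings with no custom encoder.
assert json.dumps({"status": JobDbStatus.NEW}) == '{"status": "new"}'
```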
+ +### 2) Refactoring Pattern + +All application logic will be updated to import and use these Enums. + +**Example (Before):** +```python +# in db/jobs_reader.py +def query_new_jobs(conn): + return conn.execute("SELECT * FROM jobs WHERE status = 'new'") +``` + +**Example (After):** +```python +# in db/jobs_reader.py +from models.status import JobDbStatus + +def query_new_jobs(conn): + return conn.execute("SELECT * FROM jobs WHERE status = ?", (JobDbStatus.NEW,)) +``` + +Pydantic models used at the API boundary will automatically handle the conversion between incoming strings and the internal Enum types, preserving the external contract. + +## Migration Phases + +### Phase 1: Foundation +- Create the `status.py` file and define the `JobDbStatus` and `JobTrackerStatus` Enums. + +### Phase 2: Core Logic +- Refactor `utils/validation.py` to use the new Enums, removing the hardcoded `ALLOWED_STATUSES` lists. +- Refactor `utils/tracker_policy.py` and the database layer (`db/*.py`). + +### Phase 3: Tools and API +- Refactor all scripts in the `tools/` directory. +- Update `server.py` and any related API documentation or examples. + +### Phase 4: Tests +- Systematically update the entire `tests/` suite to use the Enums for test setup and assertions. This is the largest phase. + +### Phase 5: Verification & Cleanup +- Perform a final, full-codebase search for any remaining hardcoded strings to ensure none were missed. +- Ensure all tests pass and the application functions correctly. + +## Risks and Mitigations + +1. **Risk:** A hardcoded status string is missed during refactoring. + * **Mitigation:** The final verification phase (Phase 5) involves using `grep` or a similar search tool to comprehensively scan for remaining instances. A passing test suite is the primary gate. + +2. **Risk:** Confusion between `JobDbStatus` and `JobTrackerStatus` during development. + * **Mitigation:** The clear and explicit naming of the Enums is designed to prevent this. 
Code reviews should pay special attention to ensuring the correct Enum is used in the correct context. diff --git a/.kiro/specs/refactor-job-status-enum/requirements.md b/.kiro/specs/refactor-job-status-enum/requirements.md new file mode 100644 index 0000000..dd20eaa --- /dev/null +++ b/.kiro/specs/refactor-job-status-enum/requirements.md @@ -0,0 +1,47 @@ +# Requirements Document: Refactor Job Status to Enum + +## Introduction + +This specification defines the requirements for refactoring hardcoded job status strings into type-safe, centralized Enum definitions. The goal is to improve code quality, maintainability, and robustness without altering the application's external behavior. + +## Glossary + +- **Enum**: An enumeration; a set of symbolic names bound to unique, constant values. +- **Magic String**: A hardcoded string literal used in application logic without a clear explanation or centralized definition. +- **Behavior Parity**: The principle that the application's observable behavior must remain identical before and after the refactoring. +- **Single Source of Truth**: The practice of structuring information models so that every data element is stored exactly once. + +## Requirements + +### Requirement 1: Centralized and Type-Safe Status Definitions + +**User Story:** As a developer, I want job statuses defined as Enums in a central location, so that I have a single source of truth and can avoid magic strings. + +#### Acceptance Criteria + +1. THE system SHALL define a `JobDbStatus` Enum containing all valid statuses for the database. +2. THE system SHALL define a `JobTrackerStatus` Enum containing all valid statuses for Markdown tracker files. +3. THESE Enums SHALL be located in a single new file at `mcp-server-python/models/status.py`. +4. THE `ALLOWED_STATUSES` and `ALLOWED_TRACKER_STATUSES` lists in `utils/validation.py` SHALL be removed and their logic replaced by the Enums. 
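The replacement described in criterion 4 might look like the following sketch. This is illustrative only: the real Enums live in `models/status.py`, and the actual `validate_status` contract in `utils/validation.py` may differ.

```python
from enum import Enum

# Illustrative redefinition; the real class lives in models/status.py.
class JobDbStatus(str, Enum):
    NEW = "new"
    SHORTLIST = "shortlist"
    REVIEWED = "reviewed"
    REJECT = "reject"
    RESUME_WRITTEN = "resume_written"
    APPLIED = "applied"

# An Enum-derived set replaces the removed hardcoded ALLOWED_STATUSES list.
ALLOWED_DB_STATUSES = {member.value for member in JobDbStatus}

def validate_status(status: str) -> bool:
    return status in ALLOWED_DB_STATUSES
```

Deriving the set from the Enum means adding a new status requires touching exactly one file.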
+ +### Requirement 2: Comprehensive Codebase Refactoring + +**User Story:** As a maintainer, I want all hardcoded status strings in the Python code replaced with Enum members, so that the code is more readable and less prone to bugs from typos. + +#### Acceptance Criteria + +1. THE system SHALL use the `JobDbStatus` Enum in all database-related logic, including queries and updates within the `db/` directory. +2. THE system SHALL use the `JobTrackerStatus` Enum in all tracker-related business logic, primarily `utils/tracker_policy.py`. +3. THE system SHALL update all tool scripts in the `tools/` directory to use the appropriate Enums. +4. THE system SHALL update all tests in the `tests/` directory to use the Enums for data setup and assertions. +5. A codebase search for the old hardcoded status strings (e.g., `'shortlist'`, `'Reviewed'`) within `.py` files SHALL yield no results in application logic after the refactoring is complete. + +### Requirement 3: Behavior and API Contract Parity + +**User Story:** As an operator and API consumer, I want the application's behavior and API contract to be unchanged after the refactoring, so that existing workflows and clients are not broken. + +#### Acceptance Criteria + +1. WHEN a tool is invoked via the API, ITS observable behavior and final output concerning job statuses SHALL be identical to the pre-refactor behavior. +2. THE system's external API contract SHALL remain the same. Specifically, API endpoints will continue to accept and return statuses as plain strings. Pydantic or a similar layer will handle the conversion to/from internal Enums. +3. ALL existing automated tests (`uv run pytest`) SHALL pass after the refactoring is complete. 
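The round trip required by criterion 2 can be illustrated with the Enum alone; a validation layer such as Pydantic performs the same string-to-Enum coercion automatically. A minimal sketch (trimmed enum redefined for illustration):

```python
from enum import Enum

# Trimmed redefinition for illustration; see models/status.py for the real one.
class JobDbStatus(str, Enum):
    NEW = "new"
    SHORTLIST = "shortlist"

# A plain string arriving at the API boundary coerces to the Enum member...
incoming = "shortlist"
status = JobDbStatus(incoming)
assert status is JobDbStatus.SHORTLIST
# ...and the value handed back out is the identical plain string.
assert status.value == incoming
```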
diff --git a/.kiro/specs/refactor-job-status-enum/tasks.md b/.kiro/specs/refactor-job-status-enum/tasks.md new file mode 100644 index 0000000..c7e4b43 --- /dev/null +++ b/.kiro/specs/refactor-job-status-enum/tasks.md @@ -0,0 +1,63 @@ +# Implementation Plan: Refactor Job Status to Enum + +## Overview + +This plan provides a detailed checklist for migrating from hardcoded status strings to centralized Enums, based on the associated design and requirements documents. + +## Tasks + +- [x] **1. Foundation** + - [x] 1.1. Create the new file `mcp-server-python/models/status.py`. + - [x] 1.2. In `status.py`, define the `JobDbStatus(str, Enum)` with all lowercase database statuses. + - [x] 1.3. In `status.py`, define the `JobTrackerStatus(str, Enum)` with all capitalized tracker file statuses. + - _Requirements: 1.1, 1.2, 1.3_ + +- [x] **2. Refactor Core Definitions & Logic** + - [x] 2.1. Refactor `mcp-server-python/utils/validation.py`: + - [x] 2.1.1. Import the new Enums. + - [x] 2.1.2. Remove the `ALLOWED_STATUSES` and `ALLOWED_TRACKER_STATUSES` lists (replaced with Enum-derived sets). + - [x] 2.1.3. Update the `validate_status` and `validate_tracker_status` functions to validate against the Enums. + - [x] 2.2. Refactor `mcp-server-python/utils/tracker_policy.py` to use `JobTrackerStatus` members instead of strings for defining transitions and terminal states. + - _Requirements: 1.4, 2.2_ + +- [x] **3. Refactor Database Layer** + - [x] 3.1. Refactor `mcp-server-python/db/jobs_ingest_writer.py` to use `JobDbStatus.NEW` for the default status. + - [x] 3.2. Refactor `mcp-server-python/db/jobs_reader.py` to use `JobDbStatus` members in all SQL queries. + - [x] 3.3. Refactor `mcp-server-python/db/jobs_writer.py` to use `JobDbStatus` members in all SQL queries. + - _Requirements: 2.1_ + +- [x] **4. Refactor Tools and API Layer** + - [x] 4.1. Systematically go through each script in `mcp-server-python/tools/` and replace any hardcoded status strings with the appropriate Enum. 
+ - [x] 4.2. Review `mcp-server-python/server.py` and update any internal logic that references status strings (docstring examples left as human-readable strings per API contract parity). + - [x] 4.3. Refactor `mcp-server-python/utils/tracker_renderer.py` to use `JobTrackerStatus.REVIEWED` for initial tracker status. + - [x] 4.4. Refactor `mcp-server-python/utils/tracker_sync.py` to safely convert Enum members to plain strings before YAML serialization. + - [x] 4.5. Update `mcp-server-python/schemas/ingestion.py` to re-export `JobDbStatus` as `JobStatus` alias for backward compatibility. + - _Requirements: 2.3_ + +- [x] **5. Refactor Tests** + - [x] 5.1. Update `tests/test_validation.py` to assert against the new Enum-based logic. + - [x] 5.2. Update `tests/test_validation_scrape_jobs.py` to use Enum members. + - [x] 5.3. Update `tests/test_tracker_policy.py` to use `JobTrackerStatus` members. + - [x] 5.4. Update `tests/test_bulk_update_job_status.py` to use `JobDbStatus` members. + - [x] 5.5. Update `tests/test_bulk_read_new_jobs.py` to use `JobDbStatus` members. + - [x] 5.6. Update `tests/test_checkpoint_task4.py` to use `JobDbStatus` members. + - [x] 5.7. Update `tests/test_finalize_resume_batch.py` to use `JobDbStatus` and `JobTrackerStatus` members. + - [x] 5.8. Update `tests/test_initialize_shortlist_trackers_tool.py` to use `JobDbStatus` members. + - [x] 5.9. Update `tests/test_job_schema.py` to use `JobDbStatus` members. + - [x] 5.10. Update `tests/test_career_tailor.py` to use `JobDbStatus` members. + - [x] 5.11. Update `tests/test_bulk_update_job_status_schemas.py` to use `JobDbStatus` members. + - [x] 5.12. Update `tests/test_server_bulk_update_integration.py` to use `JobDbStatus` members. + - [x] 5.13. Update `tests/test_update_tracker_status_tool.py` to use `JobTrackerStatus` members. + - [x] 5.14. Update `tests/test_tracker_sync.py` to use `JobTrackerStatus` members. + - [x] 5.15. Update `tests/test_jobs_reader.py` to use `JobDbStatus` members. 
- [x] 5.16. Update `tests/test_jobs_writer.py` to use `JobDbStatus` members. + - [x] 5.17. Update `tests/test_jobs_ingest_writer.py` to use `JobDbStatus` members. + - _Note: This was the largest set of changes._ + - _Requirements: 2.4, 3.3_ + +- [x] **6. Verification and Cleanup** + - [x] 6.1. Run the entire test suite (`uv run pytest`) and confirm that all tests pass (1272 passed, 2 skipped). + - [x] 6.2. Run `uv run ruff format .` and `uv run ruff check . --fix` with no issues. + - [x] 6.3. Perform a final, full-codebase search (`grep`) for any remaining hardcoded status strings in `.py` files; confirmed only enum definitions and docstrings remain. + - [x] 6.4. Mark this task as complete. + - _Requirements: 2.5, 3.1, 3.3_ \ No newline at end of file diff --git a/MCP_TOOLS_HANDBOOK.md b/MCP_TOOLS_HANDBOOK.md new file mode 100644 index 0000000..298cc6f --- /dev/null +++ b/MCP_TOOLS_HANDBOOK.md @@ -0,0 +1,295 @@ +# JobWorkFlow MCP Tools Handbook + +This handbook details the functionality, execution logic, and data flow of the 7 core MCP tools in the JobWorkFlow project. Working together, these tools automate the entire pipeline from job scraping to resume submission. + +--- + +## Tool Flow Overview + +1. **Scrape** (`scrape_jobs`) → store in the database with `status='new'` +2. **Read** (`bulk_read_new_jobs`) → fetch the triage queue +3. **Triage** (`bulk_update_job_status`) → update statuses (`shortlist`/`reject`) +4. **Initialize** (`initialize_shortlist_trackers`) → generate Obsidian tracker files and workspaces +5. **Tailor** (`career_tailor`) → generate `ai_context.md` and compile `resume.pdf` +6. **Finalize** (`finalize_resume_batch`) → close out the pipeline: bulk-update database audit fields and sync tracker file statuses +7. **Maintain** (`update_tracker_status`) → optional step for manually adjusting or programmatically validating tracker file statuses + +--- + +## Skill and MCP Tool Collaboration Model + +This project uses a hybrid "brain + hands" orchestration architecture: + +- **Skills (domain knowledge)**: pure reasoning and knowledge specifications that perform no I/O. The Agent loads domain knowledge by reading `skills/xxx/SKILL.md`. +- **MCP tools (execution capability)**: responsible for actual execution: database reads and writes, file operations, API calls, and so on. + +How the two collaborate: +1. The Agent reads a Skill to obtain decision rules +2. The Agent performs reasoning and decision-making in memory +3.
The Agent calls MCP tools to persist decisions to the database or write files + +**The two core Skills in this project:** +- `job-matching-expertise`: evaluation criteria and decision framework for job triage +- `resume-crafting-expertise`: content strategy and quality rules for resume tailoring + +--- + +## Common Features + +### Preview Mode (dry_run) + +Most write-type tools support a `dry_run=true` parameter to preview results without performing any actual writes: +- `scrape_jobs`: preview the records to be inserted +- `initialize_shortlist_trackers`: preview the tracker files to be created +- `update_tracker_status`: preview status changes +- `finalize_resume_batch`: preview finalization results + +### Structured Error Responses + +All tools use a unified error response format: +```json +{ + "error": { + "code": "VALIDATION_ERROR | FILE_NOT_FOUND | DB_NOT_FOUND | INTERNAL_ERROR | COMPILE_ERROR", + "message": "Human-readable error description", + "retryable": true + } +} +``` + +| Error Code | Meaning | Retryable | +|--------|------|------------| +| `VALIDATION_ERROR` | Input parameter validation failed | No (fix the parameters) | +| `FILE_NOT_FOUND` | File does not exist | No (create the file) | +| `DB_NOT_FOUND` | Database file does not exist | No (initialize the DB) | +| `INTERNAL_ERROR` | Unexpected internal error | Yes | +| `COMPILE_ERROR` | LaTeX compilation failed | Depends | + +--- + +## 1. scrape_jobs (Job Scraping) + +**Workflow order**: Step 1 (entry point) + +Searches external sources (LinkedIn) for jobs, cleans the data, and inserts it with deduplication. + +**What it does**: +- Searches jobs by search terms (`terms`) and location (`location`). +- Automatically filters out invalid records that lack a URL or description. +- Performs **idempotent deduplication**: if a job URL already exists it is ignored; otherwise the record is inserted with default status `new`. +- Supports `dry_run` preview mode. +- Each search term is processed independently; a single failing term does not block the others (partial-success model). + +**Key data structures**: +- **Input**: + - `terms` (array, required): list of search keywords + - `location` (string): location filter + - `results_wanted` (integer): desired result count per term + - `hours_old` (integer): time window in hours + - `dry_run` (boolean): preview mode +- **Output**: + - `run_id`: batch run identifier + - `results`: detailed results per search term + - `totals`: aggregate statistics (`inserted_count`, `duplicate_count`, `fetched_count`, etc.) + +--- + +## 2. bulk_read_new_jobs (Bulk Read New Jobs) + +**Workflow order**: Step 2 + +Extracts jobs in `new` status from the SQLite database (read-only operation). + +**What it does**: +- Uses cursor-based (`cursor`) pagination to ensure retrieval efficiency and deterministic ordering on large datasets. +- Each request returns at most `limit` records plus pagination info. + +**Key data structures**: +- **Input**: + - `limit` (integer, default 50): page size (1-1000) + - `cursor` (string): pagination marker + - `db_path` (string): database path override +- **Output**: + - `jobs`: list of jobs (with full details) + - `count`: records in this page + - `next_cursor`: cursor for the next page + - `has_more`: whether more pages exist + +--- + +## 3. Job Triage & Status Update + +This step is the "brain" and "hands" working in tandem: + +### 3a.
Skill Execution: AI Classification (Triage) + +The Agent loads the `job-matching-expertise` Skill to score the newly read jobs. + +**What it does**: +- **In-memory reasoning**: the AI reads every job description in a single conversation and, per the Skill's Matching Rubric, decides whether each job is `shortlist`, `reviewed` (undecided), or `reject`. +- **Batch decisions**: the AI forms a complete "change list" in memory, holding the classification results and the reasoning behind them. + +### 3b. MCP Tool: bulk_update_job_status (Persist to DB) + +Writes the decisions from 3a to the database atomically in one pass. + +**What it does**: +- **Single-interaction execution**: the AI packs the list formed in 3a into a single JSON array and calls this tool to perform the update. +- **Atomicity guarantee**: uses an all-or-nothing mechanism to keep the database state (SSOT) absolutely consistent, and records the classification reasons (`notes`). + +**Key data structures**: +- **Input**: + - `updates` (array, required): list of update items, each with `id`, `status`, `notes` + - `db_path` (string): database path override +- **Output**: + - `updated_count`: number updated successfully + - `failed_count`: number failed + - `results`: detailed per-item results + +--- + +## 4. initialize_shortlist_trackers (Initialize Shortlist Trackers) + +**Workflow order**: Step 4 + +Generates the corresponding Markdown files and storage directories under `trackers/` for shortlisted jobs. + +**What it does**: +- Reads jobs with `status='shortlist'` from the database. +- Generates `.md` files following a specific naming convention (e.g. `YYYY-MM-DD-company-id.md`). +- Automatically creates the required parent directories and the related resume/cover-letter storage folders. +- **Deduplication**: detects existing trackers by `reference_link` (the job URL) to avoid creating duplicates. + +**Key data structures**: +- **Input**: + - `limit` (integer, default 50): number to process (1-200) + - `force` (boolean): overwrite mode + - `dry_run` (boolean): preview mode + - `trackers_dir` (string): tracker directory override +- **Output**: + - `created_count`: number created + - `skipped_count`: number skipped (already exist) + - `failed_count`: number failed + - `results`: detailed per-item results + +--- + +## 5. Resume Tailoring & Compilation + +This step is classic "hybrid orchestration": the tool and the Agent must **iterate** until success: + +### Interaction Model: Try → Fail → Fix → Retry + +``` +Agent calls career_tailor + ↓ +[if the tex contains placeholders] → returns VALIDATION_ERROR (item fails) + ↓ +Agent fills the placeholders using file-editing tools + ↓ +Agent calls career_tailor again + ↓ +[if no placeholders remain] → triggers pdflatex compile → success +``` + +### 5a. First Call: Bootstrap + Placeholder Detection + +Call `career_tailor` to create the application workspace. + +**What it does**: +- **Staging**: batch-creates directories, prepares the `.tex` template, and assembles the generated `ai_context.md`. +- **Placeholder interception**: the tool scans the tex file; if a placeholder pattern is detected it raises `VALIDATION_ERROR` and that item fails. + +**Known placeholder patterns**: +- `WORK-BULLET-POINT-*` +- `PROJECT-AI-*` +- `PROJECT-BE-*` +- `TODO: fill this in` +- `[Description goes here]` + +### 5b.
Agent-Side: Content Crafting + +Load the `resume-crafting-expertise` Skill and draft tailored content in memory. + +**What it does**: +- **Content translation**: the Agent reads `ai_context.md` and "translates" your raw experience into professional bullet points. +- **Source injection**: the Agent uses file-editing tools to write the drafted content back into `resume.tex`, and must **erase every placeholder**. + +### 5c. Second Call: Final Compile + +Call `career_tailor` again to render the PDF. + +**What it does**: +- **Gate scan**: the tool scans the `.tex` files again. If no placeholders remain, it triggers `pdflatex` for real. +- **Handoff**: on successful compilation, the generated PDF path enters the `successful_items` payload, which serves as the **input contract** for `finalize_resume_batch`. + +**Key data structures**: +- **Input**: + - `items` (array, required): batch items, each with `tracker_path` and an optional `job_db_id` + - `force` (boolean): whether to overwrite an existing resume.tex + - `full_resume_path`, `resume_template_path`, `applications_dir`, `pdflatex_cmd`: optional path overrides +- **Output**: + - `run_id`: batch run identifier + - `total_count`, `success_count`, `failed_count`: aggregate statistics + - `results`: detailed per-item results + - `successful_items`: list of successful items (consumed by finalize) + - `warnings`: list of non-fatal warnings + +--- + +## 6. finalize_resume_batch (Batch Finalization) + +**Workflow order**: Step 6 (commit to DB) + +Commits the tailoring results to the database and syncs tracker file statuses. + +**What it does**: +- **Cascading writes**: + 1. Updates the job's database status to `resume_written` and records `resume_pdf_path` + 2. Syncs the tracker file's frontmatter status accordingly +- **Error compensation**: if the tracker file sync fails, the database status is rolled back to `reviewed` and `last_error` is recorded. +- **Per-item continuation**: a single failed item does not block the other items. + +**Key data structures**: +- **Input**: + - `items` (array, required): each with `id`, `tracker_path`, and optional `resume_pdf_path` + - `run_id` (string): batch identifier (optional; auto-generated) + - `db_path` (string): database path override + - `dry_run` (boolean): preview mode +- **Output**: + - `run_id`: batch run identifier + - `finalized_count`: number finalized successfully + - `failed_count`: number failed + - `dry_run`: whether this was a preview + - `results`: detailed per-item results + +--- + +## 7.
update_tracker_status (Optional Maintenance: Fine-Tune Tracker Status) + +**Workflow order**: utility/auxiliary step + +Safely updates the frontmatter status of a Markdown tracker file. + +**What it does**: +- **Manual fine-tuning**: a low-frequency tool, mainly for manually adjusting a status (e.g. from "Resume Written" to "Interview"). +- **Transition policy**: a built-in state machine prevents illegal status transitions. It can be bypassed with `force=true` (which produces a warning). +- **Resume Written gate check**: when the status changes to `Resume Written`, a **physical validation** is enforced: + - checks that `resume.pdf` exists and is non-empty + - checks that `resume.tex` exists + - scans the tex for leftover placeholders + +**Key data structures**: +- **Input**: + - `tracker_path` (string, required): tracker file path + - `target_status` (string, required): target status + - `dry_run` (boolean): preview mode + - `force` (boolean): force-bypass the transition policy +- **Output**: + - `tracker_path`: the file operated on + - `previous_status`: the prior status + - `target_status`: the target status + - `action`: operation result (`updated`, `noop`, `would_update`, `blocked`) + - `success`: whether the update succeeded + - `guardrail_check_passed`: whether the gate check passed (Resume Written only) + - `warnings`: warning list diff --git a/docs/pipeline-prompt.md b/docs/pipeline-prompt.md deleted file mode 100644 index f2570b0..0000000 --- a/docs/pipeline-prompt.md +++ /dev/null @@ -1,63 +0,0 @@ -# JobWorkFlow Pipeline Prompt (v2) - -Use this prompt for one complete pipeline run with current implemented MCP tools. - -## Full Prompt - -```text -You are the JobWorkFlow pipeline execution agent. Run one complete workflow in repository root: -/Users/nd/Developer/JobWorkFlow - -Goal: -- Execute one end-to-end run from ingestion to tracker initialization to completion sync. -- Keep database status as SSOT. Trackers are projection only. - -Hard Rules: -1) Only use implemented MCP tools: - - scrape_jobs - - bulk_read_new_jobs - - bulk_update_job_status - - initialize_shortlist_trackers - - career_tailor - - update_tracker_status - - finalize_resume_batch -2) Never generate fake resume artifacts. If real resume files are missing or invalid, do not move to Resume Written / resume_written. -3) Continue when safe on partial failures and report per-step errors. -4) Use repo-root-relative paths only; do not write outside this repository.
-5) Use project skills as policy layers: - - Start intake phase with `job-pipeline-intake` - - Start artifact/finalize phase with `career-tailor-finalize` - - Report `skills_used` and `skills_skipped` in final output - -Execution Steps: -1) Run scrape_jobs with defaults to ingest fresh jobs into DB. -2) Run bulk_read_new_jobs(limit=50) to fetch the new queue. -3) Triage and prepare updates: - - If triage policy is provided, classify to shortlist/reviewed/reject. - - If policy is missing, run dry-run style recommendation output only and do not write statuses. -4) If triage decisions exist, call bulk_update_job_status once for atomic write. -5) Run initialize_shortlist_trackers(limit=50, force=false, dry_run=false). -6) Build `career_tailor` batch input from shortlist trackers and run it once. -7) Use `career_tailor.successful_items` as input to finalize_resume_batch. -8) Leave failed/unqualified items at shortlist/reviewed and include concrete reasons. - -Output Format (required): -- skills_used: [skill_name...] -- skills_skipped: [{name, reason}] (empty array when none) -- run_id (if available) -- scrape totals: fetched / cleaned / inserted / duplicate -- triage totals: shortlist / reviewed / reject -- tracker totals: created / skipped / failed -- finalize totals: success / failed -- errors: grouped by step -- next_actions: explicit manual follow-ups -``` - -## Notes - -- Project skill files: - - `skills/job-pipeline-intake/SKILL.md` - - `skills/career-tailor-finalize/SKILL.md` -- `career_tailor` is artifact-focused and does not finalize DB/tracker statuses. -- Keep finalization as a separate explicit step via `finalize_resume_batch`. -- Update this file when tool contracts change. 
diff --git a/mcp-server-python/README.md b/mcp-server-python/README.md index d4776c0..9b48806 100644 --- a/mcp-server-python/README.md +++ b/mcp-server-python/README.md @@ -120,6 +120,12 @@ The server supports configuration via environment variables: |----------|-------------|---------| | `JOBWORKFLOW_ROOT` | Root directory for JobWorkFlow data | Repository root | | `JOBWORKFLOW_DB` | Database file path | `data/capture/jobs.db` | +| `JOBWORKFLOW_BULK_READ_LIMIT` | Default page size for `bulk_read_new_jobs` | `50` | +| `JOBWORKFLOW_TRACKERS_DIR` | Default trackers directory for `initialize_shortlist_trackers` | `trackers` | +| `JOBWORKFLOW_FULL_RESUME_PATH` | Default full resume path for `career_tailor` | `data/templates/full_resume.md` | +| `JOBWORKFLOW_RESUME_TEMPLATE_PATH` | Default resume template path for `career_tailor` | `data/templates/resume_skeleton.tex` | +| `JOBWORKFLOW_APPLICATIONS_DIR` | Default applications workspace root for `career_tailor` | `data/applications` | +| `JOBWORKFLOW_PDFLATEX_CMD` | Default LaTeX command for `career_tailor` | `pdflatex` | | `JOBWORKFLOW_LOG_LEVEL` | Logging level (DEBUG, INFO, WARNING, ERROR) | `INFO` | | `JOBWORKFLOW_LOG_FILE` | Log file path (enables file logging) | None (stderr only) | diff --git a/mcp-server-python/config.py b/mcp-server-python/config.py index 6069397..66a97eb 100644 --- a/mcp-server-python/config.py +++ b/mcp-server-python/config.py @@ -8,17 +8,40 @@ - Logging configuration """ -import os import logging +import os from pathlib import Path -from typing import Optional +from typing import List, Optional + +from dotenv import load_dotenv + +# Load environment variables from .env file at project root +# config.py is in mcp-server-python/, so .env is in parent directory +_env_path = Path(__file__).resolve().parent.parent / ".env" +load_dotenv(dotenv_path=_env_path) + + +def _parse_str_list(env_var: str, default: List[str]) -> List[str]: + """Parse a comma-separated string from env into a list of 
strings.""" + value = os.getenv(env_var) + if value is None or not value.strip(): + return default + return [item.strip() for item in value.split(",") if item.strip()] + + +def _parse_bool(env_var: str, default: bool) -> bool: + """Parse a boolean value from an environment variable.""" + value = os.getenv(env_var) + if value is None: + return default + return value.lower() in ("true", "1", "t", "y", "yes") class Config: """ Configuration class for MCP server settings. - Supports configuration via environment variables with sensible defaults. + Supports configuration via environment variables (and .env file) with sensible defaults. All paths are resolved relative to the repository root. """ @@ -37,6 +60,44 @@ def __init__(self): # Server configuration self.server_name = os.getenv("JOBWORKFLOW_SERVER_NAME", "jobworkflow-mcp-server") + # Scrape tool configuration + self.scrape_terms = _parse_str_list( + "JOBWORKFLOW_SCRAPE_TERMS", ["ai engineer", "backend engineer", "machine learning"] + ) + self.scrape_location = os.getenv("JOBWORKFLOW_SCRAPE_LOCATION", "Ontario, Canada") + self.scrape_sites = _parse_str_list("JOBWORKFLOW_SCRAPE_SITES", ["linkedin"]) + self.scrape_results_wanted = int(os.getenv("JOBWORKFLOW_SCRAPE_RESULTS_WANTED", "20")) + self.scrape_hours_old = int(os.getenv("JOBWORKFLOW_SCRAPE_HOURS_OLD", "2")) + self.scrape_require_description = _parse_bool( + "JOBWORKFLOW_SCRAPE_REQUIRE_DESCRIPTION", True + ) + self.scrape_preflight_host = os.getenv( + "JOBWORKFLOW_SCRAPE_PREFLIGHT_HOST", "www.linkedin.com" + ) + self.scrape_retry_count = int(os.getenv("JOBWORKFLOW_SCRAPE_RETRY_COUNT", "3")) + self.scrape_retry_sleep_seconds = float( + os.getenv("JOBWORKFLOW_SCRAPE_RETRY_SLEEP_SECONDS", "30") + ) + self.scrape_retry_backoff = float(os.getenv("JOBWORKFLOW_SCRAPE_RETRY_BACKOFF", "2")) + self.scrape_save_capture_json = _parse_bool("JOBWORKFLOW_SCRAPE_SAVE_CAPTURE_JSON", True) + self.scrape_capture_dir = os.getenv("JOBWORKFLOW_SCRAPE_CAPTURE_DIR", "data/capture") + 
+ # bulk_read_new_jobs defaults + self.bulk_read_limit = int(os.getenv("JOBWORKFLOW_BULK_READ_LIMIT", "50")) + + # initialize_shortlist_trackers defaults + self.trackers_dir = os.getenv("JOBWORKFLOW_TRACKERS_DIR", "trackers") + + # career_tailor defaults + self.full_resume_path = os.getenv( + "JOBWORKFLOW_FULL_RESUME_PATH", "data/templates/full_resume.md" + ) + self.resume_template_path = os.getenv( + "JOBWORKFLOW_RESUME_TEMPLATE_PATH", "data/templates/resume_skeleton.tex" + ) + self.applications_dir = os.getenv("JOBWORKFLOW_APPLICATIONS_DIR", "data/applications") + self.pdflatex_cmd = os.getenv("JOBWORKFLOW_PDFLATEX_CMD", "pdflatex") + def _find_repo_root(self) -> Path: """ Find the repository root directory. diff --git a/mcp-server-python/db/jobs_ingest_writer.py b/mcp-server-python/db/jobs_ingest_writer.py index d32434c..8a048bb 100644 --- a/mcp-server-python/db/jobs_ingest_writer.py +++ b/mcp-server-python/db/jobs_ingest_writer.py @@ -6,18 +6,12 @@ """ import sqlite3 -import os from pathlib import Path -from typing import Optional, Dict, Any, Tuple +from typing import Any, Dict, Optional, Tuple from models.errors import create_db_error, create_validation_error - - -# Default database path relative to repository root -DEFAULT_DB_PATH = "data/capture/jobs.db" - -# Allowed status values for job records (Requirement 8.2) -ALLOWED_STATUSES = {"new", "shortlist", "reviewed", "reject", "resume_written", "applied"} +from models.status import JobDbStatus +from utils.path_resolution import resolve_db_path as resolve_db_path_shared def resolve_db_path(db_path: Optional[str] = None) -> Path: @@ -36,32 +30,7 @@ def resolve_db_path(db_path: Optional[str] = None) -> Path: Returns: Resolved absolute Path to the database """ - # Use provided path first - if db_path is not None: - path_str = db_path - else: - # Then explicit env override - db_env = os.getenv("JOBWORKFLOW_DB") - if db_env: - path_str = db_env - else: - # Then JOBWORKFLOW_ROOT fallback - root_env = 
os.getenv("JOBWORKFLOW_ROOT") - if root_env: - return Path(root_env) / "data" / "capture" / "jobs.db" - # Final default - path_str = DEFAULT_DB_PATH - - path = Path(path_str) - - # If relative, resolve from repository root - if not path.is_absolute(): - # Find repository root (parent of mcp-server-python directory) - current_file = Path(__file__).resolve() - repo_root = current_file.parents[2] # db/ -> mcp-server-python/ -> repo/ - path = repo_root / path - - return path + return resolve_db_path_shared(db_path) def ensure_parent_dirs(db_path: Path) -> None: @@ -102,7 +71,8 @@ def bootstrap_schema(conn: sqlite3.Connection) -> None: """ try: # Create jobs table if it doesn't exist - conn.execute(""" + conn.execute( + """ CREATE TABLE IF NOT EXISTS jobs ( id INTEGER PRIMARY KEY AUTOINCREMENT, job_id TEXT, @@ -112,7 +82,9 @@ def bootstrap_schema(conn: sqlite3.Connection) -> None: url TEXT NOT NULL UNIQUE, location TEXT, source TEXT, - status TEXT NOT NULL DEFAULT 'new', + status TEXT NOT NULL DEFAULT '""" + + JobDbStatus.NEW.value + + """', captured_at TEXT, payload_json TEXT NOT NULL, created_at TEXT NOT NULL, @@ -123,7 +95,8 @@ def bootstrap_schema(conn: sqlite3.Connection) -> None: attempt_count INTEGER DEFAULT 0, last_error TEXT ) - """) + """ + ) # Create status index if it doesn't exist conn.execute(""" @@ -229,7 +202,9 @@ def __exit__(self, exc_type, exc_val, exc_tb): return False def insert_cleaned_records( - self, records: list[Dict[str, Any]], status: str = "new" + self, + records: list[Dict[str, Any]], + status: str = JobDbStatus.NEW, ) -> Tuple[int, int]: """ Insert cleaned records with deduplication by URL. 
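The shared `utils.path_resolution.resolve_db_path` helper that replaces the three duplicated copies is not shown in this diff. Based on the removed inline logic, its precedence is presumably: explicit argument, then `JOBWORKFLOW_DB`, then `JOBWORKFLOW_ROOT` joined with `data/capture/jobs.db`, then the repo-relative default. A minimal sketch under that assumption (`repo_root` as a parameter is a simplification for this sketch; the real code walks up from `__file__`):

```python
import os
from pathlib import Path

DEFAULT_DB_PATH = "data/capture/jobs.db"

def resolve_db_path(db_path=None, repo_root=None):
    """Resolve the jobs DB path.

    Precedence (mirroring the removed inline copies):
    explicit argument > JOBWORKFLOW_DB > JOBWORKFLOW_ROOT > repo-relative default.
    """
    root = Path(repo_root) if repo_root else Path.cwd()
    if db_path is not None:
        path = Path(db_path)
    elif os.getenv("JOBWORKFLOW_DB"):
        path = Path(os.environ["JOBWORKFLOW_DB"])
    elif os.getenv("JOBWORKFLOW_ROOT"):
        # The ROOT fallback returned an already-joined path in the old code.
        return Path(os.environ["JOBWORKFLOW_ROOT"]) / "data" / "capture" / "jobs.db"
    else:
        path = Path(DEFAULT_DB_PATH)
    # Relative paths are anchored at the repository root.
    return path if path.is_absolute() else root / path

# Demonstration (clears both env vars first so the precedence is visible):
for var in ("JOBWORKFLOW_DB", "JOBWORKFLOW_ROOT"):
    os.environ.pop(var, None)
print(resolve_db_path("/tmp/jobs.db"))   # /tmp/jobs.db  (explicit arg wins)
os.environ["JOBWORKFLOW_ROOT"] = "/srv/jobworkflow"
print(resolve_db_path())                 # /srv/jobworkflow/data/capture/jobs.db
```

Centralizing this in one module means the three `resolve_db_path` wrappers in `jobs_ingest_writer`, `jobs_reader`, and `jobs_writer` can no longer drift apart.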
@@ -272,8 +247,10 @@ def insert_cleaned_records( f"Invalid status: '{status}' contains leading or trailing whitespace" ) - if status not in ALLOWED_STATUSES: - allowed_list = ", ".join(sorted(ALLOWED_STATUSES)) + try: + JobDbStatus(status) + except ValueError: + allowed_list = ", ".join(sorted(s.value for s in JobDbStatus)) raise create_validation_error( f"Invalid status value: '{status}'. Allowed values are: {allowed_list}" ) diff --git a/mcp-server-python/db/jobs_reader.py b/mcp-server-python/db/jobs_reader.py index ca1f62e..9eab229 100644 --- a/mcp-server-python/db/jobs_reader.py +++ b/mcp-server-python/db/jobs_reader.py @@ -6,19 +6,16 @@ """ import sqlite3 -import os -from pathlib import Path -from typing import Optional, List, Dict, Any, Tuple from contextlib import contextmanager +from pathlib import Path +from typing import Any, Dict, List, Optional, Tuple from models.errors import ( - create_db_not_found_error, create_db_error, + create_db_not_found_error, ) - - -# Default database path relative to repository root -DEFAULT_DB_PATH = "data/capture/jobs.db" +from models.status import JobDbStatus +from utils.path_resolution import resolve_db_path as resolve_db_path_shared def resolve_db_path(db_path: Optional[str] = None) -> Path: @@ -37,32 +34,7 @@ def resolve_db_path(db_path: Optional[str] = None) -> Path: Returns: Resolved absolute Path to the database """ - # Use provided path first - if db_path is not None: - path_str = db_path - else: - # Then explicit env override - db_env = os.getenv("JOBWORKFLOW_DB") - if db_env: - path_str = db_env - else: - # Then JOBWORKFLOW_ROOT fallback - root_env = os.getenv("JOBWORKFLOW_ROOT") - if root_env: - return Path(root_env) / "data" / "capture" / "jobs.db" - # Final default - path_str = DEFAULT_DB_PATH - - path = Path(path_str) - - # If relative, resolve from repository root - if not path.is_absolute(): - # Find repository root (parent of mcp-server-python directory) - current_file = Path(__file__).resolve() - repo_root 
= current_file.parents[2] # db/ -> mcp-server-python/ -> repo/ - path = repo_root / path - - return path + return resolve_db_path_shared(db_path) @contextmanager @@ -160,11 +132,11 @@ def query_new_jobs( status, captured_at FROM jobs - WHERE status = 'new' + WHERE status = ? ORDER BY captured_at DESC, id DESC LIMIT ? """ - params = (limit + 1,) + params = (JobDbStatus.NEW, limit + 1) else: # Subsequent page - apply cursor boundary cursor_ts, cursor_id = cursor @@ -181,7 +153,7 @@ def query_new_jobs( status, captured_at FROM jobs - WHERE status = 'new' + WHERE status = ? AND ( captured_at < ? OR (captured_at = ? AND id < ?) @@ -189,7 +161,7 @@ def query_new_jobs( ORDER BY captured_at DESC, id DESC LIMIT ? """ - params = (cursor_ts, cursor_ts, cursor_id, limit + 1) + params = (JobDbStatus.NEW, cursor_ts, cursor_ts, cursor_id, limit + 1) # Execute query cursor_obj = conn.execute(query, params) @@ -262,11 +234,11 @@ def query_shortlist_jobs(conn: sqlite3.Connection, limit: int) -> List[Dict[str, captured_at, status FROM jobs - WHERE status = 'shortlist' + WHERE status = ? ORDER BY captured_at DESC, id DESC LIMIT ? 
""" - params = (limit,) + params = (JobDbStatus.SHORTLIST, limit) # Execute query cursor_obj = conn.execute(query, params) diff --git a/mcp-server-python/db/jobs_writer.py b/mcp-server-python/db/jobs_writer.py index ac5e1f0..a121dbd 100644 --- a/mcp-server-python/db/jobs_writer.py +++ b/mcp-server-python/db/jobs_writer.py @@ -6,18 +6,15 @@ """ import sqlite3 -import os from pathlib import Path -from typing import Optional, List +from typing import List, Optional from models.errors import ( - create_db_not_found_error, create_db_error, + create_db_not_found_error, ) - - -# Default database path relative to repository root -DEFAULT_DB_PATH = "data/capture/jobs.db" +from models.status import JobDbStatus +from utils.path_resolution import resolve_db_path as resolve_db_path_shared def resolve_db_path(db_path: Optional[str] = None) -> Path: @@ -36,32 +33,7 @@ def resolve_db_path(db_path: Optional[str] = None) -> Path: Returns: Resolved absolute Path to the database """ - # Use provided path first - if db_path is not None: - path_str = db_path - else: - # Then explicit env override - db_env = os.getenv("JOBWORKFLOW_DB") - if db_env: - path_str = db_env - else: - # Then JOBWORKFLOW_ROOT fallback - root_env = os.getenv("JOBWORKFLOW_ROOT") - if root_env: - return Path(root_env) / "data" / "capture" / "jobs.db" - # Final default - path_str = DEFAULT_DB_PATH - - path = Path(path_str) - - # If relative, resolve from repository root - if not path.is_absolute(): - # Find repository root (parent of mcp-server-python directory) - current_file = Path(__file__).resolve() - repo_root = current_file.parents[2] # db/ -> mcp-server-python/ -> repo/ - path = repo_root / path - - return path + return resolve_db_path_shared(db_path) class JobsWriter: @@ -342,7 +314,7 @@ def finalize_resume_written( # Execute parameterized UPDATE with all finalization fields query = """ UPDATE jobs - SET status = 'resume_written', + SET status = ?, resume_pdf_path = ?, resume_written_at = ?, run_id = ?, @@ 
-352,7 +324,8 @@ def finalize_resume_written( WHERE id = ? """ cursor = self.conn.execute( - query, (resume_pdf_path, timestamp, run_id, timestamp, job_id) + query, + (JobDbStatus.RESUME_WRITTEN, resume_pdf_path, timestamp, run_id, timestamp, job_id), ) # Missing target job is a per-item finalization failure. @@ -394,12 +367,12 @@ def fallback_to_reviewed(self, job_id: int, last_error: str, timestamp: str) -> # Execute parameterized UPDATE for fallback compensation query = """ UPDATE jobs - SET status = 'reviewed', + SET status = ?, last_error = ?, updated_at = ? WHERE id = ? """ - cursor = self.conn.execute(query, (last_error, timestamp, job_id)) + cursor = self.conn.execute(query, (JobDbStatus.REVIEWED, last_error, timestamp, job_id)) if cursor.rowcount == 0: raise create_db_error( diff --git a/mcp-server-python/models/job.py b/mcp-server-python/models/job.py index 56b4dd5..53c50e7 100644 --- a/mcp-server-python/models/job.py +++ b/mcp-server-python/models/job.py @@ -1,11 +1,16 @@ """ Job schema mapping for bulk_read_new_jobs MCP tool. -Provides functions to map database rows to the stable output schema -with consistent handling of missing values and JSON serialization. +Provides functions to map database rows to the stable output schema. + +Note: The core mapping logic now lives in ``schemas.bulk_read_new_jobs.JobRecord``. +This module keeps the ``to_job_schema`` helper for backward compatibility and as a +convenient dict-in / dict-out shortcut. 
""" -from typing import Dict, Any, Optional +from typing import Any, Dict + +from schemas.bulk_read_new_jobs import JobRecord def to_job_schema(row: Dict[str, Any]) -> Dict[str, Any]: @@ -15,20 +20,12 @@ def to_job_schema(row: Dict[str, Any]) -> Dict[str, Any]: This function ensures: - Only fixed schema fields are included in output - Missing values are handled consistently (as None) + - Empty strings are normalised to None - All values are JSON-serializable - Schema stability across all responses - Fixed schema fields: - - id: integer - - job_id: string - - title: string - - company: string - - description: string - - url: string - - location: string - - source: string - - status: string - - captured_at: string (ISO 8601 timestamp) + Delegates to ``JobRecord.model_validate`` which enforces extra='ignore' + and empty-string-to-None normalisation via a model validator. Args: row: Database row as dictionary @@ -43,37 +40,4 @@ def to_job_schema(row: Dict[str, Any]) -> Dict[str, Any]: - 3.4: Ensure JSON serializability - 3.5: Maintain stable schema contract """ - # Map to fixed schema with explicit field selection - # This ensures no arbitrary database columns leak into the output - return { - "id": _get_field(row, "id", None), - "job_id": _get_field(row, "job_id", None), - "title": _get_field(row, "title", None), - "company": _get_field(row, "company", None), - "description": _get_field(row, "description", None), - "url": _get_field(row, "url", None), - "location": _get_field(row, "location", None), - "source": _get_field(row, "source", None), - "status": _get_field(row, "status", None), - "captured_at": _get_field(row, "captured_at", None), - } - - -def _get_field(row: Dict[str, Any], field: str, default: Optional[Any] = None) -> Any: - """ - Safely extract a field from a row with consistent default handling. 
- - Args: - row: Database row dictionary - field: Field name to extract - default: Default value if field is missing or None - - Returns: - Field value or default - """ - value = row.get(field, default) - # Return None for empty strings to maintain consistency - # This ensures missing values are represented uniformly - if value == "": - return None - return value + return JobRecord.model_validate(row).model_dump() diff --git a/mcp-server-python/models/status.py b/mcp-server-python/models/status.py new file mode 100644 index 0000000..7e175aa --- /dev/null +++ b/mcp-server-python/models/status.py @@ -0,0 +1,50 @@ +""" +Centralized, type-safe status definitions for the JobWorkFlow pipeline. + +This module is the single source of truth for all status values used across +the application. It defines two distinct Enum classes: + +- ``JobDbStatus``: Lowercase statuses stored in the ``jobs`` SQLite table. +- ``JobTrackerStatus``: Capitalized statuses used in Markdown tracker + frontmatter (a board-friendly projection of DB milestones). + +Both Enums inherit from ``(str, Enum)`` so that members are directly +comparable to plain strings and serialize naturally to JSON at API +boundaries, preserving the external contract. +""" + +from enum import Enum + + +class JobDbStatus(str, Enum): + """Enum for statuses used in the 'jobs' database table. + + Canonical transitions (see steering.md §3): + new -> shortlist | reviewed | reject + shortlist -> resume_written (after successful finalize) + shortlist -> reviewed (on failure, with last_error) + resume_written -> applied (manual or later automation) + """ + + NEW = "new" + SHORTLIST = "shortlist" + REVIEWED = "reviewed" + REJECT = "reject" + RESUME_WRITTEN = "resume_written" + APPLIED = "applied" + + +class JobTrackerStatus(str, Enum): + """Enum for statuses used in the frontmatter of Markdown tracker files. + + Tracker statuses are projections of database milestones and must not + become a competing source of truth. 
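The reason both enums inherit from `(str, Enum)` — members compare equal to their plain-string values, serialize as strings in JSON, and bind directly as `sqlite3` parameters — can be verified with a minimal stdlib sketch (members copied from the new `models/status.py`):

```python
import json
import sqlite3
from enum import Enum

class JobDbStatus(str, Enum):
    NEW = "new"
    SHORTLIST = "shortlist"
    REVIEWED = "reviewed"
    REJECT = "reject"
    RESUME_WRITTEN = "resume_written"
    APPLIED = "applied"

# str-mixin members compare equal to plain strings, so existing
# assertions like `row[0] == JobDbStatus.NEW` keep passing.
print(JobDbStatus.NEW == "new")                       # True
# json serializes the member as its underlying string value,
# preserving the external API contract.
print(json.dumps({"status": JobDbStatus.SHORTLIST}))  # {"status": "shortlist"}
# And sqlite3 accepts str subclasses as TEXT parameters, which is why
# `params = (JobDbStatus.NEW, limit + 1)` works in the reader queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO jobs (status) VALUES (?)", (JobDbStatus.NEW,))
print(conn.execute("SELECT status FROM jobs").fetchone()[0])  # new
```

One caveat worth knowing: on Python 3.11+, `str(JobDbStatus.NEW)` returns `"JobDbStatus.NEW"` rather than `"new"`, so f-string interpolation should use `.value` explicitly even though equality, JSON, and parameter binding all behave as above.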
+ """ + + REVIEWED = "Reviewed" + RESUME_WRITTEN = "Resume Written" + APPLIED = "Applied" + INTERVIEW = "Interview" + OFFER = "Offer" + REJECTED = "Rejected" + GHOSTED = "Ghosted" diff --git a/mcp-server-python/schemas/bulk_read_new_jobs.py b/mcp-server-python/schemas/bulk_read_new_jobs.py index 21c6bd8..dcfb53f 100644 --- a/mcp-server-python/schemas/bulk_read_new_jobs.py +++ b/mcp-server-python/schemas/bulk_read_new_jobs.py @@ -3,10 +3,9 @@ from __future__ import annotations import re -from typing import Optional - -from pydantic import field_validator, model_validator +from typing import Any, Optional +from pydantic import ConfigDict, field_validator, model_validator from schemas.common import DbPathMixin, StrictIgnoreRequest, StrictResponse from utils.validation import DEFAULT_LIMIT, MAX_LIMIT, MIN_LIMIT @@ -16,15 +15,21 @@ class BulkReadNewJobsRequest(DbPathMixin, StrictIgnoreRequest): """Request schema for bulk_read_new_jobs.""" - limit: Optional[int] = None + limit: int = DEFAULT_LIMIT cursor: Optional[str] = None - @field_validator("limit") + @field_validator("limit", mode="before") @classmethod - def validate_limit(cls, value: Optional[int]) -> Optional[int]: - """Validate limit range if provided.""" + def coerce_limit_none(cls, value: Any) -> Any: + """Treat explicit ``None`` (or missing) as 'use the default'.""" if value is None: - return None + return DEFAULT_LIMIT + return value + + @field_validator("limit") + @classmethod + def validate_limit(cls, value: int) -> int: + """Validate limit range.""" if value < MIN_LIMIT: raise ValueError(f"Invalid limit: {value} is below minimum of {MIN_LIMIT}") if value > MAX_LIMIT: @@ -43,16 +48,15 @@ def validate_cursor(cls, value: Optional[str]) -> Optional[str]: raise ValueError("Invalid cursor format: must be a valid base64 string") return value - @model_validator(mode="after") - def apply_defaults(self) -> "BulkReadNewJobsRequest": - """Apply tool defaults for omitted optional fields.""" - if self.limit is None: - 
self.limit = DEFAULT_LIMIT - return self - class JobRecord(StrictResponse): - """Job record schema returned by bulk_read_new_jobs.""" + """Job record schema returned by bulk_read_new_jobs. + + Accepts raw database rows: extra columns are silently ignored + and empty strings are normalised to None for consistency. + """ + + model_config = ConfigDict(extra="ignore") id: int job_id: Optional[str] = None @@ -65,6 +69,14 @@ class JobRecord(StrictResponse): status: Optional[str] = None captured_at: Optional[str] = None + @model_validator(mode="before") + @classmethod + def empty_strings_to_none(cls, data: Any) -> Any: + """Convert empty-string values to None for optional fields.""" + if isinstance(data, dict): + return {k: (None if v == "" else v) for k, v in data.items()} + return data + class BulkReadNewJobsResponse(StrictResponse): """Success response schema for bulk_read_new_jobs.""" diff --git a/mcp-server-python/schemas/ingestion.py b/mcp-server-python/schemas/ingestion.py new file mode 100644 index 0000000..927b7e3 --- /dev/null +++ b/mcp-server-python/schemas/ingestion.py @@ -0,0 +1,11 @@ +""" +Ingestion schemas – re-exports canonical status enum from models.status. + +The authoritative definition lives in ``models.status.JobDbStatus``. +``JobStatus`` is kept as a convenience alias so that any existing +imports (``from schemas.ingestion import JobStatus``) continue to work. 
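The normalisation step inside `JobRecord`'s before-validator is independent of pydantic and easy to check on its own. A plain-function sketch of `empty_strings_to_none` (body copied from the diff; the sample row is illustrative):

```python
def empty_strings_to_none(data):
    # Mirror of the JobRecord before-validator: blank strings become None
    # so missing and empty DB columns are represented uniformly, and
    # non-dict input passes through for pydantic to reject later.
    if isinstance(data, dict):
        return {k: (None if v == "" else v) for k, v in data.items()}
    return data

row = {"id": 7, "company": "", "location": "Toronto, ON", "source": ""}
print(empty_strings_to_none(row))
# {'id': 7, 'company': None, 'location': 'Toronto, ON', 'source': None}
```

Combined with `model_config = ConfigDict(extra="ignore")`, this lets `JobRecord.model_validate(row)` accept raw `sqlite3` rows directly, replacing the old hand-rolled `_get_field` loop in `models/job.py` with the same observable behaviour.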
+""" + +from models.status import JobDbStatus as JobStatus + +__all__ = ["JobStatus"] diff --git a/mcp-server-python/tests/test_bulk_read_new_jobs.py b/mcp-server-python/tests/test_bulk_read_new_jobs.py index e29f45f..efb9795 100644 --- a/mcp-server-python/tests/test_bulk_read_new_jobs.py +++ b/mcp-server-python/tests/test_bulk_read_new_jobs.py @@ -6,12 +6,12 @@ """ import sqlite3 -from pathlib import Path from datetime import datetime, timezone +from pathlib import Path - -from tools.bulk_read_new_jobs import bulk_read_new_jobs from models.errors import ErrorCode +from models.status import JobDbStatus +from tools.bulk_read_new_jobs import bulk_read_new_jobs class TestBulkReadNewJobsIntegration: @@ -53,7 +53,7 @@ def create_test_db(self, db_path: Path, jobs: list): job["url"], # Required job.get("location", ""), job.get("source", ""), - job.get("status", "new"), + job.get("status", JobDbStatus.NEW), job.get("captured_at", datetime.now(timezone.utc).isoformat()), "{}", # payload_json datetime.now(timezone.utc).isoformat(), @@ -192,7 +192,7 @@ def test_tool_returns_empty_result_when_no_new_jobs(self, tmp_path): # Create database with only non-new jobs db_path = tmp_path / "test.db" jobs = [ - {"url": "http://example.com/1", "status": "applied"}, + {"url": "http://example.com/1", "status": JobDbStatus.APPLIED}, {"url": "http://example.com/2", "status": "rejected"}, ] self.create_test_db(db_path, jobs) @@ -251,7 +251,7 @@ def test_tool_returns_stable_schema(self, tmp_path): assert job["url"] == "http://example.com/1" assert job["location"] == "Toronto, ON" assert job["source"] == "linkedin" - assert job["status"] == "new" + assert job["status"] == JobDbStatus.NEW def test_tool_handles_missing_fields(self, tmp_path): """Test tool handles missing/null fields correctly.""" @@ -515,7 +515,7 @@ def create_test_db(self, db_path: Path, jobs: list): job["url"], # Required job.get("location", ""), job.get("source", ""), - job.get("status", "new"), + job.get("status", 
JobDbStatus.NEW), job.get("captured_at", datetime.now(timezone.utc).isoformat()), "{}", # payload_json datetime.now(timezone.utc).isoformat(), @@ -548,7 +548,7 @@ def test_tool_does_not_modify_database_rows(self, tmp_path): "description": f"Original Description {i}", "location": f"Original Location {i}", "source": "linkedin", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": base_time.replace(hour=10 + i).isoformat(), "updated_at": base_time.isoformat(), } @@ -604,7 +604,7 @@ def test_tool_does_not_update_status_field(self, tmp_path): # Create jobs with status='new' jobs = [ - {"url": f"http://example.com/{i}", "title": f"Job {i}", "status": "new"} + {"url": f"http://example.com/{i}", "title": f"Job {i}", "status": JobDbStatus.NEW} for i in range(5) ] self.create_test_db(db_path, jobs) @@ -620,7 +620,7 @@ def test_tool_does_not_update_status_field(self, tmp_path): statuses = [row[0] for row in cursor.fetchall()] conn.close() - assert all(status == "new" for status in statuses) + assert all(status == JobDbStatus.NEW for status in statuses) assert len(statuses) == 5 def test_tool_does_not_write_tracker_files(self, tmp_path): @@ -852,7 +852,7 @@ def test_tool_read_only_with_pagination(self, tmp_path): { "url": f"http://example.com/{i}", "title": f"Job {i}", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": base_time.replace(hour=10 + i % 10, minute=i).isoformat(), } for i in range(20) diff --git a/mcp-server-python/tests/test_bulk_update_job_status.py b/mcp-server-python/tests/test_bulk_update_job_status.py index 6705ace..e53626e 100644 --- a/mcp-server-python/tests/test_bulk_update_job_status.py +++ b/mcp-server-python/tests/test_bulk_update_job_status.py @@ -5,13 +5,14 @@ transaction management, and response formatting. 
""" -import pytest +import os import sqlite3 import tempfile -import os -from tools.bulk_update_job_status import bulk_update_job_status +import pytest from models.errors import ErrorCode +from models.status import JobDbStatus +from tools.bulk_update_job_status import bulk_update_job_status @pytest.fixture @@ -114,7 +115,7 @@ def test_empty_batch_returns_success(self, temp_db): def test_single_update_succeeds(self, temp_db): """Test updating a single job status.""" result = bulk_update_job_status( - {"updates": [{"id": 1, "status": "shortlist"}], "db_path": temp_db} + {"updates": [{"id": 1, "status": JobDbStatus.SHORTLIST}], "db_path": temp_db} ) assert result["updated_count"] == 1 @@ -129,16 +130,16 @@ def test_single_update_succeeds(self, temp_db): row = cursor.fetchone() conn.close() - assert row[0] == "shortlist" + assert row[0] == JobDbStatus.SHORTLIST def test_multiple_updates_succeed(self, temp_db): """Test updating multiple jobs in one batch.""" result = bulk_update_job_status( { "updates": [ - {"id": 1, "status": "shortlist"}, - {"id": 2, "status": "reviewed"}, - {"id": 3, "status": "reject"}, + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 2, "status": JobDbStatus.REVIEWED}, + {"id": 3, "status": JobDbStatus.REJECT}, ], "db_path": temp_db, } @@ -159,16 +160,16 @@ def test_multiple_updates_succeed(self, temp_db): rows = cursor.fetchall() conn.close() - assert rows[0]["status"] == "shortlist" - assert rows[1]["status"] == "reviewed" - assert rows[2]["status"] == "reject" + assert rows[0]["status"] == JobDbStatus.SHORTLIST + assert rows[1]["status"] == JobDbStatus.REVIEWED + assert rows[2]["status"] == JobDbStatus.REJECT def test_idempotent_update_succeeds(self, temp_db): """Test updating a job to its current status (idempotent).""" result = bulk_update_job_status( { "updates": [ - {"id": 1, "status": "new"} # Job 1 already has status 'new' + {"id": 1, "status": JobDbStatus.NEW} # Job 1 already has status 'new' ], "db_path": temp_db, } @@ -185,12 
+186,18 @@ def test_idempotent_update_succeeds(self, temp_db): row = cursor.fetchone() conn.close() - assert row["status"] == "new" + assert row["status"] == JobDbStatus.NEW assert row["updated_at"] is not None def test_all_valid_statuses(self, temp_db): """Test all valid status values.""" - valid_statuses = ["new", "shortlist", "reviewed", "reject", "resume_written"] + valid_statuses = [ + JobDbStatus.NEW, + JobDbStatus.SHORTLIST, + JobDbStatus.REVIEWED, + JobDbStatus.REJECT, + JobDbStatus.RESUME_WRITTEN, + ] for i, status in enumerate(valid_statuses, start=1): result = bulk_update_job_status( @@ -222,7 +229,7 @@ def test_updates_not_a_list(self, temp_db): def test_batch_size_too_large(self, temp_db): """Test error when batch size exceeds 100.""" - updates = [{"id": i, "status": "new"} for i in range(1, 102)] + updates = [{"id": i, "status": JobDbStatus.NEW} for i in range(1, 102)] result = bulk_update_job_status({"updates": updates, "db_path": temp_db}) @@ -235,9 +242,9 @@ def test_duplicate_job_ids(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"id": 1, "status": "shortlist"}, - {"id": 2, "status": "reviewed"}, - {"id": 1, "status": "reject"}, # Duplicate ID + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 2, "status": JobDbStatus.REVIEWED}, + {"id": 1, "status": JobDbStatus.REJECT}, # Duplicate ID ], "db_path": temp_db, } @@ -252,10 +259,10 @@ def test_duplicate_job_ids_mixed_types_return_validation_error(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"id": "abc", "status": "new"}, - {"id": "abc", "status": "shortlist"}, - {"id": 1, "status": "reviewed"}, - {"id": 1, "status": "reject"}, + {"id": "abc", "status": JobDbStatus.NEW}, + {"id": "abc", "status": JobDbStatus.SHORTLIST}, + {"id": 1, "status": JobDbStatus.REVIEWED}, + {"id": 1, "status": JobDbStatus.REJECT}, ], "db_path": temp_db, } @@ -272,7 +279,7 @@ class TestPerItemFailures: def test_nonexistent_job_id(self, temp_db): """Test failure when job ID doesn't 
exist.""" result = bulk_update_job_status( - {"updates": [{"id": 999, "status": "shortlist"}], "db_path": temp_db} + {"updates": [{"id": 999, "status": JobDbStatus.SHORTLIST}], "db_path": temp_db} ) assert result["updated_count"] == 0 @@ -296,7 +303,7 @@ def test_invalid_status_value(self, temp_db): def test_invalid_job_id_type(self, temp_db): """Test failure when job ID is not an integer.""" result = bulk_update_job_status( - {"updates": [{"id": "not_an_int", "status": "shortlist"}], "db_path": temp_db} + {"updates": [{"id": "not_an_int", "status": JobDbStatus.SHORTLIST}], "db_path": temp_db} ) assert result["updated_count"] == 0 @@ -307,7 +314,7 @@ def test_invalid_job_id_type(self, temp_db): def test_negative_job_id(self, temp_db): """Test failure when job ID is negative.""" result = bulk_update_job_status( - {"updates": [{"id": -1, "status": "shortlist"}], "db_path": temp_db} + {"updates": [{"id": -1, "status": JobDbStatus.SHORTLIST}], "db_path": temp_db} ) assert result["updated_count"] == 0 @@ -320,7 +327,7 @@ def test_missing_id_field(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"status": "shortlist"} # Missing 'id' + {"status": JobDbStatus.SHORTLIST} # Missing 'id' ], "db_path": temp_db, } @@ -363,9 +370,9 @@ def test_mixed_valid_and_invalid_updates(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"id": 1, "status": "shortlist"}, # Valid - {"id": 999, "status": "reviewed"}, # Invalid (doesn't exist) - {"id": 3, "status": "reject"}, # Valid + {"id": 1, "status": JobDbStatus.SHORTLIST}, # Valid + {"id": 999, "status": JobDbStatus.REVIEWED}, # Invalid (doesn't exist) + {"id": 3, "status": JobDbStatus.REJECT}, # Valid ], "db_path": temp_db, } @@ -382,7 +389,7 @@ def test_mixed_valid_and_invalid_updates(self, temp_db): conn.close() # Both should still be 'new' (original value) - assert all(row[0] == "new" for row in rows) + assert all(row[0] == JobDbStatus.NEW for row in rows) class TestDatabaseErrors: @@ -391,7 +398,10 @@ 
class TestDatabaseErrors: def test_database_not_found(self): """Test error when database file doesn't exist.""" result = bulk_update_job_status( - {"updates": [{"id": 1, "status": "shortlist"}], "db_path": "/nonexistent/path/to/db.db"} + { + "updates": [{"id": 1, "status": JobDbStatus.SHORTLIST}], + "db_path": "/nonexistent/path/to/db.db", + } ) assert "error" in result @@ -401,7 +411,10 @@ def test_database_not_found(self): def test_missing_updated_at_column(self, temp_db_no_updated_at): """Test error when updated_at column is missing.""" result = bulk_update_job_status( - {"updates": [{"id": 1, "status": "shortlist"}], "db_path": temp_db_no_updated_at} + { + "updates": [{"id": 1, "status": JobDbStatus.SHORTLIST}], + "db_path": temp_db_no_updated_at, + } ) assert "error" in result @@ -416,7 +429,7 @@ class TestTimestampBehavior: def test_timestamp_is_set(self, temp_db): """Test that updated_at timestamp is set.""" result = bulk_update_job_status( - {"updates": [{"id": 1, "status": "shortlist"}], "db_path": temp_db} + {"updates": [{"id": 1, "status": JobDbStatus.SHORTLIST}], "db_path": temp_db} ) assert result["updated_count"] == 1 @@ -438,9 +451,9 @@ def test_same_timestamp_for_batch(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"id": 1, "status": "shortlist"}, - {"id": 2, "status": "reviewed"}, - {"id": 3, "status": "reject"}, + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 2, "status": JobDbStatus.REVIEWED}, + {"id": 3, "status": JobDbStatus.REJECT}, ], "db_path": temp_db, } @@ -466,9 +479,9 @@ def test_result_ordering_matches_input(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"id": 3, "status": "shortlist"}, - {"id": 1, "status": "reviewed"}, - {"id": 2, "status": "reject"}, + {"id": 3, "status": JobDbStatus.SHORTLIST}, + {"id": 1, "status": JobDbStatus.REVIEWED}, + {"id": 2, "status": JobDbStatus.REJECT}, ], "db_path": temp_db, } @@ -482,7 +495,7 @@ def test_result_ordering_matches_input(self, temp_db): def 
test_response_has_required_fields(self, temp_db): """Test that response has all required fields.""" result = bulk_update_job_status( - {"updates": [{"id": 1, "status": "shortlist"}], "db_path": temp_db} + {"updates": [{"id": 1, "status": JobDbStatus.SHORTLIST}], "db_path": temp_db} ) # Check top-level fields @@ -513,8 +526,8 @@ def test_rollback_on_validation_failure(self, temp_db): result = bulk_update_job_status( { "updates": [ - {"id": 1, "status": "shortlist"}, # Valid - {"id": 999, "status": "reviewed"}, # Invalid (doesn't exist) + {"id": 1, "status": JobDbStatus.SHORTLIST}, # Valid + {"id": 999, "status": JobDbStatus.REVIEWED}, # Invalid (doesn't exist) ], "db_path": temp_db, } diff --git a/mcp-server-python/tests/test_bulk_update_job_status_schemas.py b/mcp-server-python/tests/test_bulk_update_job_status_schemas.py index 391a43f..d49eae2 100644 --- a/mcp-server-python/tests/test_bulk_update_job_status_schemas.py +++ b/mcp-server-python/tests/test_bulk_update_job_status_schemas.py @@ -1,9 +1,9 @@ """Unit tests for bulk_update_job_status Pydantic schemas.""" import pytest -from pydantic import ValidationError - from models.errors import ErrorCode +from models.status import JobDbStatus +from pydantic import ValidationError from schemas.bulk_update_job_status import BulkUpdateJobStatusRequest from utils.pydantic_error_mapper import map_pydantic_validation_error @@ -19,7 +19,7 @@ def test_missing_updates_maps_to_validation_error(self): def test_extra_fields_ignored_for_compatibility(self): model = BulkUpdateJobStatusRequest.model_validate( - {"updates": [{"id": 1, "status": "new"}], "unknown": "ignored"} + {"updates": [{"id": 1, "status": JobDbStatus.NEW}], "unknown": "ignored"} ) assert len(model.updates) == 1 diff --git a/mcp-server-python/tests/test_career_tailor.py b/mcp-server-python/tests/test_career_tailor.py index 6e03c19..6d1298b 100644 --- a/mcp-server-python/tests/test_career_tailor.py +++ b/mcp-server-python/tests/test_career_tailor.py @@ -5,17 +5,19 
@@ including item processing, error handling, and successful_items generation. """ -import pytest import sqlite3 from unittest.mock import patch + +import pytest +from models.errors import ErrorCode, ToolError +from models.status import JobDbStatus from tools.career_tailor import ( career_tailor, generate_run_id, - sanitize_error_message, - resolve_job_db_id, process_item_tailoring, + resolve_job_db_id, + sanitize_error_message, ) -from models.errors import ToolError, ErrorCode class TestGenerateRunId: @@ -474,7 +476,7 @@ def test_boundary_no_db_or_tracker_status_mutation( db_path = tmp_path / "jobs.db" conn = sqlite3.connect(str(db_path)) conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT NOT NULL)") - conn.execute("INSERT INTO jobs (id, status) VALUES (1, 'shortlist')") + conn.execute("INSERT INTO jobs (id, status) VALUES (1, ?)", (JobDbStatus.SHORTLIST,)) conn.commit() conn.close() @@ -531,4 +533,4 @@ def test_boundary_no_db_or_tracker_status_mutation( conn = sqlite3.connect(str(db_path)) row = conn.execute("SELECT status FROM jobs WHERE id = 1").fetchone() conn.close() - assert row[0] == "shortlist" + assert row[0] == JobDbStatus.SHORTLIST diff --git a/mcp-server-python/tests/test_checkpoint_task4.py b/mcp-server-python/tests/test_checkpoint_task4.py index 3c35bc0..3d0421a 100644 --- a/mcp-server-python/tests/test_checkpoint_task4.py +++ b/mcp-server-python/tests/test_checkpoint_task4.py @@ -8,19 +8,20 @@ 4. 
Database connections are properly closed """ -import pytest +import os import sqlite3 import tempfile -import os +import pytest from db.jobs_writer import JobsWriter -from models.errors import ToolError, ErrorCode +from models.errors import ErrorCode, ToolError +from models.status import JobDbStatus from utils.validation import ( - validate_status, - validate_job_id, + get_current_utc_timestamp, validate_batch_size, + validate_job_id, + validate_status, validate_unique_job_ids, - get_current_utc_timestamp, ) @@ -67,8 +68,8 @@ class TestCheckpointTask4: def test_validation_functions_work_correctly(self): """Verify all validation functions work as expected.""" # Test status validation - assert validate_status("new") == "new" - assert validate_status("shortlist") == "shortlist" + assert validate_status(JobDbStatus.NEW) == JobDbStatus.NEW + assert validate_status(JobDbStatus.SHORTLIST) == JobDbStatus.SHORTLIST with pytest.raises(ToolError) as exc_info: validate_status("invalid") @@ -84,7 +85,7 @@ def test_validation_functions_work_correctly(self): # Test batch size validation validate_batch_size([]) # Should not raise - validate_batch_size([{"id": 1, "status": "new"}]) # Should not raise + validate_batch_size([{"id": 1, "status": JobDbStatus.NEW}]) # Should not raise with pytest.raises(ToolError) as exc_info: validate_batch_size([{"id": i} for i in range(101)]) @@ -126,7 +127,7 @@ def test_transaction_rollback_on_validation_failure(self, temp_db): writer.ensure_updated_at_column() # First update is valid - writer.update_job_status(1, "shortlist", timestamp) + writer.update_job_status(1, JobDbStatus.SHORTLIST, timestamp) # Simulate validation failure by checking for non-existent job missing = writer.validate_jobs_exist([1, 999]) @@ -143,7 +144,7 @@ def test_transaction_rollback_on_validation_failure(self, temp_db): row = cursor.fetchone() conn.close() - assert row[0] == "new" # Should still be original value + assert row[0] == JobDbStatus.NEW # Should still be original 
value def test_transaction_rollback_on_exception(self, temp_db): """Verify transaction rollback works when exception occurs.""" @@ -151,8 +152,8 @@ def test_transaction_rollback_on_exception(self, temp_db): try: with JobsWriter(temp_db) as writer: - writer.update_job_status(1, "shortlist", timestamp) - writer.update_job_status(2, "reviewed", timestamp) + writer.update_job_status(1, JobDbStatus.SHORTLIST, timestamp) + writer.update_job_status(2, JobDbStatus.REVIEWED, timestamp) # Raise exception before commit raise RuntimeError("Simulated error") except RuntimeError: @@ -164,8 +165,8 @@ def test_transaction_rollback_on_exception(self, temp_db): rows = cursor.fetchall() conn.close() - assert rows[0][0] == "new" # Job 1 should still be 'new' - assert rows[1][0] == "shortlist" # Job 2 should still be 'shortlist' + assert rows[0][0] == JobDbStatus.NEW # Job 1 should still be 'new' + assert rows[1][0] == JobDbStatus.SHORTLIST # Job 2 should still be 'shortlist' def test_database_connection_cleanup_on_success(self, temp_db): """Verify database connection is properly closed after success.""" @@ -256,9 +257,9 @@ def test_complete_workflow_with_validation_and_rollback(self, temp_db): """ # Prepare updates updates = [ - {"id": 1, "status": "shortlist"}, - {"id": 2, "status": "reviewed"}, - {"id": 999, "status": "new"}, # Non-existent job + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 2, "status": JobDbStatus.REVIEWED}, + {"id": 999, "status": JobDbStatus.NEW}, # Non-existent job ] # Step 1: Validate batch size @@ -312,8 +313,8 @@ def test_complete_workflow_with_validation_and_rollback(self, temp_db): rows = cursor.fetchall() conn.close() - assert rows[0][0] == "new" # Job 1 should still be 'new' - assert rows[1][0] == "shortlist" # Job 2 should still be 'shortlist' + assert rows[0][0] == JobDbStatus.NEW # Job 1 should still be 'new' + assert rows[1][0] == JobDbStatus.SHORTLIST # Job 2 should still be 'shortlist' # Verify validation errors were collected assert 
len(validation_errors) == 1 @@ -326,7 +327,10 @@ def test_complete_workflow_with_successful_commit(self, temp_db): This simulates a successful workflow where all validations pass. """ # Prepare valid updates - updates = [{"id": 1, "status": "shortlist"}, {"id": 3, "status": "reviewed"}] + updates = [ + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 3, "status": JobDbStatus.REVIEWED}, + ] # Validate batch size validate_batch_size(updates) @@ -376,10 +380,10 @@ def test_complete_workflow_with_successful_commit(self, temp_db): assert len(rows) == 2 assert rows[0]["id"] == 1 - assert rows[0]["status"] == "shortlist" + assert rows[0]["status"] == JobDbStatus.SHORTLIST assert rows[0]["updated_at"] == timestamp assert rows[1]["id"] == 3 - assert rows[1]["status"] == "reviewed" + assert rows[1]["status"] == JobDbStatus.REVIEWED assert rows[1]["updated_at"] == timestamp diff --git a/mcp-server-python/tests/test_file_ops.py b/mcp-server-python/tests/test_file_ops.py index 61207de..5424d70 100644 --- a/mcp-server-python/tests/test_file_ops.py +++ b/mcp-server-python/tests/test_file_ops.py @@ -525,25 +525,6 @@ def test_handles_unicode_in_template(self, tmp_path): assert action == "created" assert target_path.read_text(encoding="utf-8") == template_content - def test_uses_default_template_path(self, tmp_path, monkeypatch): - """Test that materialize_resume_tex uses default template path.""" - # Change to tmp_path and create default template location - monkeypatch.chdir(tmp_path) - default_template_path = tmp_path / "data" / "templates" / "resume_skeleton_example.tex" - default_template_path.parent.mkdir(parents=True, exist_ok=True) - template_content = r"\documentclass{article}\begin{document}Default Template\end{document}" - default_template_path.write_text(template_content) - - # Target path - target_path = tmp_path / "resume" / "resume.tex" - - # Materialize resume.tex using default template - action = materialize_resume_tex(target_path=str(target_path), force=False) - 
- # Verify default template was used - assert action == "created" - assert target_path.read_text() == template_content - def test_action_matrix_for_all_scenarios(self, tmp_path): """Test complete action matrix for materialize_resume_tex.""" # Create a template file diff --git a/mcp-server-python/tests/test_finalize_resume_batch.py b/mcp-server-python/tests/test_finalize_resume_batch.py index a4ecd63..3ae644c 100644 --- a/mcp-server-python/tests/test_finalize_resume_batch.py +++ b/mcp-server-python/tests/test_finalize_resume_batch.py @@ -5,15 +5,17 @@ DB updates, tracker synchronization, compensation fallback, and dry-run mode. """ +import sqlite3 from pathlib import Path + import pytest -import sqlite3 +from models.errors import ToolError +from models.status import JobDbStatus, JobTrackerStatus from tools.finalize_resume_batch import ( finalize_resume_batch, generate_run_id, sanitize_error_message, ) -from models.errors import ToolError class TestGenerateRunId: @@ -392,7 +394,7 @@ def test_resume_tex_with_placeholders(self, test_db, tmp_path): \documentclass{article} \begin{document} \section{Experience} -PROJECT-AI-placeholder content here +WORK-BULLET-POINT-placeholder content here \end{document} """ tex_path.write_text(tex_content) @@ -518,7 +520,7 @@ def test_successful_finalization(self, test_db, valid_tracker): row = cursor.fetchone() conn.close() - assert row["status"] == "resume_written" + assert row["status"] == JobDbStatus.RESUME_WRITTEN assert row["resume_pdf_path"] == pdf_path assert row["resume_written_at"] is not None assert row["run_id"] == result["run_id"] @@ -527,7 +529,7 @@ def test_successful_finalization(self, test_db, valid_tracker): # Verify tracker was updated tracker_content = Path(tracker_path).read_text() - assert "status: Resume Written" in tracker_content + assert f"status: {JobTrackerStatus.RESUME_WRITTEN.value}" in tracker_content assert "company: Amazon" in tracker_content # Other fields preserved def 
test_nonexistent_job_id_fails_without_mutating_tracker(self, test_db, valid_tracker): @@ -550,7 +552,7 @@ def test_nonexistent_job_id_fails_without_mutating_tracker(self, test_db, valid_ conn.close() assert missing is None - assert original["status"] == "reviewed" + assert original["status"] == JobDbStatus.REVIEWED assert original["attempt_count"] == 0 tracker_content = Path(tracker_path).read_text() @@ -670,7 +672,7 @@ def test_dry_run_no_db_mutation(self, test_db, valid_tracker): row = cursor.fetchone() conn.close() - assert row["status"] == "reviewed" # Unchanged + assert row["status"] == JobDbStatus.REVIEWED # Unchanged assert row["resume_pdf_path"] is None # Unchanged assert row["resume_written_at"] is None # Unchanged assert row["run_id"] is None # Unchanged @@ -880,12 +882,12 @@ def failing_sync(path: str, status: str): ).fetchone() conn.close() - assert row1["status"] == "reviewed" + assert row1["status"] == JobDbStatus.REVIEWED assert row1["last_error"] is not None assert "Tracker sync failed" in row1["last_error"] assert row1["attempt_count"] == 1 - assert row2["status"] == "resume_written" + assert row2["status"] == JobDbStatus.RESUME_WRITTEN assert row2["last_error"] is None assert row2["attempt_count"] == 1 diff --git a/mcp-server-python/tests/test_finalize_validators.py b/mcp-server-python/tests/test_finalize_validators.py index 4f5a257..0fa77b7 100644 --- a/mcp-server-python/tests/test_finalize_validators.py +++ b/mcp-server-python/tests/test_finalize_validators.py @@ -215,39 +215,39 @@ def test_clean_tex_passes(self, tmp_path): assert error is None assert found_tokens == [] - def test_tex_with_project_ai_placeholder_fails(self, tmp_path): - """Test that TEX with PROJECT-AI- placeholder fails.""" + def test_tex_with_bullet_point_placeholder_fails(self, tmp_path): + """Test that TEX with BULLET-POINT placeholder fails.""" tex_path = tmp_path / "resume.tex" tex_path.write_text(""" \\documentclass{article} \\begin{document} \\section{Projects} 
-PROJECT-AI-DESCRIPTION-HERE +PROJECT-BULLET-POINT-1 \\end{document} """) is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "PROJECT-AI-" in error - assert "PROJECT-AI-" in found_tokens + assert "BULLET-POINT" in error + assert "BULLET-POINT" in found_tokens - def test_tex_with_project_be_placeholder_fails(self, tmp_path): - """Test that TEX with PROJECT-BE- placeholder fails.""" + def test_tex_without_bullet_point_passes(self, tmp_path): + """Test that TEX without BULLET-POINT marker passes validation.""" tex_path = tmp_path / "resume.tex" tex_path.write_text(""" \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-BE-DESCRIPTION-HERE +ML-INTERN-BULLET-1 \\end{document} """) is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) - assert is_valid is False - assert "PROJECT-BE-" in error - assert "PROJECT-BE-" in found_tokens + assert is_valid is True + assert error is None + assert found_tokens == [] def test_tex_with_work_bullet_point_placeholder_fails(self, tmp_path): """Test that TEX with WORK-BULLET-POINT- placeholder fails.""" @@ -263,8 +263,8 @@ def test_tex_with_work_bullet_point_placeholder_fails(self, tmp_path): is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "WORK-BULLET-POINT-" in error - assert "WORK-BULLET-POINT-" in found_tokens + assert "BULLET-POINT" in error + assert "BULLET-POINT" in found_tokens def test_tex_with_multiple_placeholders_fails(self, tmp_path): """Test that TEX with multiple placeholders reports all of them.""" @@ -273,8 +273,8 @@ def test_tex_with_multiple_placeholders_fails(self, tmp_path): \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-AI-DESCRIPTION -PROJECT-BE-DESCRIPTION +PROJECT-BULLET-POINT-1 +RESEARCH-2-BULLET-POINT-2 \\section{Experience} WORK-BULLET-POINT-PLACEHOLDER \\end{document} @@ -283,13 +283,9 @@ def 
test_tex_with_multiple_placeholders_fails(self, tmp_path): is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "PROJECT-AI-" in error - assert "PROJECT-BE-" in error - assert "WORK-BULLET-POINT-" in error - assert len(found_tokens) == 3 - assert "PROJECT-AI-" in found_tokens - assert "PROJECT-BE-" in found_tokens - assert "WORK-BULLET-POINT-" in found_tokens + assert "BULLET-POINT" in error + assert len(found_tokens) == 1 + assert "BULLET-POINT" in found_tokens def test_tex_with_placeholder_in_comment_still_fails(self, tmp_path): """Test that placeholders in comments are still detected (conservative check).""" @@ -297,7 +293,7 @@ def test_tex_with_placeholder_in_comment_still_fails(self, tmp_path): tex_path.write_text(""" \\documentclass{article} \\begin{document} -% TODO: Replace PROJECT-AI-DESCRIPTION +% TODO: Replace PROJECT-BULLET-POINT-1 \\section{Projects} Real project description here. \\end{document} @@ -306,8 +302,8 @@ def test_tex_with_placeholder_in_comment_still_fails(self, tmp_path): is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "PROJECT-AI-" in error - assert "PROJECT-AI-" in found_tokens + assert "BULLET-POINT" in error + assert "BULLET-POINT" in found_tokens def test_unreadable_tex_file_fails(self, tmp_path): """Test that unreadable TEX file returns error.""" @@ -391,14 +387,14 @@ def test_tex_with_placeholders_fails(self, tmp_path): tex_path.write_text(""" \\documentclass{article} \\begin{document} -PROJECT-AI-PLACEHOLDER +PROJECT-BULLET-POINT-1 \\end{document} """) is_valid, error = validate_resume_written_guardrails(str(pdf_path), str(tex_path)) assert is_valid is False - assert "PROJECT-AI-" in error + assert "BULLET-POINT" in error assert "placeholder tokens" in error def test_validation_stops_at_first_failure(self, tmp_path): @@ -418,11 +414,8 @@ class TestPlaceholderTokens: def test_placeholder_tokens_defined(self): """Test that 
all required placeholder tokens are defined.""" - assert "PROJECT-AI-" in PLACEHOLDER_TOKENS - assert "PROJECT-BE-" in PLACEHOLDER_TOKENS - assert "WORK-BULLET-POINT-" in PLACEHOLDER_TOKENS + assert "BULLET-POINT" in PLACEHOLDER_TOKENS def test_placeholder_tokens_count(self): - """Test that we have at least the minimum required tokens.""" - # Requirements specify "at minimum" these three tokens - assert len(PLACEHOLDER_TOKENS) >= 3 + """Test that exactly one unified placeholder token is defined.""" + assert len(PLACEHOLDER_TOKENS) == 1 diff --git a/mcp-server-python/tests/test_finalize_validators_integration.py b/mcp-server-python/tests/test_finalize_validators_integration.py index df2d5e0..e68c6be 100644 --- a/mcp-server-python/tests/test_finalize_validators_integration.py +++ b/mcp-server-python/tests/test_finalize_validators_integration.py @@ -105,7 +105,7 @@ def test_complete_flow_fails_with_placeholders(self, tmp_path): \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-AI-DESCRIPTION-PLACEHOLDER +PROJECT-BULLET-POINT-1 \\end{document} """) @@ -119,7 +119,7 @@ def test_complete_flow_fails_with_placeholders(self, tmp_path): is_valid, error = validate_resume_written_guardrails(resolved_pdf, resolved_tex) assert is_valid is False - assert "PROJECT-AI-" in error + assert "BULLET-POINT" in error assert "placeholder tokens" in error def test_complete_flow_fails_with_zero_byte_pdf(self, tmp_path): diff --git a/mcp-server-python/tests/test_initialize_shortlist_trackers_tool.py b/mcp-server-python/tests/test_initialize_shortlist_trackers_tool.py index 87b8828..5eb41e5 100644 --- a/mcp-server-python/tests/test_initialize_shortlist_trackers_tool.py +++ b/mcp-server-python/tests/test_initialize_shortlist_trackers_tool.py @@ -7,7 +7,9 @@ import sqlite3 from pathlib import Path + import pytest +from models.status import JobDbStatus from tools.initialize_shortlist_trackers import initialize_shortlist_trackers @@ -1137,6 +1139,6 @@ def test_database_read_only_boundary(self, tmp_path, test_db): conn.close() for job_id,
status in status_rows: - assert status == "shortlist", ( - f"Job {job_id} status changed from 'shortlist' to '{status}'" + assert status == JobDbStatus.SHORTLIST, ( + f"Job {job_id} status changed from '{JobDbStatus.SHORTLIST}' to '{status}'" ) diff --git a/mcp-server-python/tests/test_job_schema.py b/mcp-server-python/tests/test_job_schema.py index 2a9e7e3..e43911e 100644 --- a/mcp-server-python/tests/test_job_schema.py +++ b/mcp-server-python/tests/test_job_schema.py @@ -6,7 +6,9 @@ """ import json + from models.job import to_job_schema +from models.status import JobDbStatus class TestToJobSchema: @@ -23,7 +25,7 @@ def test_complete_row_mapping(self): "url": "https://www.linkedin.com/jobs/view/4368663835/", "location": "Toronto, ON", "source": "linkedin", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-04T03:47:36.966Z", } @@ -38,7 +40,7 @@ def test_complete_row_mapping(self): assert result["url"] == "https://www.linkedin.com/jobs/view/4368663835/" assert result["location"] == "Toronto, ON" assert result["source"] == "linkedin" - assert result["status"] == "new" + assert result["status"] == JobDbStatus.NEW assert result["captured_at"] == "2026-02-04T03:47:36.966Z" def test_missing_fields_return_none(self): @@ -46,7 +48,7 @@ def test_missing_fields_return_none(self): row = { "id": 456, "url": "https://example.com/job", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -55,7 +57,7 @@ def test_missing_fields_return_none(self): # Present fields should have values assert result["id"] == 456 assert result["url"] == "https://example.com/job" - assert result["status"] == "new" + assert result["status"] == JobDbStatus.NEW assert result["captured_at"] == "2026-02-05T10:00:00.000Z" # Missing fields should be None @@ -77,7 +79,7 @@ def test_null_fields_return_none(self): "url": "https://example.com/job", "location": None, "source": None, - "status": "new", + "status": JobDbStatus.NEW, "captured_at": None, 
} @@ -92,7 +94,7 @@ def test_null_fields_return_none(self): assert result["url"] == "https://example.com/job" assert result["location"] is None assert result["source"] is None - assert result["status"] == "new" + assert result["status"] == JobDbStatus.NEW assert result["captured_at"] is None def test_empty_strings_return_none(self): @@ -106,7 +108,7 @@ def test_empty_strings_return_none(self): "url": "https://example.com/job", "location": "", "source": "", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -121,7 +123,7 @@ def test_empty_strings_return_none(self): assert result["url"] == "https://example.com/job" assert result["location"] is None assert result["source"] is None - assert result["status"] == "new" + assert result["status"] == JobDbStatus.NEW assert result["captured_at"] == "2026-02-05T10:00:00.000Z" def test_schema_stability_no_extra_fields(self): @@ -135,7 +137,7 @@ def test_schema_stability_no_extra_fields(self): "url": "https://example.com/job", "location": "San Francisco, CA", "source": "indeed", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", # Extra fields that should NOT appear in output "extra_field": "should not appear", @@ -176,7 +178,7 @@ def test_json_serializability(self): "url": "https://example.com/job", "location": "New York, NY", "source": "glassdoor", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -201,7 +203,7 @@ def test_json_serializability_with_none_values(self): "url": "https://example.com/job", "location": None, "source": None, - "status": "new", + "status": JobDbStatus.NEW, "captured_at": None, } @@ -229,7 +231,7 @@ def test_special_characters_in_strings(self): "url": "https://example.com/job?id=123&ref=abc", "location": "City, ST", "source": "source/name", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -257,7 +259,7 @@ def 
test_unicode_characters(self): "url": "https://example.com/job", "location": "Montréal, QC", "source": "linkedin", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -286,7 +288,7 @@ def test_long_description(self): "url": "https://example.com/job", "location": "Location", "source": "source", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -311,7 +313,7 @@ def test_integer_id_types(self): "url": "https://example.com/job", "location": "Location", "source": "source", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -336,7 +338,7 @@ def test_consistent_field_order(self): "url": "https://example.com/job", "location": "Location", "source": "source", - "status": "new", + "status": JobDbStatus.NEW, "captured_at": "2026-02-05T10:00:00.000Z", } @@ -427,7 +429,7 @@ def test_requirement_3_2_no_arbitrary_columns(self): def test_requirement_3_3_missing_values_consistent(self): """Requirement 3.3: Handle missing values consistently.""" # Test with missing fields - row1 = {"id": 1, "url": "test", "status": "new"} + row1 = {"id": 1, "url": "test", "status": JobDbStatus.NEW} result1 = to_job_schema(row1) # Test with None fields @@ -440,7 +442,7 @@ def test_requirement_3_3_missing_values_consistent(self): "url": "test", "location": None, "source": None, - "status": "new", + "status": JobDbStatus.NEW, "captured_at": None, } result2 = to_job_schema(row2) diff --git a/mcp-server-python/tests/test_jobs_ingest_writer.py b/mcp-server-python/tests/test_jobs_ingest_writer.py index c3a944e..02ac96a 100644 --- a/mcp-server-python/tests/test_jobs_ingest_writer.py +++ b/mcp-server-python/tests/test_jobs_ingest_writer.py @@ -5,17 +5,18 @@ and boundary behavior for ingestion operations. 
""" -import pytest import sqlite3 -from pathlib import Path from datetime import datetime, timezone +from pathlib import Path +import pytest from db.jobs_ingest_writer import ( - resolve_db_path, - ensure_parent_dirs, - bootstrap_schema, JobsIngestWriter, + bootstrap_schema, + ensure_parent_dirs, + resolve_db_path, ) +from models.status import JobDbStatus class TestResolveDbPath: @@ -557,7 +558,7 @@ def test_default_status_is_new(self, tmp_path): cursor = conn.execute( "SELECT status FROM jobs WHERE url = ?", ("https://example.com/job1",) ) - assert cursor.fetchone()[0] == "new" + assert cursor.fetchone()[0] == JobDbStatus.NEW conn.close() def test_valid_status_override(self, tmp_path): @@ -565,7 +566,7 @@ def test_valid_status_override(self, tmp_path): db_path = tmp_path / "test.db" now = datetime.now(timezone.utc).isoformat() - valid_statuses = ["new", "shortlist", "reviewed", "reject", "resume_written", "applied"] + valid_statuses = list(JobDbStatus) for idx, status in enumerate(valid_statuses): record = { diff --git a/mcp-server-python/tests/test_jobs_reader.py b/mcp-server-python/tests/test_jobs_reader.py index 43260f7..0d311ff 100644 --- a/mcp-server-python/tests/test_jobs_reader.py +++ b/mcp-server-python/tests/test_jobs_reader.py @@ -4,13 +4,14 @@ Tests connection management, path resolution, and query execution. 
""" -import pytest import sqlite3 -from pathlib import Path from datetime import datetime, timezone +from pathlib import Path -from db.jobs_reader import resolve_db_path, get_connection, query_new_jobs -from models.errors import ToolError, ErrorCode +import pytest +from db.jobs_reader import get_connection, query_new_jobs, resolve_db_path +from models.errors import ErrorCode, ToolError +from models.status import JobDbStatus class TestResolveDbPath: @@ -225,9 +226,9 @@ def test_query_returns_only_new_status(self, tmp_path): """Test query filters by status='new'.""" db_path = tmp_path / "test.db" jobs = [ - {"url": "http://example.com/1", "status": "new", "title": "Job 1"}, - {"url": "http://example.com/2", "status": "new", "title": "Job 2"}, - {"url": "http://example.com/3", "status": "applied", "title": "Job 3"}, + {"url": "http://example.com/1", "status": JobDbStatus.NEW, "title": "Job 1"}, + {"url": "http://example.com/2", "status": JobDbStatus.NEW, "title": "Job 2"}, + {"url": "http://example.com/3", "status": JobDbStatus.APPLIED, "title": "Job 3"}, {"url": "http://example.com/4", "status": "rejected", "title": "Job 4"}, ] self.create_test_db(db_path, jobs) @@ -236,7 +237,7 @@ def test_query_returns_only_new_status(self, tmp_path): results = query_new_jobs(conn, limit=50) assert len(results) == 2 - assert all(r["status"] == "new" for r in results) + assert all(r["status"] == JobDbStatus.NEW for r in results) assert {r["title"] for r in results} == {"Job 1", "Job 2"} def test_query_respects_limit(self, tmp_path): @@ -492,10 +493,10 @@ def test_query_returns_only_shortlist_status(self, tmp_path): """Test query filters by status='shortlist'.""" db_path = tmp_path / "test.db" jobs = [ - {"url": "http://example.com/1", "status": "shortlist", "title": "Job 1"}, - {"url": "http://example.com/2", "status": "shortlist", "title": "Job 2"}, - {"url": "http://example.com/3", "status": "new", "title": "Job 3"}, - {"url": "http://example.com/4", "status": "applied", "title": 
"Job 4"}, + {"url": "http://example.com/1", "status": JobDbStatus.SHORTLIST, "title": "Job 1"}, + {"url": "http://example.com/2", "status": JobDbStatus.SHORTLIST, "title": "Job 2"}, + {"url": "http://example.com/3", "status": JobDbStatus.NEW, "title": "Job 3"}, + {"url": "http://example.com/4", "status": JobDbStatus.APPLIED, "title": "Job 4"}, ] self.create_test_db(db_path, jobs) @@ -505,14 +506,14 @@ def test_query_returns_only_shortlist_status(self, tmp_path): results = query_shortlist_jobs(conn, limit=50) assert len(results) == 2 - assert all(r["status"] == "shortlist" for r in results) + assert all(r["status"] == JobDbStatus.SHORTLIST for r in results) assert {r["title"] for r in results} == {"Job 1", "Job 2"} def test_query_respects_limit(self, tmp_path): """Test query respects the limit parameter.""" db_path = tmp_path / "test.db" jobs = [ - {"url": f"http://example.com/{i}", "title": f"Job {i}", "status": "shortlist"} + {"url": f"http://example.com/{i}", "title": f"Job {i}", "status": JobDbStatus.SHORTLIST} for i in range(10) ] self.create_test_db(db_path, jobs) diff --git a/mcp-server-python/tests/test_jobs_writer.py b/mcp-server-python/tests/test_jobs_writer.py index 44fa2c7..b493979 100644 --- a/mcp-server-python/tests/test_jobs_writer.py +++ b/mcp-server-python/tests/test_jobs_writer.py @@ -4,13 +4,14 @@ Tests transaction management, connection handling, and write operations. 
""" -import pytest +import os import sqlite3 import tempfile -import os +import pytest from db.jobs_writer import JobsWriter, resolve_db_path -from models.errors import ToolError, ErrorCode +from models.errors import ErrorCode, ToolError +from models.status import JobDbStatus @pytest.fixture @@ -295,8 +296,8 @@ def test_update_multiple_jobs(temp_db): timestamp = "2024-01-15T12:00:00.000Z" with JobsWriter(temp_db) as writer: - writer.update_job_status(1, "shortlist", timestamp) - writer.update_job_status(3, "reviewed", timestamp) + writer.update_job_status(1, JobDbStatus.SHORTLIST, timestamp) + writer.update_job_status(3, JobDbStatus.REVIEWED, timestamp) writer.commit() # Verify updates were applied @@ -308,10 +309,10 @@ def test_update_multiple_jobs(temp_db): assert len(rows) == 2 assert rows[0]["id"] == 1 - assert rows[0]["status"] == "shortlist" + assert rows[0]["status"] == JobDbStatus.SHORTLIST assert rows[0]["updated_at"] == timestamp assert rows[1]["id"] == 3 - assert rows[1]["status"] == "reviewed" + assert rows[1]["status"] == JobDbStatus.REVIEWED assert rows[1]["updated_at"] == timestamp @@ -526,8 +527,8 @@ def test_schema_preflight_allows_updates_when_column_exists(temp_db): assert missing == [] # Execute updates - writer.update_job_status(1, "shortlist", timestamp) - writer.update_job_status(2, "reviewed", timestamp) + writer.update_job_status(1, JobDbStatus.SHORTLIST, timestamp) + writer.update_job_status(2, JobDbStatus.REVIEWED, timestamp) # Commit transaction writer.commit() @@ -541,10 +542,10 @@ def test_schema_preflight_allows_updates_when_column_exists(temp_db): assert len(rows) == 2 assert rows[0]["id"] == 1 - assert rows[0]["status"] == "shortlist" + assert rows[0]["status"] == JobDbStatus.SHORTLIST assert rows[0]["updated_at"] == timestamp assert rows[1]["id"] == 2 - assert rows[1]["status"] == "reviewed" + assert rows[1]["status"] == JobDbStatus.REVIEWED assert rows[1]["updated_at"] == timestamp @@ -821,7 +822,7 @@ def 
test_fallback_to_reviewed_success(temp_db_with_finalize_columns): row = cursor.fetchone() conn.close() - assert row["status"] == "reviewed" + assert row["status"] == JobDbStatus.REVIEWED assert row["last_error"] == error_message assert row["attempt_count"] == 1 # Preserved (attempt already counted) assert row["updated_at"] == timestamp @@ -919,7 +920,7 @@ def test_fallback_to_reviewed_preserves_other_fields(temp_db_with_finalize_colum conn.close() # Status and error fields updated - assert row["status"] == "reviewed" + assert row["status"] == JobDbStatus.REVIEWED assert row["last_error"] == error_message assert row["attempt_count"] == 1 assert row["updated_at"] == timestamp diff --git a/mcp-server-python/tests/test_latex_compiler.py b/mcp-server-python/tests/test_latex_compiler.py index 649e37c..233c18e 100644 --- a/mcp-server-python/tests/test_latex_compiler.py +++ b/mcp-server-python/tests/test_latex_compiler.py @@ -37,7 +37,7 @@ def test_tex_with_placeholders_raises_validation_error(self, tmp_path): \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-AI-DESCRIPTION-HERE +PROJECT-BULLET-POINT-1 \\end{document} """) @@ -48,7 +48,7 @@ def test_tex_with_placeholders_raises_validation_error(self, tmp_path): error = exc_info.value assert error.code == ErrorCode.VALIDATION_ERROR assert "placeholder tokens" in error.message - assert "PROJECT-AI-" in error.message + assert "BULLET-POINT" in error.message # Verify PDF was NOT created (compile was skipped) pdf_path = tmp_path / "resume.pdf" @@ -61,8 +61,8 @@ def test_tex_with_multiple_placeholders_raises_validation_error(self, tmp_path): \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-AI-DESCRIPTION -PROJECT-BE-DESCRIPTION +PROJECT-BULLET-POINT-1 +PROJECT-BULLET-POINT-2 \\section{Experience} WORK-BULLET-POINT-PLACEHOLDER \\end{document} @@ -74,10 +74,7 @@ def test_tex_with_multiple_placeholders_raises_validation_error(self, tmp_path): error = exc_info.value assert error.code == 
ErrorCode.VALIDATION_ERROR assert "placeholder tokens" in error.message - # Should mention at least one placeholder - assert any( - token in error.message for token in ["PROJECT-AI-", "PROJECT-BE-", "WORK-BULLET-POINT-"] - ) + assert "BULLET-POINT" in error.message @patch("subprocess.run") def test_successful_compilation(self, mock_run, tmp_path): diff --git a/mcp-server-python/tests/test_latex_guardrails.py b/mcp-server-python/tests/test_latex_guardrails.py index 02509f3..eb57f92 100644 --- a/mcp-server-python/tests/test_latex_guardrails.py +++ b/mcp-server-python/tests/test_latex_guardrails.py @@ -33,39 +33,39 @@ def test_clean_tex_passes(self, tmp_path): assert error is None assert found_tokens == [] - def test_tex_with_project_ai_placeholder_fails(self, tmp_path): - """Test that TEX with PROJECT-AI- placeholder fails.""" + def test_tex_with_bullet_point_placeholder_fails(self, tmp_path): + """Test that TEX with BULLET-POINT placeholder fails.""" tex_path = tmp_path / "resume.tex" tex_path.write_text(""" \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-AI-DESCRIPTION-HERE +RESEARCH-1-BULLET-POINT-1 \\end{document} """) is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "PROJECT-AI-" in error - assert "PROJECT-AI-" in found_tokens + assert "BULLET-POINT" in error + assert "BULLET-POINT" in found_tokens - def test_tex_with_project_be_placeholder_fails(self, tmp_path): - """Test that TEX with PROJECT-BE- placeholder fails.""" + def test_tex_without_bullet_point_passes(self, tmp_path): + """Test that TEX without BULLET-POINT marker passes validation.""" tex_path = tmp_path / "resume.tex" tex_path.write_text(""" \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-BE-DESCRIPTION-HERE +PROJECT-BULLET-1 \\end{document} """) is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) - assert is_valid is False - assert "PROJECT-BE-" in error - assert 
"PROJECT-BE-" in found_tokens + assert is_valid is True + assert error is None + assert found_tokens == [] def test_tex_with_work_bullet_point_placeholder_fails(self, tmp_path): """Test that TEX with WORK-BULLET-POINT- placeholder fails.""" @@ -81,8 +81,8 @@ def test_tex_with_work_bullet_point_placeholder_fails(self, tmp_path): is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "WORK-BULLET-POINT-" in error - assert "WORK-BULLET-POINT-" in found_tokens + assert "BULLET-POINT" in error + assert "BULLET-POINT" in found_tokens def test_tex_with_multiple_placeholders_fails(self, tmp_path): """Test that TEX with multiple placeholders reports all of them.""" @@ -91,8 +91,8 @@ def test_tex_with_multiple_placeholders_fails(self, tmp_path): \\documentclass{article} \\begin{document} \\section{Projects} -PROJECT-AI-DESCRIPTION -PROJECT-BE-DESCRIPTION +PROJECT-BULLET-1 +RESEARCH-2-BULLET-POINT-2 \\section{Experience} WORK-BULLET-POINT-PLACEHOLDER \\end{document} @@ -101,13 +101,9 @@ def test_tex_with_multiple_placeholders_fails(self, tmp_path): is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) assert is_valid is False - assert "PROJECT-AI-" in error - assert "PROJECT-BE-" in error - assert "WORK-BULLET-POINT-" in error - assert len(found_tokens) == 3 - assert "PROJECT-AI-" in found_tokens - assert "PROJECT-BE-" in found_tokens - assert "WORK-BULLET-POINT-" in found_tokens + assert "BULLET-POINT" in error + assert len(found_tokens) == 1 + assert "BULLET-POINT" in found_tokens def test_tex_with_placeholder_in_comment_still_fails(self, tmp_path): """Test that placeholders in comments are still detected (conservative check).""" @@ -115,7 +111,7 @@ def test_tex_with_placeholder_in_comment_still_fails(self, tmp_path): tex_path.write_text(""" \\documentclass{article} \\begin{document} -% TODO: Replace PROJECT-AI-DESCRIPTION +% TODO: Replace PROJECT-BULLET-1 \\section{Projects} Real project description 
here. \\end{document} @@ -123,9 +119,9 @@ def test_tex_with_placeholder_in_comment_still_fails(self, tmp_path): is_valid, error, found_tokens = scan_tex_for_placeholders(str(tex_path)) - assert is_valid is False - assert "PROJECT-AI-" in error - assert "PROJECT-AI-" in found_tokens + assert is_valid is True + assert error is None + assert found_tokens == [] def test_unreadable_tex_file_fails(self, tmp_path): """Test that unreadable TEX file returns error.""" @@ -163,9 +159,7 @@ def test_returns_all_required_tokens(self): """Test that function returns all required placeholder tokens.""" tokens = get_placeholder_tokens() - assert "PROJECT-AI-" in tokens - assert "PROJECT-BE-" in tokens - assert "WORK-BULLET-POINT-" in tokens + assert "BULLET-POINT" in tokens def test_returns_copy_not_reference(self): """Test that function returns a copy, not a reference to the original list.""" @@ -183,8 +177,7 @@ def test_minimum_token_count(self): """Test that we have at least the minimum required tokens.""" tokens = get_placeholder_tokens() - # Requirements specify "at minimum" these three tokens - assert len(tokens) >= 3 + assert len(tokens) == 1 class TestPlaceholderTokensConstant: @@ -192,11 +185,8 @@ class TestPlaceholderTokensConstant: def test_placeholder_tokens_defined(self): """Test that all required placeholder tokens are defined.""" - assert "PROJECT-AI-" in PLACEHOLDER_TOKENS - assert "PROJECT-BE-" in PLACEHOLDER_TOKENS - assert "WORK-BULLET-POINT-" in PLACEHOLDER_TOKENS + assert "BULLET-POINT" in PLACEHOLDER_TOKENS def test_placeholder_tokens_count(self): """Test that we have at least the minimum required tokens.""" - # Requirements specify "at minimum" these three tokens - assert len(PLACEHOLDER_TOKENS) >= 3 + assert len(PLACEHOLDER_TOKENS) == 1 diff --git a/mcp-server-python/tests/test_scrape_jobs_tool.py b/mcp-server-python/tests/test_scrape_jobs_tool.py index 352ce08..6392c8e 100644 --- a/mcp-server-python/tests/test_scrape_jobs_tool.py +++ 
b/mcp-server-python/tests/test_scrape_jobs_tool.py @@ -643,24 +643,6 @@ def test_dry_run_mode(self): assert response["results"][0]["inserted_count"] == 0 assert response["results"][0]["duplicate_count"] == 0 - def test_uses_default_parameters(self): - """Test that default parameters are used when not provided.""" - raw_records = [] - - with patch("tools.scrape_jobs.scrape_jobs_for_term", return_value=raw_records): - with patch("tools.scrape_jobs.JobsIngestWriter") as mock_writer_class: - mock_writer = MagicMock() - mock_writer.insert_cleaned_records.return_value = (0, 0) - mock_writer_class.return_value.__enter__.return_value = mock_writer - - response = scrape_jobs() # No parameters - - # Should use default terms - assert len(response["results"]) == 3 - assert response["results"][0]["term"] == "ai engineer" - assert response["results"][1]["term"] == "backend engineer" - assert response["results"][2]["term"] == "machine learning" - def test_unknown_parameter_rejected(self): """Test that unknown parameters are rejected.""" with pytest.raises(ToolError) as exc_info: diff --git a/mcp-server-python/tests/test_server_bulk_update_integration.py b/mcp-server-python/tests/test_server_bulk_update_integration.py index 8f123fb..a04304b 100644 --- a/mcp-server-python/tests/test_server_bulk_update_integration.py +++ b/mcp-server-python/tests/test_server_bulk_update_integration.py @@ -5,13 +5,13 @@ with proper metadata and that the tool can be invoked through the MCP interface. 
""" -import sqlite3 import os +import sqlite3 import tempfile import pytest - -from server import mcp, bulk_update_job_status_tool +from models.status import JobDbStatus +from server import bulk_update_job_status_tool, mcp class TestBulkUpdateServerIntegration: @@ -96,7 +96,7 @@ def test_tool_function_can_be_called_directly(self, temp_db): """Test that the tool function can be called directly.""" # Call the tool function directly result = bulk_update_job_status_tool( - updates=[{"id": 1, "status": "shortlist"}], db_path=temp_db + updates=[{"id": 1, "status": JobDbStatus.SHORTLIST}], db_path=temp_db ) # Should succeed @@ -110,9 +110,9 @@ def test_tool_function_with_multiple_updates(self, temp_db): """Test that the tool function handles multiple updates.""" result = bulk_update_job_status_tool( updates=[ - {"id": 1, "status": "shortlist"}, - {"id": 2, "status": "reviewed"}, - {"id": 3, "status": "reject"}, + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 2, "status": JobDbStatus.REVIEWED}, + {"id": 3, "status": JobDbStatus.REJECT}, ], db_path=temp_db, ) @@ -149,7 +149,7 @@ def test_tool_function_handles_validation_errors(self, temp_db): def test_tool_function_handles_nonexistent_job(self, temp_db): """Test that the tool function handles nonexistent job IDs.""" result = bulk_update_job_status_tool( - updates=[{"id": 999, "status": "shortlist"}], db_path=temp_db + updates=[{"id": 999, "status": JobDbStatus.SHORTLIST}], db_path=temp_db ) # Should return per-item failure @@ -162,7 +162,8 @@ def test_tool_function_handles_database_errors(self): """Test that the tool function returns structured database errors.""" # Call with non-existent database result = bulk_update_job_status_tool( - updates=[{"id": 1, "status": "shortlist"}], db_path="/nonexistent/path/to/db.db" + updates=[{"id": 1, "status": JobDbStatus.SHORTLIST}], + db_path="/nonexistent/path/to/db.db", ) # Should return error structure @@ -182,8 +183,8 @@ def test_tool_function_atomicity(self, temp_db): # 
Attempt batch with one invalid update result = bulk_update_job_status_tool( updates=[ - {"id": 1, "status": "shortlist"}, # Valid - {"id": 999, "status": "reviewed"}, # Invalid (doesn't exist) + {"id": 1, "status": JobDbStatus.SHORTLIST}, # Valid + {"id": 999, "status": JobDbStatus.REVIEWED}, # Invalid (doesn't exist) ], db_path=temp_db, ) @@ -203,7 +204,7 @@ def test_tool_function_idempotency(self, temp_db): """Test that the tool function supports idempotent updates.""" # Update job to its current status result = bulk_update_job_status_tool( - updates=[{"id": 1, "status": "new"}], # Job 1 already has status 'new' + updates=[{"id": 1, "status": JobDbStatus.NEW}], # Job 1 already has status 'new' db_path=temp_db, ) @@ -216,9 +217,9 @@ def test_tool_function_timestamp_consistency(self, temp_db): """Test that all jobs in a batch get the same timestamp.""" result = bulk_update_job_status_tool( updates=[ - {"id": 1, "status": "shortlist"}, - {"id": 2, "status": "reviewed"}, - {"id": 3, "status": "reject"}, + {"id": 1, "status": JobDbStatus.SHORTLIST}, + {"id": 2, "status": JobDbStatus.REVIEWED}, + {"id": 3, "status": JobDbStatus.REJECT}, ], db_path=temp_db, ) @@ -237,7 +238,7 @@ def test_tool_function_timestamp_consistency(self, temp_db): def test_tool_function_write_only_guarantee(self, temp_db): """Test that the tool function doesn't return job data.""" result = bulk_update_job_status_tool( - updates=[{"id": 1, "status": "shortlist"}], db_path=temp_db + updates=[{"id": 1, "status": JobDbStatus.SHORTLIST}], db_path=temp_db ) # Should not contain job details (title, company, description, etc.) 
@@ -299,7 +300,7 @@ def test_server_can_be_imported(self): """Test that the server module can be imported without errors.""" # This test verifies that all imports in server.py are valid # and that the module initializes correctly - from server import mcp, bulk_update_job_status_tool, main + from server import bulk_update_job_status_tool, main, mcp assert mcp is not None assert bulk_update_job_status_tool is not None diff --git a/mcp-server-python/tests/test_tracker_policy.py b/mcp-server-python/tests/test_tracker_policy.py index b162ba6..34652a7 100644 --- a/mcp-server-python/tests/test_tracker_policy.py +++ b/mcp-server-python/tests/test_tracker_policy.py @@ -5,14 +5,15 @@ """ import pytest +from models.errors import ErrorCode, ToolError +from models.status import JobTrackerStatus from utils.tracker_policy import ( - validate_transition, - check_transition_or_raise, - TransitionResult, - TERMINAL_STATUSES, CORE_TRANSITIONS, + TERMINAL_STATUSES, + TransitionResult, + check_transition_or_raise, + validate_transition, ) -from models.errors import ToolError, ErrorCode class TestTransitionResult: @@ -77,7 +78,7 @@ class TestValidateTransitionNoop: def test_noop_reviewed_to_reviewed(self): """Test noop when staying in Reviewed status.""" - result = validate_transition("Reviewed", "Reviewed") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.REVIEWED) assert result.allowed is True assert result.is_noop is True assert result.error_message is None @@ -85,37 +86,39 @@ def test_noop_reviewed_to_reviewed(self): def test_noop_resume_written_to_resume_written(self): """Test noop when staying in Resume Written status.""" - result = validate_transition("Resume Written", "Resume Written") + result = validate_transition( + JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.RESUME_WRITTEN + ) assert result.allowed is True assert result.is_noop is True def test_noop_applied_to_applied(self): """Test noop when staying in Applied status.""" - result = 
validate_transition("Applied", "Applied") + result = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.APPLIED) assert result.allowed is True assert result.is_noop is True def test_noop_interview_to_interview(self): """Test noop when staying in Interview status.""" - result = validate_transition("Interview", "Interview") + result = validate_transition(JobTrackerStatus.INTERVIEW, JobTrackerStatus.INTERVIEW) assert result.allowed is True assert result.is_noop is True def test_noop_offer_to_offer(self): """Test noop when staying in Offer status.""" - result = validate_transition("Offer", "Offer") + result = validate_transition(JobTrackerStatus.OFFER, JobTrackerStatus.OFFER) assert result.allowed is True assert result.is_noop is True def test_noop_rejected_to_rejected(self): """Test noop when staying in Rejected status.""" - result = validate_transition("Rejected", "Rejected") + result = validate_transition(JobTrackerStatus.REJECTED, JobTrackerStatus.REJECTED) assert result.allowed is True assert result.is_noop is True def test_noop_ghosted_to_ghosted(self): """Test noop when staying in Ghosted status.""" - result = validate_transition("Ghosted", "Ghosted") + result = validate_transition(JobTrackerStatus.GHOSTED, JobTrackerStatus.GHOSTED) assert result.allowed is True assert result.is_noop is True @@ -125,7 +128,7 @@ class TestValidateTransitionCoreForward: def test_reviewed_to_resume_written(self): """Test allowed transition from Reviewed to Resume Written.""" - result = validate_transition("Reviewed", "Resume Written") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.RESUME_WRITTEN) assert result.allowed is True assert result.is_noop is False assert result.error_message is None @@ -133,7 +136,7 @@ def test_reviewed_to_resume_written(self): def test_resume_written_to_applied(self): """Test allowed transition from Resume Written to Applied.""" - result = validate_transition("Resume Written", "Applied") + result = 
validate_transition(JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.APPLIED) assert result.allowed is True assert result.is_noop is False assert result.error_message is None @@ -145,62 +148,62 @@ class TestValidateTransitionTerminalOutcomes: def test_reviewed_to_rejected(self): """Test allowed transition from Reviewed to Rejected.""" - result = validate_transition("Reviewed", "Rejected") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.REJECTED) assert result.allowed is True assert result.is_noop is False assert result.warnings == [] def test_reviewed_to_ghosted(self): """Test allowed transition from Reviewed to Ghosted.""" - result = validate_transition("Reviewed", "Ghosted") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.GHOSTED) assert result.allowed is True assert result.is_noop is False def test_resume_written_to_rejected(self): """Test allowed transition from Resume Written to Rejected.""" - result = validate_transition("Resume Written", "Rejected") + result = validate_transition(JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.REJECTED) assert result.allowed is True assert result.is_noop is False def test_resume_written_to_ghosted(self): """Test allowed transition from Resume Written to Ghosted.""" - result = validate_transition("Resume Written", "Ghosted") + result = validate_transition(JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.GHOSTED) assert result.allowed is True assert result.is_noop is False def test_applied_to_rejected(self): """Test allowed transition from Applied to Rejected.""" - result = validate_transition("Applied", "Rejected") + result = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.REJECTED) assert result.allowed is True assert result.is_noop is False def test_applied_to_ghosted(self): """Test allowed transition from Applied to Ghosted.""" - result = validate_transition("Applied", "Ghosted") + result = validate_transition(JobTrackerStatus.APPLIED, 
JobTrackerStatus.GHOSTED) assert result.allowed is True assert result.is_noop is False def test_interview_to_rejected(self): """Test allowed transition from Interview to Rejected.""" - result = validate_transition("Interview", "Rejected") + result = validate_transition(JobTrackerStatus.INTERVIEW, JobTrackerStatus.REJECTED) assert result.allowed is True assert result.is_noop is False def test_interview_to_ghosted(self): """Test allowed transition from Interview to Ghosted.""" - result = validate_transition("Interview", "Ghosted") + result = validate_transition(JobTrackerStatus.INTERVIEW, JobTrackerStatus.GHOSTED) assert result.allowed is True assert result.is_noop is False def test_offer_to_rejected(self): """Test allowed transition from Offer to Rejected.""" - result = validate_transition("Offer", "Rejected") + result = validate_transition(JobTrackerStatus.OFFER, JobTrackerStatus.REJECTED) assert result.allowed is True assert result.is_noop is False def test_offer_to_ghosted(self): """Test allowed transition from Offer to Ghosted.""" - result = validate_transition("Offer", "Ghosted") + result = validate_transition(JobTrackerStatus.OFFER, JobTrackerStatus.GHOSTED) assert result.allowed is True assert result.is_noop is False @@ -210,7 +213,7 @@ class TestValidateTransitionPolicyViolations: def test_applied_to_reviewed_blocked(self): """Test blocked backward transition from Applied to Reviewed.""" - result = validate_transition("Applied", "Reviewed") + result = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED) assert result.allowed is False assert result.is_noop is False assert result.error_message is not None @@ -220,57 +223,60 @@ def test_applied_to_reviewed_blocked(self): def test_resume_written_to_reviewed_blocked(self): """Test blocked backward transition from Resume Written to Reviewed.""" - result = validate_transition("Resume Written", "Reviewed") + result = validate_transition(JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.REVIEWED) 
assert result.allowed is False assert result.error_message is not None assert "violates policy" in result.error_message def test_applied_to_resume_written_blocked(self): """Test blocked backward transition from Applied to Resume Written.""" - result = validate_transition("Applied", "Resume Written") + result = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.RESUME_WRITTEN) assert result.allowed is False assert result.error_message is not None assert "violates policy" in result.error_message def test_reviewed_to_applied_blocked(self): """Test blocked skip transition from Reviewed to Applied.""" - result = validate_transition("Reviewed", "Applied") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.APPLIED) assert result.allowed is False assert result.error_message is not None assert "violates policy" in result.error_message def test_reviewed_to_interview_blocked(self): """Test blocked skip transition from Reviewed to Interview.""" - result = validate_transition("Reviewed", "Interview") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.INTERVIEW) assert result.allowed is False assert result.error_message is not None def test_reviewed_to_offer_blocked(self): """Test blocked skip transition from Reviewed to Offer.""" - result = validate_transition("Reviewed", "Offer") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.OFFER) assert result.allowed is False assert result.error_message is not None def test_applied_to_interview_blocked(self): """Test blocked transition from Applied to Interview (not in core).""" - result = validate_transition("Applied", "Interview") + result = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.INTERVIEW) assert result.allowed is False assert result.error_message is not None def test_interview_to_applied_blocked(self): """Test blocked backward transition from Interview to Applied.""" - result = validate_transition("Interview", "Applied") + 
result = validate_transition(JobTrackerStatus.INTERVIEW, JobTrackerStatus.APPLIED) assert result.allowed is False assert result.error_message is not None def test_error_message_includes_allowed_transitions(self): """Test that error message includes allowed transitions.""" - result = validate_transition("Reviewed", "Applied") + result = validate_transition(JobTrackerStatus.REVIEWED, JobTrackerStatus.APPLIED) assert result.error_message is not None # Should mention Resume Written as the allowed forward transition - assert "Resume Written" in result.error_message + assert JobTrackerStatus.RESUME_WRITTEN.value in result.error_message # Should mention terminal outcomes - assert "Rejected" in result.error_message or "Ghosted" in result.error_message + assert ( + JobTrackerStatus.REJECTED.value in result.error_message + or JobTrackerStatus.GHOSTED.value in result.error_message + ) class TestValidateTransitionForceBypass: @@ -278,7 +284,9 @@ class TestValidateTransitionForceBypass: def test_force_bypass_applied_to_reviewed(self): """Test force bypass allows backward transition.""" - result = validate_transition("Applied", "Reviewed", force=True) + result = validate_transition( + JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED, force=True + ) assert result.allowed is True assert result.is_noop is False assert result.error_message is None @@ -289,27 +297,35 @@ def test_force_bypass_applied_to_reviewed(self): def test_force_bypass_resume_written_to_reviewed(self): """Test force bypass allows backward transition.""" - result = validate_transition("Resume Written", "Reviewed", force=True) + result = validate_transition( + JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.REVIEWED, force=True + ) assert result.allowed is True assert len(result.warnings) > 0 assert "Force bypass" in result.warnings[0] def test_force_bypass_reviewed_to_applied(self): """Test force bypass allows skip transition.""" - result = validate_transition("Reviewed", "Applied", force=True) + result = 
validate_transition( + JobTrackerStatus.REVIEWED, JobTrackerStatus.APPLIED, force=True + ) assert result.allowed is True assert len(result.warnings) > 0 assert "Force bypass" in result.warnings[0] def test_force_bypass_reviewed_to_interview(self): """Test force bypass allows skip transition.""" - result = validate_transition("Reviewed", "Interview", force=True) + result = validate_transition( + JobTrackerStatus.REVIEWED, JobTrackerStatus.INTERVIEW, force=True + ) assert result.allowed is True assert len(result.warnings) > 0 def test_force_bypass_warning_message_clarity(self): """Test that force bypass warning is clear and descriptive.""" - result = validate_transition("Applied", "Reviewed", force=True) + result = validate_transition( + JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED, force=True + ) warning = result.warnings[0] assert "Force bypass" in warning assert "violates policy" in warning @@ -320,17 +336,23 @@ def test_force_bypass_warning_message_clarity(self): def test_force_does_not_affect_allowed_transitions(self): """Test that force flag doesn't add warnings to allowed transitions.""" # Core forward transition with force should not have warnings - result = validate_transition("Reviewed", "Resume Written", force=True) + result = validate_transition( + JobTrackerStatus.REVIEWED, JobTrackerStatus.RESUME_WRITTEN, force=True + ) assert result.allowed is True assert result.warnings == [] # Terminal outcome with force should not have warnings - result = validate_transition("Reviewed", "Rejected", force=True) + result = validate_transition( + JobTrackerStatus.REVIEWED, JobTrackerStatus.REJECTED, force=True + ) assert result.allowed is True assert result.warnings == [] # Noop with force should not have warnings - result = validate_transition("Reviewed", "Reviewed", force=True) + result = validate_transition( + JobTrackerStatus.REVIEWED, JobTrackerStatus.REVIEWED, force=True + ) assert result.allowed is True assert result.warnings == [] @@ -340,25 +362,27 @@ 
class TestCheckTransitionOrRaise: def test_allowed_transition_returns_result(self): """Test that allowed transition returns result without raising.""" - result = check_transition_or_raise("Reviewed", "Resume Written") + result = check_transition_or_raise( + JobTrackerStatus.REVIEWED, JobTrackerStatus.RESUME_WRITTEN + ) assert result.allowed is True assert result.is_noop is False def test_noop_transition_returns_result(self): """Test that noop transition returns result without raising.""" - result = check_transition_or_raise("Reviewed", "Reviewed") + result = check_transition_or_raise(JobTrackerStatus.REVIEWED, JobTrackerStatus.REVIEWED) assert result.allowed is True assert result.is_noop is True def test_terminal_outcome_returns_result(self): """Test that terminal outcome returns result without raising.""" - result = check_transition_or_raise("Reviewed", "Rejected") + result = check_transition_or_raise(JobTrackerStatus.REVIEWED, JobTrackerStatus.REJECTED) assert result.allowed is True def test_blocked_transition_raises_tool_error(self): """Test that blocked transition raises ToolError.""" with pytest.raises(ToolError) as exc_info: - check_transition_or_raise("Applied", "Reviewed") + check_transition_or_raise(JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED) error = exc_info.value assert error.code == ErrorCode.VALIDATION_ERROR @@ -367,19 +391,21 @@ def test_blocked_transition_raises_tool_error(self): def test_force_bypass_returns_result_with_warning(self): """Test that force bypass returns result without raising.""" - result = check_transition_or_raise("Applied", "Reviewed", force=True) + result = check_transition_or_raise( + JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED, force=True + ) assert result.allowed is True assert len(result.warnings) > 0 def test_error_message_matches_validation_result(self): """Test that raised error message matches validation result.""" # Get the error message from validate_transition - validation_result = 
validate_transition("Applied", "Reviewed") + validation_result = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED) expected_message = validation_result.error_message # Check that check_transition_or_raise raises with same message with pytest.raises(ToolError) as exc_info: - check_transition_or_raise("Applied", "Reviewed") + check_transition_or_raise(JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED) error = exc_info.value assert error.message == expected_message @@ -390,14 +416,14 @@ class TestTransitionPolicyConstants: def test_terminal_statuses_set(self): """Test that terminal statuses are correctly defined.""" - assert "Rejected" in TERMINAL_STATUSES - assert "Ghosted" in TERMINAL_STATUSES + assert JobTrackerStatus.REJECTED in TERMINAL_STATUSES + assert JobTrackerStatus.GHOSTED in TERMINAL_STATUSES assert len(TERMINAL_STATUSES) == 2 def test_core_transitions_dict(self): """Test that core transitions are correctly defined.""" - assert CORE_TRANSITIONS["Reviewed"] == "Resume Written" - assert CORE_TRANSITIONS["Resume Written"] == "Applied" + assert CORE_TRANSITIONS[JobTrackerStatus.REVIEWED] == JobTrackerStatus.RESUME_WRITTEN + assert CORE_TRANSITIONS[JobTrackerStatus.RESUME_WRITTEN] == JobTrackerStatus.APPLIED assert len(CORE_TRANSITIONS) == 2 def test_core_transitions_keys_are_strings(self): @@ -416,8 +442,10 @@ class TestEdgeCases: def test_force_false_explicit(self): """Test that force=False behaves same as default.""" - result_default = validate_transition("Applied", "Reviewed") - result_explicit = validate_transition("Applied", "Reviewed", force=False) + result_default = validate_transition(JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED) + result_explicit = validate_transition( + JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED, force=False + ) assert result_default.allowed == result_explicit.allowed assert result_default.is_noop == result_explicit.is_noop @@ -427,10 +455,10 @@ def 
test_multiple_violations_same_behavior(self): """Test that different violations have consistent behavior.""" # All these should be blocked without force violations = [ - ("Applied", "Reviewed"), - ("Resume Written", "Reviewed"), - ("Reviewed", "Applied"), - ("Interview", "Applied"), + (JobTrackerStatus.APPLIED, JobTrackerStatus.REVIEWED), + (JobTrackerStatus.RESUME_WRITTEN, JobTrackerStatus.REVIEWED), + (JobTrackerStatus.REVIEWED, JobTrackerStatus.APPLIED), + (JobTrackerStatus.INTERVIEW, JobTrackerStatus.APPLIED), ] for current, target in violations: @@ -440,21 +468,11 @@ def test_multiple_violations_same_behavior(self): def test_all_statuses_can_reach_terminal(self): """Test that all statuses can transition to terminal outcomes.""" - all_statuses = [ - "Reviewed", - "Resume Written", - "Applied", - "Interview", - "Offer", - "Rejected", - "Ghosted", - ] - - for status in all_statuses: + for status in JobTrackerStatus: # Should be able to reach Rejected - result = validate_transition(status, "Rejected") + result = validate_transition(status, JobTrackerStatus.REJECTED) assert result.allowed is True # Should be able to reach Ghosted - result = validate_transition(status, "Ghosted") + result = validate_transition(status, JobTrackerStatus.GHOSTED) assert result.allowed is True diff --git a/mcp-server-python/tests/test_tracker_sync.py b/mcp-server-python/tests/test_tracker_sync.py index af496e4..7f727f2 100644 --- a/mcp-server-python/tests/test_tracker_sync.py +++ b/mcp-server-python/tests/test_tracker_sync.py @@ -8,9 +8,10 @@ - Error handling """ -import pytest import os +import pytest +from models.status import JobTrackerStatus from utils.tracker_sync import update_tracker_status @@ -52,7 +53,7 @@ def test_update_status_preserves_frontmatter_and_body(tmp_path): tracker_path.write_text(original_content, encoding="utf-8") # Update status - update_tracker_status(str(tracker_path), "Resume Written") + update_tracker_status(str(tracker_path), 
JobTrackerStatus.RESUME_WRITTEN) # Read updated content updated_content = tracker_path.read_text(encoding="utf-8") @@ -96,7 +97,7 @@ def test_update_status_to_same_value(tmp_path): tracker_path.write_text(original_content, encoding="utf-8") # Update status to same value - update_tracker_status(str(tracker_path), "Reviewed") + update_tracker_status(str(tracker_path), JobTrackerStatus.REVIEWED) # Read updated content updated_content = tracker_path.read_text(encoding="utf-8") diff --git a/mcp-server-python/tests/test_update_tracker_status_tool.py b/mcp-server-python/tests/test_update_tracker_status_tool.py index eec87f0..193ff73 100644 --- a/mcp-server-python/tests/test_update_tracker_status_tool.py +++ b/mcp-server-python/tests/test_update_tracker_status_tool.py @@ -6,7 +6,9 @@ """ from pathlib import Path + import pytest +from models.status import JobTrackerStatus from tools.update_tracker_status import update_tracker_status @@ -122,7 +124,7 @@ def test_resume_with_placeholders(self, tmp_path): def test_missing_tracker_path(self): """Test that missing tracker_path returns VALIDATION_ERROR.""" - result = update_tracker_status({"target_status": "Resume Written"}) + result = update_tracker_status({"target_status": JobTrackerStatus.RESUME_WRITTEN}) assert "error" in result assert result["error"]["code"] == "VALIDATION_ERROR" @@ -151,7 +153,7 @@ def test_unknown_parameter(self, test_tracker): result = update_tracker_status( { "tracker_path": test_tracker, - "target_status": "Resume Written", + "target_status": JobTrackerStatus.RESUME_WRITTEN, "unknown_param": "value", } ) @@ -167,7 +169,10 @@ def test_unknown_parameter(self, test_tracker): def test_tracker_not_found(self): """Test that missing tracker file returns FILE_NOT_FOUND.""" result = update_tracker_status( - {"tracker_path": "nonexistent/tracker.md", "target_status": "Resume Written"} + { + "tracker_path": "nonexistent/tracker.md", + "target_status": JobTrackerStatus.RESUME_WRITTEN, + } ) assert "error" in result 
@@ -179,12 +184,14 @@ def test_tracker_not_found(self):

     def test_noop_same_status(self, test_tracker):
         """Test that setting status to current value returns noop."""
-        result = update_tracker_status({"tracker_path": test_tracker, "target_status": "Reviewed"})
+        result = update_tracker_status(
+            {"tracker_path": test_tracker, "target_status": JobTrackerStatus.REVIEWED}
+        )

         assert result["success"] is True
         assert result["action"] == "noop"
-        assert result["previous_status"] == "Reviewed"
-        assert result["target_status"] == "Reviewed"
+        assert result["previous_status"] == JobTrackerStatus.REVIEWED
+        assert result["target_status"] == JobTrackerStatus.REVIEWED
         assert result["dry_run"] is False

     # ========================================================================
@@ -194,13 +201,13 @@ def test_noop_same_status(self, test_tracker):
     def test_successful_forward_transition(self, test_tracker):
         """Test successful forward transition from Reviewed to Resume Written."""
         result = update_tracker_status(
-            {"tracker_path": test_tracker, "target_status": "Resume Written"}
+            {"tracker_path": test_tracker, "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         assert result["success"] is True
         assert result["action"] == "updated"
-        assert result["previous_status"] == "Reviewed"
-        assert result["target_status"] == "Resume Written"
+        assert result["previous_status"] == JobTrackerStatus.REVIEWED
+        assert result["target_status"] == JobTrackerStatus.RESUME_WRITTEN
         assert result["guardrail_check_passed"] is True
         assert result["warnings"] == []
@@ -214,12 +221,14 @@ def test_successful_transition_to_terminal_status(self, test_tracker):
         """Test successful transition to terminal status (Rejected)."""
-        result = update_tracker_status({"tracker_path": test_tracker, "target_status": "Rejected"})
+        result = update_tracker_status(
+            {"tracker_path": test_tracker, "target_status": JobTrackerStatus.REJECTED}
+        )

         assert result["success"] is True
         assert result["action"] == "updated"
-        assert result["previous_status"] == "Reviewed"
-        assert result["target_status"] == "Rejected"
+        assert result["previous_status"] == JobTrackerStatus.REVIEWED
+        assert result["target_status"] == JobTrackerStatus.REJECTED

         # Verify file was updated
         tracker_path = Path(test_tracker)
@@ -229,13 +238,16 @@ def test_successful_transition_to_terminal_status(self, test_tracker):
     def test_transition_resume_written_to_applied(self, test_tracker_with_resume_written):
         """Test successful transition from Resume Written to Applied."""
         result = update_tracker_status(
-            {"tracker_path": test_tracker_with_resume_written, "target_status": "Applied"}
+            {
+                "tracker_path": test_tracker_with_resume_written,
+                "target_status": JobTrackerStatus.APPLIED,
+            }
         )

         assert result["success"] is True
         assert result["action"] == "updated"
-        assert result["previous_status"] == "Resume Written"
-        assert result["target_status"] == "Applied"
+        assert result["previous_status"] == JobTrackerStatus.RESUME_WRITTEN
+        assert result["target_status"] == JobTrackerStatus.APPLIED

     # ========================================================================
     # Test: Blocked Transitions (Policy Violations)
@@ -244,7 +256,10 @@ def test_transition_resume_written_to_applied(self, test_tracker_with_resume_wri
     def test_blocked_backward_transition(self, test_tracker_with_resume_written):
         """Test that backward transition is blocked without force."""
         result = update_tracker_status(
-            {"tracker_path": test_tracker_with_resume_written, "target_status": "Reviewed"}
+            {
+                "tracker_path": test_tracker_with_resume_written,
+                "target_status": JobTrackerStatus.REVIEWED,
+            }
         )

         assert result["success"] is False
@@ -261,7 +276,7 @@ def test_force_bypass_policy_violation(self, test_tracker_with_resume_written):
         result = update_tracker_status(
             {
                 "tracker_path": test_tracker_with_resume_written,
-                "target_status": "Reviewed",
+                "target_status": JobTrackerStatus.REVIEWED,
                 "force": True,
             }
         )
@@ -295,7 +310,7 @@ def test_resume_written_blocked_missing_pdf(self, tmp_path):
         tracker_path.write_text(content)

         result = update_tracker_status(
-            {"tracker_path": str(tracker_path), "target_status": "Resume Written"}
+            {"tracker_path": str(tracker_path), "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         assert result["success"] is False
@@ -320,7 +335,7 @@ def test_resume_written_blocked_placeholder_in_tex(
         tracker_path.write_text(content)

         result = update_tracker_status(
-            {"tracker_path": str(tracker_path), "target_status": "Resume Written"}
+            {"tracker_path": str(tracker_path), "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         assert result["success"] is False
@@ -342,7 +357,7 @@ def test_resume_written_missing_resume_path(self, tmp_path):
         tracker_path.write_text(content)

         result = update_tracker_status(
-            {"tracker_path": str(tracker_path), "target_status": "Resume Written"}
+            {"tracker_path": str(tracker_path), "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         assert "error" in result
@@ -357,7 +372,11 @@ def test_resume_written_missing_resume_path(self, tmp_path):
     def test_dry_run_successful_transition(self, test_tracker):
         """Test dry-run mode returns predicted action without writing."""
         result = update_tracker_status(
-            {"tracker_path": test_tracker, "target_status": "Resume Written", "dry_run": True}
+            {
+                "tracker_path": test_tracker,
+                "target_status": JobTrackerStatus.RESUME_WRITTEN,
+                "dry_run": True,
+            }
         )

         assert result["success"] is True
@@ -375,7 +394,7 @@ def test_dry_run_blocked_transition(self, test_tracker_with_resume_written):
         result = update_tracker_status(
             {
                 "tracker_path": test_tracker_with_resume_written,
-                "target_status": "Reviewed",
+                "target_status": JobTrackerStatus.REVIEWED,
                 "dry_run": True,
             }
         )
@@ -405,7 +424,11 @@ def test_dry_run_guardrail_failure(self, tmp_path):
         tracker_path.write_text(content)

         result = update_tracker_status(
-            {"tracker_path": str(tracker_path), "target_status": "Resume Written", "dry_run": True}
+            {
+                "tracker_path": str(tracker_path),
+                "target_status": JobTrackerStatus.RESUME_WRITTEN,
+                "dry_run": True,
+            }
         )

         assert result["success"] is False
@@ -424,7 +447,7 @@ def test_content_preservation(self, test_tracker):

         # Update status
         result = update_tracker_status(
-            {"tracker_path": test_tracker, "target_status": "Resume Written"}
+            {"tracker_path": test_tracker, "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         assert result["success"] is True
@@ -453,7 +476,7 @@ def test_content_preservation(self, test_tracker):
     def test_response_structure_success(self, test_tracker):
         """Test that success response has all required fields."""
         result = update_tracker_status(
-            {"tracker_path": test_tracker, "target_status": "Resume Written"}
+            {"tracker_path": test_tracker, "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         # Required fields
@@ -471,7 +494,10 @@ def test_response_structure_success(self, test_tracker):
     def test_response_structure_blocked(self, test_tracker_with_resume_written):
         """Test that blocked response has all required fields."""
         result = update_tracker_status(
-            {"tracker_path": test_tracker_with_resume_written, "target_status": "Reviewed"}
+            {
+                "tracker_path": test_tracker_with_resume_written,
+                "target_status": JobTrackerStatus.REVIEWED,
+            }
         )

         # Required fields
@@ -489,7 +515,7 @@ def test_response_structure_blocked(self, test_tracker_with_resume_written):
     def test_response_structure_top_level_error(self):
         """Test that top-level error has correct structure."""
         result = update_tracker_status(
-            {"tracker_path": "nonexistent.md", "target_status": "Resume Written"}
+            {"tracker_path": "nonexistent.md", "target_status": JobTrackerStatus.RESUME_WRITTEN}
         )

         assert "error" in result
diff --git a/mcp-server-python/tests/test_validation.py b/mcp-server-python/tests/test_validation.py
index 462d9ff..75649e1 100644
--- a/mcp-server-python/tests/test_validation.py
+++ b/mcp-server-python/tests/test_validation.py
@@ -5,19 +5,20 @@
 """

 import pytest
+from models.errors import ErrorCode, ToolError
+from models.status import JobDbStatus, JobTrackerStatus
 from utils.validation import (
-    validate_limit,
-    validate_db_path,
-    validate_cursor,
+    DEFAULT_LIMIT,
+    MIN_LIMIT,
     validate_all_parameters,
-    validate_status,
-    validate_job_id,
     validate_batch_size,
+    validate_cursor,
+    validate_db_path,
+    validate_job_id,
+    validate_limit,
+    validate_status,
     validate_unique_job_ids,
-    DEFAULT_LIMIT,
-    MIN_LIMIT,
 )
-from models.errors import ToolError, ErrorCode


 class TestValidateLimit:
@@ -355,9 +356,8 @@ class TestValidateStatus:
     def test_valid_statuses(self):
         """Test that all valid status values are accepted."""
-        valid_statuses = ["new", "shortlist", "reviewed", "reject", "resume_written", "applied"]
-        for status in valid_statuses:
-            result = validate_status(status)
+        for status in JobDbStatus:
+            result = validate_status(status.value)
             assert result == status

     def test_invalid_status_value(self):
@@ -472,12 +472,8 @@ def test_status_error_includes_allowed_values(self):
         error = exc_info.value

         # Check that error message includes the allowed values
-        assert "new" in error.message
-        assert "shortlist" in error.message
-        assert "reviewed" in error.message
-        assert "reject" in error.message
-        assert "resume_written" in error.message
-        assert "applied" in error.message
+        for status in JobDbStatus:
+            assert status.value in error.message


 class TestValidateJobId:
@@ -840,6 +836,7 @@ class TestGetCurrentUtcTimestamp:
     def test_timestamp_format_matches_iso8601(self):
         """Test that timestamp matches ISO 8601 format with milliseconds and Z suffix."""
         import re
+
         from utils.validation import get_current_utc_timestamp

         timestamp = get_current_utc_timestamp()
@@ -874,7 +871,8 @@ def test_timestamp_includes_milliseconds(self):
     def test_timestamp_is_current(self):
         """Test that timestamp represents current time (within reasonable bounds)."""
-        from datetime import datetime, timezone, timedelta
+        from datetime import datetime, timedelta, timezone
+
         from utils.validation import get_current_utc_timestamp

         before = datetime.now(timezone.utc)
@@ -920,6 +918,7 @@ def test_timestamp_example_format(self):
     def test_multiple_calls_produce_valid_timestamps(self):
         """Test that multiple calls all produce valid timestamps."""
         import re
+
         from utils.validation import get_current_utc_timestamp

         pattern = r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$"
@@ -933,6 +932,7 @@ def test_multiple_calls_produce_valid_timestamps(self):
     def test_timestamp_parseable_by_datetime(self):
         """Test that timestamp can be parsed back to datetime object."""
         from datetime import datetime
+
         from utils.validation import get_current_utc_timestamp

         timestamp = get_current_utc_timestamp()
@@ -1037,17 +1037,8 @@ def test_valid_tracker_statuses(self):
         """Test that all valid tracker status values are accepted."""
         from utils.validation import validate_tracker_status

-        valid_statuses = [
-            "Reviewed",
-            "Resume Written",
-            "Applied",
-            "Interview",
-            "Offer",
-            "Rejected",
-            "Ghosted",
-        ]
-        for status in valid_statuses:
-            result = validate_tracker_status(status)
+        for status in JobTrackerStatus:
+            result = validate_tracker_status(status.value)
             assert result == status

     def test_invalid_tracker_status_value(self):
@@ -1151,13 +1142,8 @@ def test_tracker_status_error_includes_allowed_values(self):
         error = exc_info.value

         # Check that error message includes the allowed values
-        assert "Reviewed" in error.message
-        assert "Resume Written" in error.message
-        assert "Applied" in error.message
-        assert "Interview" in error.message
-        assert "Offer" in error.message
-        assert "Rejected" in error.message
-        assert "Ghosted" in error.message
+        for status in JobTrackerStatus:
+            assert status.value in error.message


 class TestValidateUpdateTrackerStatusParameters:
@@ -1301,18 +1287,9 @@ def test_all_statuses_accepted(self):
         """Test that all valid tracker statuses are accepted."""
         from utils.validation import validate_update_tracker_status_parameters

-        valid_statuses = [
-            "Reviewed",
-            "Resume Written",
-            "Applied",
-            "Interview",
-            "Offer",
-            "Rejected",
-            "Ghosted",
-        ]
-        for status in valid_statuses:
+        for status in JobTrackerStatus:
             tracker_path, target_status, dry_run, force = validate_update_tracker_status_parameters(
-                tracker_path="trackers/test.md", target_status=status
+                tracker_path="trackers/test.md", target_status=status.value
             )
             assert target_status == status
diff --git a/mcp-server-python/tests/test_validation_scrape_jobs.py b/mcp-server-python/tests/test_validation_scrape_jobs.py
index 6463fdd..c7cf720 100644
--- a/mcp-server-python/tests/test_validation_scrape_jobs.py
+++ b/mcp-server-python/tests/test_validation_scrape_jobs.py
@@ -8,43 +8,40 @@
 """

 import pytest
+from config import config
+from models.errors import ErrorCode, ToolError
+from models.status import JobDbStatus
 from utils.validation import (
-    validate_scrape_terms,
-    validate_results_wanted,
-    validate_hours_old,
-    validate_retry_count,
-    validate_retry_sleep_seconds,
-    validate_retry_backoff,
-    validate_scrape_status,
-    validate_scrape_jobs_parameters,
-    DEFAULT_SCRAPE_TERMS,
-    DEFAULT_RESULTS_WANTED,
-    MIN_RESULTS_WANTED,
-    MAX_RESULTS_WANTED,
-    DEFAULT_HOURS_OLD,
-    MIN_HOURS_OLD,
     MAX_HOURS_OLD,
-    DEFAULT_RETRY_COUNT,
-    MIN_RETRY_COUNT,
+    MAX_RESULTS_WANTED,
+    MAX_RETRY_BACKOFF,
     MAX_RETRY_COUNT,
-    DEFAULT_RETRY_SLEEP_SECONDS,
-    MIN_RETRY_SLEEP_SECONDS,
     MAX_RETRY_SLEEP_SECONDS,
-    DEFAULT_RETRY_BACKOFF,
+    MIN_HOURS_OLD,
+    MIN_RESULTS_WANTED,
     MIN_RETRY_BACKOFF,
-    MAX_RETRY_BACKOFF,
+    MIN_RETRY_COUNT,
+    MIN_RETRY_SLEEP_SECONDS,
+    validate_hours_old,
+    validate_results_wanted,
+    validate_retry_backoff,
+    validate_retry_count,
+    validate_retry_sleep_seconds,
+    validate_scrape_jobs_parameters,
+    validate_scrape_status,
+    validate_scrape_terms,
 )
-from models.errors import ToolError, ErrorCode


 class TestValidateScrapeTerms:
     """Tests for terms parameter validation."""

-    def test_default_terms_when_none(self):
-        """Test that None returns the default terms list."""
+    def test_default_terms_when_none(self, monkeypatch):
+        """Test that None returns the default terms list from config."""
+        test_terms = ["test-term-1", "test-term-2"]
+        monkeypatch.setattr(config, "scrape_terms", test_terms)
         result = validate_scrape_terms(None)
-        assert result == DEFAULT_SCRAPE_TERMS
-        assert result == ["ai engineer", "backend engineer", "machine learning"]
+        assert result == test_terms

     def test_valid_single_term(self):
         """Test that single term list is accepted."""
@@ -101,11 +98,11 @@ def test_terms_with_empty_string_element_raises_error(self):
 class TestValidateResultsWanted:
     """Tests for results_wanted parameter validation."""

-    def test_default_results_wanted_when_none(self):
-        """Test that None returns the default results_wanted."""
+    def test_default_results_wanted_when_none(self, monkeypatch):
+        """Test that None returns the default results_wanted from config."""
+        monkeypatch.setattr(config, "scrape_results_wanted", 25)
         result = validate_results_wanted(None)
-        assert result == DEFAULT_RESULTS_WANTED
-        assert result == 20
+        assert result == 25

     def test_valid_results_wanted_in_range(self):
         """Test that valid results_wanted within range are accepted."""
@@ -185,11 +182,11 @@ def test_results_wanted_invalid_type_boolean_raises_error(self):
 class TestValidateHoursOld:
     """Tests for hours_old parameter validation."""

-    def test_default_hours_old_when_none(self):
-        """Test that None returns the default hours_old."""
+    def test_default_hours_old_when_none(self, monkeypatch):
+        """Test that None returns the default hours_old from config."""
+        monkeypatch.setattr(config, "scrape_hours_old", 3)
         result = validate_hours_old(None)
-        assert result == DEFAULT_HOURS_OLD
-        assert result == 2
+        assert result == 3

     def test_valid_hours_old_in_range(self):
         """Test that valid hours_old within range are accepted."""
@@ -269,11 +266,11 @@ def test_hours_old_invalid_type_boolean_raises_error(self):
 class TestValidateRetryCount:
     """Tests for retry_count parameter validation."""

-    def test_default_retry_count_when_none(self):
-        """Test that None returns the default retry_count."""
+    def test_default_retry_count_when_none(self, monkeypatch):
+        """Test that None returns the default retry_count from config."""
+        monkeypatch.setattr(config, "scrape_retry_count", 5)
         result = validate_retry_count(None)
-        assert result == DEFAULT_RETRY_COUNT
-        assert result == 3
+        assert result == 5

     def test_valid_retry_count_in_range(self):
         """Test that valid retry_count within range are accepted."""
@@ -335,11 +332,11 @@ def test_retry_count_invalid_type_boolean_raises_error(self):
 class TestValidateRetrySleepSeconds:
     """Tests for retry_sleep_seconds parameter validation."""

-    def test_default_retry_sleep_seconds_when_none(self):
-        """Test that None returns the default retry_sleep_seconds."""
+    def test_default_retry_sleep_seconds_when_none(self, monkeypatch):
+        """Test that None returns the default retry_sleep_seconds from config."""
+        monkeypatch.setattr(config, "scrape_retry_sleep_seconds", 45.5)
         result = validate_retry_sleep_seconds(None)
-        assert result == DEFAULT_RETRY_SLEEP_SECONDS
-        assert result == 30
+        assert result == 45.5

     def test_valid_retry_sleep_seconds_in_range(self):
         """Test that valid retry_sleep_seconds within range are accepted."""
@@ -397,11 +394,11 @@ def test_retry_sleep_seconds_invalid_type_boolean_raises_error(self):
 class TestValidateRetryBackoff:
     """Tests for retry_backoff parameter validation."""

-    def test_default_retry_backoff_when_none(self):
-        """Test that None returns the default retry_backoff."""
+    def test_default_retry_backoff_when_none(self, monkeypatch):
+        """Test that None returns the default retry_backoff from config."""
+        monkeypatch.setattr(config, "scrape_retry_backoff", 2.5)
         result = validate_retry_backoff(None)
-        assert result == DEFAULT_RETRY_BACKOFF
-        assert result == 2
+        assert result == 2.5

     def test_valid_retry_backoff_in_range(self):
         """Test that valid retry_backoff within range are accepted."""
@@ -462,13 +459,12 @@ class TestValidateScrapeStatus:
     def test_default_status_when_none(self):
         """Test that None returns the default status 'new'."""
         result = validate_scrape_status(None)
-        assert result == "new"
+        assert result == JobDbStatus.NEW

     def test_valid_statuses(self):
         """Test that all valid status values are accepted."""
-        valid_statuses = ["new", "shortlist", "reviewed", "reject", "resume_written", "applied"]
-        for status in valid_statuses:
-            result = validate_scrape_status(status)
+        for status in JobDbStatus:
+            result = validate_scrape_status(status.value)
             assert result == status

     def test_invalid_status_value_raises_error(self):
@@ -522,24 +518,37 @@ def test_status_invalid_type_raises_error(self):
 class TestValidateScrapeJobsParameters:
     """Tests for validating all scrape_jobs parameters together."""

-    def test_all_defaults(self):
-        """Test validation with all default values."""
+    def test_all_defaults(self, monkeypatch):
+        """Test validation with all default values from config."""
+        monkeypatch.setattr(config, "scrape_terms", ["a", "b"])
+        monkeypatch.setattr(config, "scrape_location", "Test Location")
+        monkeypatch.setattr(config, "scrape_sites", ["site1"])
+        monkeypatch.setattr(config, "scrape_results_wanted", 99)
+        monkeypatch.setattr(config, "scrape_hours_old", 9)
+        monkeypatch.setattr(config, "scrape_require_description", False)
+        monkeypatch.setattr(config, "scrape_preflight_host", "test.host")
+        monkeypatch.setattr(config, "scrape_retry_count", 8)
+        monkeypatch.setattr(config, "scrape_retry_sleep_seconds", 7.0)
+        monkeypatch.setattr(config, "scrape_retry_backoff", 6.0)
+        monkeypatch.setattr(config, "scrape_save_capture_json", False)
+        monkeypatch.setattr(config, "scrape_capture_dir", "test/dir")
+
         result = validate_scrape_jobs_parameters()
-        assert result["terms"] == DEFAULT_SCRAPE_TERMS
-        assert result["location"] == "Ontario, Canada"
-        assert result["sites"] == ["linkedin"]
-        assert result["results_wanted"] == 20
-        assert result["hours_old"] == 2
+        assert result["terms"] == ["a", "b"]
+        assert result["location"] == "Test Location"
+        assert result["sites"] == ["site1"]
+        assert result["results_wanted"] == 99
+        assert result["hours_old"] == 9
         assert result["db_path"] is None
-        assert result["status"] == "new"
-        assert result["require_description"] is True
-        assert result["preflight_host"] == "www.linkedin.com"
-        assert result["retry_count"] == 3
-        assert result["retry_sleep_seconds"] == 30
-        assert result["retry_backoff"] == 2
-        assert result["save_capture_json"] is True
-        assert result["capture_dir"] == "data/capture"
+        assert result["status"] == JobDbStatus.NEW
+        assert result["require_description"] is False
+        assert result["preflight_host"] == "test.host"
+        assert result["retry_count"] == 8
+        assert result["retry_sleep_seconds"] == 7.0
+        assert result["retry_backoff"] == 6.0
+        assert result["save_capture_json"] is False
+        assert result["capture_dir"] == "test/dir"
         assert result["dry_run"] is False

     def test_all_valid_custom_values(self):
diff --git a/mcp-server-python/tools/bulk_read_new_jobs.py b/mcp-server-python/tools/bulk_read_new_jobs.py
index 06a768f..b924c50 100644
--- a/mcp-server-python/tools/bulk_read_new_jobs.py
+++ b/mcp-server-python/tools/bulk_read_new_jobs.py
@@ -6,16 +6,14 @@
 for jobs with status='new'.
 """

-from typing import Dict, Any
+from typing import Any, Dict

+from db.jobs_reader import get_connection, query_new_jobs
+from models.errors import ToolError, create_internal_error
 from pydantic import ValidationError
-
-from schemas.bulk_read_new_jobs import BulkReadNewJobsRequest, BulkReadNewJobsResponse
+from schemas.bulk_read_new_jobs import BulkReadNewJobsRequest, BulkReadNewJobsResponse, JobRecord
 from utils.cursor import decode_cursor
-from db.jobs_reader import get_connection, query_new_jobs
 from utils.pagination import paginate_results
-from models.job import to_job_schema
-from models.errors import ToolError, create_internal_error
 from utils.pydantic_error_mapper import map_pydantic_validation_error
@@ -79,9 +77,9 @@ def bulk_read_new_jobs(args: Dict[str, Any]) -> Dict[str, Any]:
     # Returns (page, has_more, next_cursor)
     page, has_more, next_cursor = paginate_results(rows, request.limit)

-    # Step 5: Map database rows to stable output schema
-    # Ensures only fixed fields are included, no arbitrary columns
-    jobs = [to_job_schema(row) for row in page]
+    # Step 5: Map database rows to stable output schema via Pydantic
+    # JobRecord ignores extra columns and normalises empty strings to None
+    jobs = [JobRecord.model_validate(row) for row in page]

     # Step 6: Build and return response
     return BulkReadNewJobsResponse(
diff --git a/mcp-server-python/tools/career_tailor.py b/mcp-server-python/tools/career_tailor.py
index c958079..9577981 100644
--- a/mcp-server-python/tools/career_tailor.py
+++ b/mcp-server-python/tools/career_tailor.py
@@ -18,6 +18,7 @@
 from pydantic import ValidationError

+from config import config
 from schemas.career_tailor import CareerTailorRequest, CareerTailorResponse
 from utils.pydantic_error_mapper import map_pydantic_validation_error
 from utils.validation import validate_career_tailor_batch_parameters
@@ -30,10 +31,10 @@

 # Default paths
-DEFAULT_FULL_RESUME_PATH = "data/templates/full_resume_example.md"
-DEFAULT_RESUME_TEMPLATE_PATH = "data/templates/resume_skeleton_example.tex"
-DEFAULT_APPLICATIONS_DIR = "data/applications"
-DEFAULT_PDFLATEX_CMD = "pdflatex"
+DEFAULT_FULL_RESUME_PATH = config.full_resume_path
+DEFAULT_RESUME_TEMPLATE_PATH = config.resume_template_path
+DEFAULT_APPLICATIONS_DIR = config.applications_dir
+DEFAULT_PDFLATEX_CMD = config.pdflatex_cmd


 def generate_run_id(prefix: str = "tailor") -> str:
diff --git a/mcp-server-python/tools/finalize_resume_batch.py b/mcp-server-python/tools/finalize_resume_batch.py
index c98b008..33d999b 100644
--- a/mcp-server-python/tools/finalize_resume_batch.py
+++ b/mcp-server-python/tools/finalize_resume_batch.py
@@ -10,26 +10,26 @@
 - Tracker status is a synchronized projection for Obsidian workflow.
 """

-from datetime import datetime, timezone
-from typing import Dict, Any, List, Optional
 import hashlib
 import re
-
-from pydantic import ValidationError
+from datetime import datetime, timezone
+from typing import Any, Dict, List, Optional

 from db.jobs_writer import JobsWriter
+from models.errors import ToolError, create_internal_error
+from models.status import JobTrackerStatus
+from pydantic import ValidationError
 from schemas.finalize_resume_batch import FinalizeResumeBatchRequest, FinalizeResumeBatchResponse
+from utils.artifact_paths import resolve_resume_tex_path
+from utils.finalize_validators import validate_resume_written_guardrails, validate_tracker_exists
+from utils.pydantic_error_mapper import map_pydantic_validation_error
+from utils.tracker_parser import resolve_resume_pdf_path_from_tracker
+from utils.tracker_sync import update_tracker_status
 from utils.validation import (
-    validate_finalize_resume_batch_parameters,
-    validate_finalize_item,
     get_current_utc_timestamp,
+    validate_finalize_item,
+    validate_finalize_resume_batch_parameters,
 )
-from utils.tracker_parser import resolve_resume_pdf_path_from_tracker
-from utils.artifact_paths import resolve_resume_tex_path
-from utils.finalize_validators import validate_tracker_exists, validate_resume_written_guardrails
-from utils.tracker_sync import update_tracker_status
-from models.errors import ToolError, create_internal_error
-from utils.pydantic_error_mapper import map_pydantic_validation_error


 def generate_run_id() -> str:
@@ -239,7 +239,7 @@ def process_item_finalize(

     # Step 2: Update tracker frontmatter status
     try:
-        update_tracker_status(tracker_path, "Resume Written")
+        update_tracker_status(tracker_path, JobTrackerStatus.RESUME_WRITTEN)

         # Success: both DB and tracker updated
         return {
diff --git a/mcp-server-python/tools/update_tracker_status.py b/mcp-server-python/tools/update_tracker_status.py
index 812b509..9cd480a 100644
--- a/mcp-server-python/tools/update_tracker_status.py
+++ b/mcp-server-python/tools/update_tracker_status.py
@@ -5,19 +5,19 @@
 to provide a safe tracker status update tool with Resume Written guardrails.
 """

-from typing import Dict, Any, Optional, List
+from typing import Any, Dict, List, Optional

+from models.errors import ToolError, create_internal_error, create_validation_error
+from models.status import JobTrackerStatus
 from pydantic import ValidationError
-
 from schemas.update_tracker_status import UpdateTrackerStatusRequest, UpdateTrackerStatusResponse
+from utils.artifact_paths import ArtifactPathError, resolve_artifact_paths
+from utils.finalize_validators import validate_resume_written_guardrails
 from utils.pydantic_error_mapper import map_pydantic_validation_error
-from utils.validation import validate_update_tracker_status_parameters
 from utils.tracker_parser import parse_tracker_with_error_mapping
 from utils.tracker_policy import validate_transition
-from utils.artifact_paths import resolve_artifact_paths, ArtifactPathError
-from utils.finalize_validators import validate_resume_written_guardrails
 from utils.tracker_sync import update_tracker_status as write_tracker_status
-from models.errors import ToolError, create_validation_error, create_internal_error
+from utils.validation import validate_update_tracker_status_parameters


 def _build_response(
@@ -206,7 +206,7 @@ def update_tracker_status(args: Dict[str, Any]) -> Dict[str, Any]:
     # Step 5: If target_status='Resume Written', perform artifact guardrails
     guardrail_check_passed = None
-    if target_status == "Resume Written":
+    if target_status == JobTrackerStatus.RESUME_WRITTEN:
         # Step 5a: Resolve artifact paths (Requirements 6.1-6.5)
         resume_path_raw = tracker_data["frontmatter"].get("resume_path")
diff --git a/mcp-server-python/utils/cursor.py b/mcp-server-python/utils/cursor.py
index d3f43f5..097bdaa 100644
--- a/mcp-server-python/utils/cursor.py
+++ b/mcp-server-python/utils/cursor.py
@@ -7,7 +7,23 @@
 import base64
 import json
 from typing import Optional, Tuple
+
 from models.errors import create_validation_error
+from pydantic import BaseModel, ConfigDict, ValidationError
+
+
+class CursorPayload(BaseModel):
+    """Pydantic model for cursor payload validation.
+
+    Uses strict mode so that e.g. ``"123"`` is rejected for ``id``
+    (must be a real int) and ``123`` is rejected for ``captured_at``
+    (must be a real str). Extra fields are forbidden.
+    """
+
+    model_config = ConfigDict(strict=True, extra="forbid")
+
+    captured_at: str
+    id: int


 def encode_cursor(captured_at: str, record_id: int) -> str:
@@ -31,6 +47,31 @@ def encode_cursor(captured_at: str, record_id: int) -> str:
     return encoded


+def _map_cursor_validation_error(error: ValidationError) -> Exception:
+    """Convert a Pydantic ``ValidationError`` into a ``ToolError``.
+
+    Preserves field names and type keywords (e.g. *string*, *integer*)
+    so that existing tests continue to pass.
+    """
+    issues = error.errors()
+    if not issues:
+        return create_validation_error("Invalid cursor format")
+
+    first = issues[0]
+    field = first.get("loc", ("",))[0]
+    error_type = first.get("type", "")
+    msg = first.get("msg", "invalid value")
+
+    # Pydantic says "Input should be a valid integer" / "… valid string"
+    # which already contains the keywords the tests assert on.
+    if error_type == "missing":
+        return create_validation_error(f"Invalid cursor format: missing '{field}' field")
+
+    # For type errors, include the Pydantic message directly so that
+    # keywords like "string" and "integer" are present.
+    return create_validation_error(f"Invalid cursor format: '{field}' {msg.lower()}")
+
+
 def decode_cursor(cursor: Optional[str]) -> Optional[Tuple[str, int]]:
     """
     Decode an opaque cursor string into pagination state.
@@ -55,28 +96,16 @@ def decode_cursor(cursor: Optional[str]) -> Optional[Tuple[str, int]]:
         # Parse JSON
         payload = json.loads(json_str)

-        # Validate structure
+        # Structural check — must be a JSON object (dict), not array/scalar
         if not isinstance(payload, dict):
             raise create_validation_error("Invalid cursor format: payload must be a JSON object")

-        if "captured_at" not in payload:
-            raise create_validation_error("Invalid cursor format: missing 'captured_at' field")
-
-        if "id" not in payload:
-            raise create_validation_error("Invalid cursor format: missing 'id' field")
-
-        captured_at = payload["captured_at"]
-        record_id = payload["id"]
-
-        # Validate types
-        if not isinstance(captured_at, str):
-            raise create_validation_error("Invalid cursor format: 'captured_at' must be a string")
-
-        if not isinstance(record_id, int):
-            raise create_validation_error("Invalid cursor format: 'id' must be an integer")
-
-        return (captured_at, record_id)
+        # Validate fields via Pydantic (strict types, no extras)
+        cursor_data = CursorPayload.model_validate(payload)
+        return (cursor_data.captured_at, cursor_data.id)
+    except ValidationError as e:
+        raise _map_cursor_validation_error(e) from e
     except json.JSONDecodeError as e:
         raise create_validation_error(f"Invalid cursor format: malformed JSON - {str(e)}") from e
     except UnicodeDecodeError as e:
diff --git a/mcp-server-python/utils/file_ops.py b/mcp-server-python/utils/file_ops.py
index e1b7534..7ca6695 100644
--- a/mcp-server-python/utils/file_ops.py
+++ b/mcp-server-python/utils/file_ops.py
@@ -192,7 +192,7 @@ def resolve_write_action(file_exists: bool, force: bool) -> str:


 def materialize_resume_tex(
-    template_path: str = "data/templates/resume_skeleton_example.tex",
+    template_path: str = "data/templates/resume_skeleton.tex",
     target_path: Union[str, Path] = None,
     force: bool = False,
 ) -> str:
@@ -224,7 +224,7 @@ def materialize_resume_tex(
     Examples:
         >>> # Create new resume.tex
         >>> action = materialize_resume_tex(
-        ...     template_path="data/templates/resume_skeleton_example.tex",
+        ...     template_path="data/templates/resume_skeleton.tex",
         ...     target_path="data/applications/amazon-3629/resume/resume.tex",
         ...     force=False
         ... )
diff --git a/mcp-server-python/utils/latex_guardrails.py b/mcp-server-python/utils/latex_guardrails.py
index 2a49710..0c27a81 100644
--- a/mcp-server-python/utils/latex_guardrails.py
+++ b/mcp-server-python/utils/latex_guardrails.py
@@ -1,20 +1,18 @@
 """
 LaTeX placeholder scanner for resume quality guardrails.

-This module provides utilities to scan LaTeX files for placeholder tokens
-that indicate incomplete or draft content. Used by multiple tools to ensure
-resume quality before finalization.
+This module detects unfinished placeholder content in resume.tex files.
+Per current project convention, placeholder labels are identified by the
+stable marker substring: ``BULLET-POINT``.
 """

 from pathlib import Path
 from typing import List, Tuple, Optional

-# Placeholder tokens that must not be present in finalized resume.tex
+# Single stable placeholder marker agreed by project convention.
 PLACEHOLDER_TOKENS = [
-    "PROJECT-AI-",
-    "PROJECT-BE-",
-    "WORK-BULLET-POINT-",
+    "BULLET-POINT",
 ]
@@ -37,7 +35,7 @@ def scan_tex_for_placeholders(tex_path: str) -> Tuple[bool, Optional[str], List[
     Requirements:
     - 5.4: Scan resume.tex for placeholder tokens
     - 5.5: Block update when any guardrail check fails
-    - 5.6: Placeholder tokens include PROJECT-AI-, PROJECT-BE-, WORK-BULLET-POINT-
+    - 5.6: Placeholder tokens include BULLET-POINT

     Examples:
         >>> # Clean TEX file
@@ -46,8 +44,8 @@ def scan_tex_for_placeholders(tex_path: str) -> Tuple[bool, Optional[str], List[
         >>> # TEX with placeholders
         >>> scan_tex_for_placeholders("data/applications/draft/resume/resume.tex")
-        (False, 'resume.tex contains placeholder tokens: PROJECT-AI-, WORK-BULLET-POINT-',
-         ['PROJECT-AI-', 'WORK-BULLET-POINT-'])
+        (False, 'resume.tex contains placeholder tokens: BULLET-POINT',
+         ['BULLET-POINT'])
     """
     tex_file = Path(tex_path)
@@ -57,7 +55,7 @@ def scan_tex_for_placeholders(tex_path: str) -> Tuple[bool, Optional[str], List[
     except (OSError, IOError) as e:
         return False, f"Failed to read resume.tex: {str(e)}", []

-    # Search for placeholder tokens
+    # Search for placeholder marker.
     found_tokens = []
     for token in PLACEHOLDER_TOKENS:
         if token in content:
@@ -81,7 +79,7 @@ def get_placeholder_tokens() -> List[str]:
     Examples:
         >>> tokens = get_placeholder_tokens()
-        >>> "PROJECT-AI-" in tokens
+        >>> "BULLET-POINT" in tokens
         True
     """
     return PLACEHOLDER_TOKENS.copy()
diff --git a/mcp-server-python/utils/path_resolution.py b/mcp-server-python/utils/path_resolution.py
index 7fb7ef0..3c362b4 100644
--- a/mcp-server-python/utils/path_resolution.py
+++ b/mcp-server-python/utils/path_resolution.py
@@ -11,6 +11,10 @@
 from pathlib import Path
 from typing import Union

+from config import config
+
+DEFAULT_DB_RELATIVE_PATH = Path("data/capture/jobs.db")
+

 def get_repo_root() -> Path:
     """
@@ -45,5 +49,29 @@ def resolve_trackers_dir(trackers_dir: str | None) -> Path:
     Default directory is <repo_root>/trackers.
     """
     if trackers_dir is None:
-        return get_repo_root() / "trackers"
+        return resolve_repo_relative_path(config.trackers_dir)
     return resolve_repo_relative_path(trackers_dir)
+
+
+def resolve_db_path(db_path: str | None = None) -> Path:
+    """
+    Resolve database path with consistent precedence across DB tools.
+
+    Resolution order:
+    1. Explicit `db_path` argument
+    2. `JOBWORKFLOW_DB`
+    3. `JOBWORKFLOW_ROOT/data/capture/jobs.db`
+    4. `<repo_root>/data/capture/jobs.db`
+    """
+    if db_path is not None:
+        return resolve_repo_relative_path(db_path)
+
+    db_env = os.getenv("JOBWORKFLOW_DB")
+    if db_env:
+        return resolve_repo_relative_path(db_env)
+
+    root_env = os.getenv("JOBWORKFLOW_ROOT")
+    if root_env:
+        return Path(root_env).expanduser().resolve() / DEFAULT_DB_RELATIVE_PATH
+
+    return resolve_repo_relative_path(DEFAULT_DB_RELATIVE_PATH)
diff --git a/mcp-server-python/utils/tracker_policy.py b/mcp-server-python/utils/tracker_policy.py
index f20eb4c..69a94e9 100644
--- a/mcp-server-python/utils/tracker_policy.py
+++ b/mcp-server-python/utils/tracker_policy.py
@@ -8,15 +8,19 @@
 - Force bypass with warning for policy violations
 """

-from typing import Dict, Any, Optional, List
-from models.errors import create_validation_error
+from typing import Any, Dict, List, Optional
+from models.errors import create_validation_error
+from models.status import JobTrackerStatus

 # Terminal statuses that can be reached from any current status
-TERMINAL_STATUSES = {"Rejected", "Ghosted"}
+TERMINAL_STATUSES = {JobTrackerStatus.REJECTED, JobTrackerStatus.GHOSTED}

 # Core forward transitions: current_status -> allowed_next_status
-CORE_TRANSITIONS = {"Reviewed": "Resume Written", "Resume Written": "Applied"}
+CORE_TRANSITIONS = {
+    JobTrackerStatus.REVIEWED: JobTrackerStatus.RESUME_WRITTEN,
+    JobTrackerStatus.RESUME_WRITTEN: JobTrackerStatus.APPLIED,
+}


 class TransitionResult:
@@ -132,10 +136,17 @@ def validate_transition(
     if target_status in TERMINAL_STATUSES:
         return TransitionResult(allowed=True, is_noop=False)

+    # Helper to get the plain string value from an Enum member or a raw string.
+    def _status_str(s):
+        return s.value if hasattr(s, "value") else s
+
+    current_str = _status_str(current_status)
+    target_str = _status_str(target_status)
+
     # Rule 4: Policy violation handling (Requirements 4.4, 4.5)
     error_msg = (
-        f"Transition from '{current_status}' to '{target_status}' "
-        f"violates policy. Allowed transitions from '{current_status}': "
+        f"Transition from '{current_str}' to '{target_str}' "
+        f"violates policy. Allowed transitions from '{current_str}': "
     )

     # Build list of allowed transitions for this status
@@ -144,12 +155,14 @@ def validate_transition(
         allowed_transitions.append(CORE_TRANSITIONS[current_status])
     allowed_transitions.extend(TERMINAL_STATUSES)

-    error_msg += ", ".join(f"'{s}'" for s in sorted(allowed_transitions))
+    error_msg += ", ".join(
+        f"'{_status_str(s)}'" for s in sorted(allowed_transitions, key=_status_str)
+    )

     if force:
         # Allow with warning (Requirement 4.5)
         warning = (
-            f"Force bypass: Transition from '{current_status}' to '{target_status}' "
+            f"Force bypass: Transition from '{current_str}' to '{target_str}' "
             f"violates policy but was allowed due to force=true"
        )
         return TransitionResult(allowed=True, is_noop=False, warnings=[warning])
diff --git a/mcp-server-python/utils/tracker_renderer.py b/mcp-server-python/utils/tracker_renderer.py
index 0c73bb6..b91c91c 100644
--- a/mcp-server-python/utils/tracker_renderer.py
+++ b/mcp-server-python/utils/tracker_renderer.py
@@ -5,8 +5,10 @@
 with required frontmatter fields and section structure.
""" -from typing import Dict, Any +from typing import Any, Dict + import yaml +from models.status import JobTrackerStatus def render_tracker_markdown(job: Dict[str, Any], plan: Dict[str, Any]) -> str: @@ -76,7 +78,7 @@ def render_tracker_markdown(job: Dict[str, Any], plan: Dict[str, Any]) -> str: "job_id": job["job_id"], "company": job["company"], "position": job["title"], - "status": "Reviewed", # Initial tracker status + "status": JobTrackerStatus.REVIEWED.value, # Initial tracker status "application_date": application_date, "reference_link": job["url"], "resume_path": plan["resume_path"], diff --git a/mcp-server-python/utils/tracker_sync.py b/mcp-server-python/utils/tracker_sync.py index 7ef2548..09942a4 100644 --- a/mcp-server-python/utils/tracker_sync.py +++ b/mcp-server-python/utils/tracker_sync.py @@ -5,12 +5,12 @@ while preserving all other frontmatter fields and body content. """ -from pathlib import Path -from typing import Dict, Any import os import tempfile -import yaml +from pathlib import Path +from typing import Any, Dict +import yaml from utils.path_resolution import resolve_repo_relative_path @@ -60,7 +60,9 @@ def update_tracker_status(tracker_path: str, new_status: str) -> None: frontmatter, body = _extract_frontmatter_and_body(content) # Update only the status field (Requirement 7.1, 7.4) - frontmatter["status"] = new_status + # Convert Enum members to plain strings so PyYAML serializes them + # as scalars rather than tagged Python objects. 
+ frontmatter["status"] = new_status.value if hasattr(new_status, "value") else new_status # Render updated tracker content updated_content = _render_tracker_content(frontmatter, body) diff --git a/mcp-server-python/utils/validation.py b/mcp-server-python/utils/validation.py index 9ee39d3..375e0b2 100644 --- a/mcp-server-python/utils/validation.py +++ b/mcp-server-python/utils/validation.py @@ -6,11 +6,13 @@ from datetime import datetime, timezone from typing import Optional, Tuple -from models.errors import create_validation_error +from config import config +from models.errors import create_validation_error +from models.status import JobDbStatus, JobTrackerStatus # Constants for validation -DEFAULT_LIMIT = 50 +DEFAULT_LIMIT = config.bulk_read_limit MIN_LIMIT = 1 MAX_LIMIT = 1000 @@ -19,20 +21,6 @@ INITIALIZE_MIN_LIMIT = 1 INITIALIZE_MAX_LIMIT = 200 -# Allowed status values for job status updates -ALLOWED_STATUSES = {"new", "shortlist", "reviewed", "reject", "resume_written", "applied"} - -# Allowed tracker status values for update_tracker_status tool -ALLOWED_TRACKER_STATUSES = { - "Reviewed", - "Resume Written", - "Applied", - "Interview", - "Offer", - "Rejected", - "Ghosted", -} - def validate_limit(limit: Optional[int]) -> int: """ @@ -194,9 +182,12 @@ def validate_status(status) -> str: ) # Check against allowed statuses (case-sensitive) - if status not in ALLOWED_STATUSES: + try: + JobDbStatus(status) + except ValueError: + allowed = ", ".join(sorted(s.value for s in JobDbStatus)) raise create_validation_error( - f"Invalid status value: '{status}'. Allowed values are: {', '.join(sorted(ALLOWED_STATUSES))}" + f"Invalid status value: '{status}'. 
Allowed values are: {allowed}" ) return status @@ -569,8 +560,12 @@ def validate_tracker_status(target_status) -> str: ) # Check against allowed tracker statuses (case-sensitive, Requirement 3.3) - if target_status not in ALLOWED_TRACKER_STATUSES: - allowed_list = ", ".join(f"'{s}'" for s in sorted(ALLOWED_TRACKER_STATUSES)) + try: + JobTrackerStatus(target_status) + except ValueError: + allowed_list = ", ".join( + f"'{s.value}'" for s in sorted(JobTrackerStatus, key=lambda s: s.value) + ) raise create_validation_error( f"Invalid target_status value: '{target_status}'. Allowed values are: {allowed_list}" ) @@ -853,28 +848,16 @@ def validate_finalize_resume_batch_parameters( # ============================================================================ # Constants for scrape_jobs validation -DEFAULT_SCRAPE_TERMS = ["ai engineer", "backend engineer", "machine learning"] -DEFAULT_SCRAPE_LOCATION = "Ontario, Canada" -DEFAULT_SCRAPE_SITES = ["linkedin"] -DEFAULT_RESULTS_WANTED = 20 MIN_RESULTS_WANTED = 1 MAX_RESULTS_WANTED = 200 -DEFAULT_HOURS_OLD = 2 MIN_HOURS_OLD = 1 MAX_HOURS_OLD = 168 -DEFAULT_REQUIRE_DESCRIPTION = True -DEFAULT_PREFLIGHT_HOST = "www.linkedin.com" -DEFAULT_RETRY_COUNT = 3 MIN_RETRY_COUNT = 1 MAX_RETRY_COUNT = 10 -DEFAULT_RETRY_SLEEP_SECONDS = 30 MIN_RETRY_SLEEP_SECONDS = 0 MAX_RETRY_SLEEP_SECONDS = 300 -DEFAULT_RETRY_BACKOFF = 2 MIN_RETRY_BACKOFF = 1 MAX_RETRY_BACKOFF = 10 -DEFAULT_SAVE_CAPTURE_JSON = True -DEFAULT_CAPTURE_DIR = "data/capture" def validate_scrape_terms(terms: Optional[list]) -> list: @@ -885,16 +868,16 @@ def validate_scrape_terms(terms: Optional[list]) -> list: terms: The search terms list (None for default) Returns: - Validated terms list (default: ["ai engineer", "backend engineer", "machine learning"]) + Validated terms list Raises: ToolError: If terms is invalid Requirements: 1.1, 1.2, 1.3 """ - # Use default if not provided + # Use default from config if not provided if terms is None: - return DEFAULT_SCRAPE_TERMS + return 
config.scrape_terms # Check type if not isinstance(terms, list): @@ -926,16 +909,16 @@ def validate_scrape_location(location: Optional[str]) -> str: location: The search location (None for default) Returns: - Validated location string (default: "Ontario, Canada") + Validated location string Raises: ToolError: If location is invalid Requirements: 1.1, 1.2 """ - # Use default if not provided + # Use default from config if not provided if location is None: - return DEFAULT_SCRAPE_LOCATION + return config.scrape_location # Check type if not isinstance(location, str): @@ -958,16 +941,16 @@ def validate_scrape_sites(sites: Optional[list]) -> list: sites: The source sites list (None for default) Returns: - Validated sites list (default: ["linkedin"]) + Validated sites list Raises: ToolError: If sites is invalid Requirements: 1.1, 1.2, 3.4 """ - # Use default if not provided + # Use default from config if not provided if sites is None: - return DEFAULT_SCRAPE_SITES + return config.scrape_sites # Check type if not isinstance(sites, list): @@ -999,16 +982,16 @@ def validate_results_wanted(results_wanted: Optional[int]) -> int: results_wanted: The requested scrape results per term (None for default) Returns: - Validated results_wanted value (default: 20, range: 1-200) + Validated results_wanted value Raises: ToolError: If results_wanted is invalid Requirements: 1.1, 1.4, 12.2 """ - # Use default if not provided + # Use default from config if not provided if results_wanted is None: - return DEFAULT_RESULTS_WANTED + return config.scrape_results_wanted # Check type (bool is a subclass of int in Python, reject explicitly) if isinstance(results_wanted, bool) or not isinstance(results_wanted, int): @@ -1038,16 +1021,16 @@ def validate_hours_old(hours_old: Optional[int]) -> int: hours_old: The recency window in hours (None for default) Returns: - Validated hours_old value (default: 2, range: 1-168) + Validated hours_old value Raises: ToolError: If hours_old is invalid Requirements: 
1.1, 1.4, 12.2 """ - # Use default if not provided + # Use default from config if not provided if hours_old is None: - return DEFAULT_HOURS_OLD + return config.scrape_hours_old # Check type (bool is a subclass of int in Python, reject explicitly) if isinstance(hours_old, bool) or not isinstance(hours_old, int): @@ -1077,16 +1060,16 @@ def validate_require_description(require_description: Optional[bool]) -> bool: require_description: Whether to skip records without descriptions (None for default) Returns: - Validated require_description value (default: True) + Validated require_description value Raises: ToolError: If require_description is invalid Requirements: 1.1, 5.2 """ - # Use default if not provided + # Use default from config if not provided if require_description is None: - return DEFAULT_REQUIRE_DESCRIPTION + return config.scrape_require_description # Check type if not isinstance(require_description, bool): @@ -1105,16 +1088,16 @@ def validate_preflight_host(preflight_host: Optional[str]) -> str: preflight_host: The DNS preflight host (None for default) Returns: - Validated preflight_host string (default: "www.linkedin.com") + Validated preflight_host string Raises: ToolError: If preflight_host is invalid Requirements: 2.1, 12.2 """ - # Use default if not provided + # Use default from config if not provided if preflight_host is None: - return DEFAULT_PREFLIGHT_HOST + return config.scrape_preflight_host # Check type if not isinstance(preflight_host, str): @@ -1137,16 +1120,16 @@ def validate_retry_count(retry_count: Optional[int]) -> int: retry_count: The preflight retry count (None for default) Returns: - Validated retry_count value (default: 3, range: 1-10) + Validated retry_count value Raises: ToolError: If retry_count is invalid Requirements: 2.2, 12.2 """ - # Use default if not provided + # Use default from config if not provided if retry_count is None: - return DEFAULT_RETRY_COUNT + return config.scrape_retry_count # Check type (bool is a subclass of 
int in Python, reject explicitly) if isinstance(retry_count, bool) or not isinstance(retry_count, int): @@ -1176,16 +1159,16 @@ def validate_retry_sleep_seconds(retry_sleep_seconds: Optional[float]) -> float: retry_sleep_seconds: The base retry sleep seconds (None for default) Returns: - Validated retry_sleep_seconds value (default: 30, range: 0-300) + Validated retry_sleep_seconds value Raises: ToolError: If retry_sleep_seconds is invalid Requirements: 2.2, 12.2 """ - # Use default if not provided + # Use default from config if not provided if retry_sleep_seconds is None: - return DEFAULT_RETRY_SLEEP_SECONDS + return config.scrape_retry_sleep_seconds # Check type (bool is a subclass of int in Python, reject explicitly) # Accept both int and float @@ -1216,16 +1199,16 @@ def validate_retry_backoff(retry_backoff: Optional[float]) -> float: retry_backoff: The retry backoff multiplier (None for default) Returns: - Validated retry_backoff value (default: 2, range: 1-10) + Validated retry_backoff value Raises: ToolError: If retry_backoff is invalid Requirements: 2.2, 12.2 """ - # Use default if not provided + # Use default from config if not provided if retry_backoff is None: - return DEFAULT_RETRY_BACKOFF + return config.scrape_retry_backoff # Check type (bool is a subclass of int in Python, reject explicitly) # Accept both int and float @@ -1256,16 +1239,16 @@ def validate_save_capture_json(save_capture_json: Optional[bool]) -> bool: save_capture_json: Whether to persist per-term raw JSON capture files (None for default) Returns: - Validated save_capture_json value (default: True) + Validated save_capture_json value Raises: ToolError: If save_capture_json is invalid Requirements: 9.1, 9.2 """ - # Use default if not provided + # Use default from config if not provided if save_capture_json is None: - return DEFAULT_SAVE_CAPTURE_JSON + return config.scrape_save_capture_json # Check type if not isinstance(save_capture_json, bool): @@ -1284,16 +1267,16 @@ def 
validate_capture_dir(capture_dir: Optional[str]) -> str: capture_dir: The capture output directory (None for default) Returns: - Validated capture_dir string (default: "data/capture") + Validated capture_dir string Raises: ToolError: If capture_dir is invalid Requirements: 9.1 """ - # Use default if not provided + # Use default from config if not provided if capture_dir is None: - return DEFAULT_CAPTURE_DIR + return config.scrape_capture_dir # Check type if not isinstance(capture_dir, str): @@ -1325,7 +1308,7 @@ def validate_scrape_status(status: Optional[str]) -> str: """ # Use default if not provided if status is None: - return "new" + return JobDbStatus.NEW.value # Check type if not isinstance(status, str): @@ -1344,9 +1327,12 @@ ) # Check against allowed statuses (case-sensitive) - if status not in ALLOWED_STATUSES: + try: + JobDbStatus(status) + except ValueError: + allowed = ", ".join(sorted(s.value for s in JobDbStatus)) raise create_validation_error( - f"Invalid status value: '{status}'. Allowed values are: {', '.join(sorted(ALLOWED_STATUSES))}" + f"Invalid status value: '{status}'. 
Allowed values are: {allowed}" ) return status diff --git a/pyproject.toml b/pyproject.toml index 3e737cd..f72dd65 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -20,6 +20,7 @@ dependencies = [ "requests>=2.32.5", "beautifulsoup4>=4.14.3", "markdownify>=0.13.1", + "python-dotenv>=1.0.0", ] [project.optional-dependencies] @@ -54,3 +55,10 @@ packages = ["mcp-server-python"] [build-system] requires = ["hatchling"] build-backend = "hatchling.build" + +[tool.pyrefly] +project-includes = ["mcp-server-python/**/*.py*"] +project-excludes = ["**/__pycache__/**", ".git/**", ".uv/**", "**/*venv/**", "**/.[!/.]*/**"] +search-path = ["mcp-server-python"] +python-interpreter-path = ".venv/bin/python" + diff --git a/skills/.markdownlint.json b/skills/.markdownlint.json new file mode 100644 index 0000000..19b74e6 --- /dev/null +++ b/skills/.markdownlint.json @@ -0,0 +1,5 @@ +{ + "MD013": false, + "MD036": false, + "MD060": false +} diff --git a/skills/career-tailor-finalize/SKILL.md b/skills/career-tailor-finalize/SKILL.md deleted file mode 100644 index a354a9c..0000000 --- a/skills/career-tailor-finalize/SKILL.md +++ /dev/null @@ -1,109 +0,0 @@ ---- -name: career-tailor-finalize -description: "Use when generating tailored resume artifacts from shortlist trackers and committing completion safely: run career_tailor, validate outputs, and finalize status sync." ---- - -# Skill: Career Tailor Finalize - -## Goal -Run the artifact and commit half of the pipeline: -1. Build per-job resume artifacts from trackers -2. Enforce resume quality/validity guardrails -3. Finalize DB + tracker sync only for successful items - -## MCP Tools In Scope -- `career_tailor` -- `finalize_resume_batch` -- `update_tracker_status` (fallback/manual correction only) - -Do not run ingestion or triage tools in this skill. - -## Inputs -- Shortlist tracker paths (and `job_db_id` when available) -- Optional path overrides for templates/applications root - -## Workflow -1. 
Build one `items[]` batch from shortlisted trackers. -2. Run `career_tailor` bootstrap pass to materialize per-item workspace files (`resume.tex`, `ai_context.md`, `resume.pdf` attempt). -3. Run an LLM fill pass on the materialized `resume.tex` files to replace placeholder bullets. -4. Run `career_tailor` again for compile/validation on edited files. -5. Use only second-pass `career_tailor.successful_items` for `finalize_resume_batch`. -6. Keep failed items in `shortlist`/`reviewed` with explicit reasons. -7. Use `update_tracker_status` only for targeted repair actions. - -## LLM Fill Pass (Required Between Two `career_tailor` Passes) -- Target files: `data/applications//resume/resume.tex` -- Context files: `data/templates/full_resume.md` and each tracker's Job Description -- Edit scope: bullet text only (Project Experience + Work Experience) -- Must replace all placeholders like `WORK-BULLET-POINT-*`, `PROJECT-AI-*`, `PROJECT-BE-*` -- Never change macros/sections/header/education/skills - -Use this execution prompt for each resume file: - -```text -You are filling LaTeX resume bullets. -Inputs: -- full resume facts: data/templates/full_resume.md -- target job description: from tracker markdown -- target tex file: data/applications//resume/resume.tex - -Rules: -1) Replace placeholder bullet tokens only with truthful content grounded in the full resume. -2) Keep every bullet in \\resumeItem{...} format. -3) Do not add new sections or macros. -4) Prefer impact + metric + stack phrasing. -5) No fabrication. - -Output: -- Apply direct edits to resume.tex. -- Ensure zero remaining placeholder tokens matching: - WORK-BULLET-POINT-|PROJECT-AI-|PROJECT-BE- -``` - -Preflight check before second `career_tailor` pass: - -```bash -files="$(find data/applications -type f -path '*/resume/resume.tex')" -if [ -z "$files" ]; then - echo "No resume.tex files yet. Run first-pass career_tailor bootstrap first." 
-else - echo "$files" | xargs rg -n "WORK-BULLET-POINT-|PROJECT-AI-|PROJECT-BE-" -fi -``` - -Expected: -- On first-time runs before bootstrap: informational message above. -- After bootstrap: no matches. - -## Tailoring Rules (Spirit Preserved) -- Source of truth content: `data/templates/full_resume.md` + tracker JD. -- Keep LaTeX structure intact; edit bullet text only. -- Focus edits on Project Experience and Work Experience. -- Prefer impact + metric + stack; no fabrication. -- Keep one-page resume target where possible. - -## LaTeX Safety -- Keep `\\resumeItem{...}` macro shape unchanged. -- Escape special characters when needed: `\\ % & _ # $ ~ ^ { }`. -- Use math mode for complexity notation (example: `$O(n^2)$`). - -## Guardrails (Hard) -- Never mark Resume Written/resume_written without valid artifacts. -- If `resume.pdf` missing/zero-byte or placeholders remain in `resume.tex`, do not finalize. -- Continue batch on item-level failures and return structured errors. - -## Manual Build Fallback (from legacy tex-build) -Only when explicit manual compile is needed: -```bash -latexmk -pdf -output-directory=data/templates data/templates/resume.tex -latexmk -c -output-directory=data/templates data/templates/resume.tex -``` -Use fallback as diagnostics; primary path remains `career_tailor`. 
- -## Required Output Shape -- `run_id` -- `tailor_totals`: total/success/failed -- `finalize_totals`: success/failed -- `failed_items` with concrete reason -- `errors_by_step` -- `next_actions` diff --git a/skills/career-tailor-finalize/agents/openai.yaml b/skills/career-tailor-finalize/agents/openai.yaml deleted file mode 100644 index e3564cb..0000000 --- a/skills/career-tailor-finalize/agents/openai.yaml +++ /dev/null @@ -1,4 +0,0 @@ -interface: - display_name: "Career Tailor Finalize" - short_description: "Generate resume artifacts and finalize status sync" - default_prompt: "Use $career-tailor-finalize to process shortlist trackers, generate resume artifacts, and finalize only successful items." diff --git a/skills/career-tailor-finalize/scripts/build_resume.sh b/skills/career-tailor-finalize/scripts/build_resume.sh deleted file mode 100755 index 20438a6..0000000 --- a/skills/career-tailor-finalize/scripts/build_resume.sh +++ /dev/null @@ -1,35 +0,0 @@ -#!/bin/bash -# scripts/build_resume.sh -# Usage: ./build_resume.sh - -COMPANY_SLUG=$1 -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -BASE_DIR="${JOBWORKFLOW_ROOT:-$(cd "$SCRIPT_DIR/../../.." && pwd)}" -TEMPLATE_PATH="$BASE_DIR/data/templates/resume_skeleton.tex" -TARGET_DIR="$BASE_DIR/data/applications/$COMPANY_SLUG/resume" -TARGET_TEX="$TARGET_DIR/resume.tex" -PDFLATEX="/Library/TeX/texbin/pdflatex" - -if [ -z "$COMPANY_SLUG" ]; then - echo "Wheek! Please provide a company slug (e.g. trajekt_sports)." - exit 1 -fi - -mkdir -p "$TARGET_DIR" -cp "$TEMPLATE_PATH" "$TARGET_TEX" - -echo "Wheek! Template copied to $TARGET_TEX. Ready for AI content filling." - -# Note: AI agent should use sed/edit to replace placeholders before calling compile. - -compile() { - cd "$TARGET_DIR" || exit 1 - "$PDFLATEX" -interaction=nonstopmode resume.tex - # Cleanup - rm -f resume.aux resume.log resume.out resume.synctex.gz -} - -if [[ "$2" == "--compile" ]]; then - compile - echo "Wheek! Compilation finished. 
Check $TARGET_DIR/resume.pdf" -fi diff --git a/skills/job-matching-expertise/SKILL.md b/skills/job-matching-expertise/SKILL.md new file mode 100644 index 0000000..7cb46a1 --- /dev/null +++ b/skills/job-matching-expertise/SKILL.md @@ -0,0 +1,149 @@ +--- +name: job-matching-expertise +description: "Domain expertise for evaluating job-candidate fit for Backend, ML, and AI Engineering roles. Provides rubrics, quality standards, and decision frameworks—no workflow orchestration." +--- + +# Skill: Job Matching Expertise + +## Purpose + +This skill encodes domain knowledge for assessing whether a job posting is a good match for a candidate focused on **Backend Engineering**, **Machine Learning**, and **AI Engineering** roles. It provides evaluation criteria, not execution steps. + +--- + +## Target Role Profile + +> **Source**: [`data/templates/full_resume.md`](../../data/templates/full_resume.md) + +**Summary**: Backend + ML/AI Engineering; Python-first; experienced with LLM integration, RAG pipelines, and production ML systems. + +--- + +## Job Quality Signals + +These signals are **secondary**. Use them only to decide between **Reviewed** and **Reject** when the JD lacks clear must-have requirements or is too vague to assess. They must **not** override the hard-gate logic in `HR Initial Screen Standard`. 
+ +**Strong signals ✅** + +- Specific responsibilities + specific stack/tools (not generic “exciting projects”) +- Clear problem space (what is being built, for whom, why it matters) +- Explicit “required/must-have” list (makes screening possible) +- Engineering fundamentals mentioned (testing, code review, docs, observability) +- Reasonable scope (not one person owning 15 unrelated domains) + +**Weak signals ⚠️** + +- Vague JD; no stack; no concrete deliverables +- No explicit must-have requirements (hard to screen; bias toward Reviewed) +- Skill laundry list / buzzword bingo without context +- Copy-paste corporate template; no team/product specifics + +--- + +## HR Initial Screen Standard + +This standard answers one question: **would an HR/recruiter likely filter this candidate out at initial screen, without assuming any upskilling?** + +### Policy Overrides (immediate ❌) + +- **Company blacklist**: Jerry, Alignerr, TATA +- Intern/internship roles + +### Hard Reject Rules (immediate ❌) + +Reject if **any** of the following is true. 
+ +1) **Role function mismatch** + - Frontend-only (React/Vue/Angular/Next.js; UI work dominates) + - Mobile-only (iOS/Android/React Native) + - Pure DevOps/SRE where infra/oncall dominates and there is no clear backend/app ownership + - BI/Analyst-heavy “data” roles (dashboards/reporting/stakeholder insights as primary output) + +2) **Core stack mismatch (required-by-JD)** + - JD is strongly bound to Java/Spring, .NET, C++ as the core (Python/ML is absent or only “nice-to-have”) + - JD lists must-have technologies or domains that are not credibly evidenced in the resume + +3) **Seniority / credential hard mismatch** + - Staff/Principal/Lead role with clear expectations beyond current level (e.g., 10–12+ years + org-wide architecture + multi-team leadership) + - PhD **required** (not preferred) and the candidate does not have it + +4) **Eligibility constraints** + - Work authorization, location/onsite, security clearance, or language requirements explicitly marked as required and not met + +5) **Duplicate** + - Same company + title already processed in the current run + +### Evidence Standard (how to interpret “must-have”) + +- Treat JD items labeled **must/required** as hard gates. +- A requirement is “met” only if the resume contains **readable evidence** (role/project ownership + relevant responsibilities + outcomes/scale, where possible). +- Count a requirement as met only if the resume shows at least **two** of: (a) ownership/responsibility, (b) concrete project context, (c) production use, (d) measurable outcomes/scale. +- Familiarity/mentions without the above evidence should be treated as **not met** at HR screen. + +### Classification Output (aligned with batch labels) + +- **Shortlist**: passes HR initial screen (no policy override; no hard-reject triggered; all JD required items have resume evidence; seniority/eligibility aligned). 
+- **Reviewed**: cannot confidently decide due to missing/vague JD must-haves, or the resume evidence is weak/implicit but plausibly present. +- **Reject**: any policy override or hard-reject rule triggered, or any JD required item lacks resume evidence. + +Mapping note: `Shortlist ≈ Pass initial screen`, `Reviewed ≈ Unclear/Review`, `Reject ≈ Fail initial screen`. + +### Workflow Output Contract (Required) + +When emitting machine-consumable results (for MCP update steps), use this schema per job: + +```json +{ + "id": 12345, + "status": "shortlist", + "reason": "Meets Python backend + production ML requirements; no hard reject gates triggered." +} +``` + +Hard rules: + +- `status` must be exactly one of: `shortlist`, `reviewed`, `reject` (lowercase only). +- `reason` is mandatory and concise (1-2 sentences), tied to hard gates/evidence standard. +- If uncertain, use `reviewed` with explicit uncertainty reason; do not force binary decisions. +- Human-readable labels (`Shortlist/Reviewed/Reject`) are for explanation only, never for machine fields. + +### Notes on Equivalents (optional, conservative) + +Allow limited equivalence only when the JD does not state a strict tool requirement: + +- FastAPI or Flask/Django (Python API frameworks) +- AWS or GCP or Azure when the requirement is stated broadly as “cloud experience” (IaaS/PaaS). Do not treat them as equivalent when the JD requires a specific cloud provider or a specific managed service. +- Managed vector DBs (Pinecone/Weaviate) ↔ other managed vector DBs (e.g., Milvus managed) when the requirement is “vector DB” broadly +- Kubernetes ↔ other container schedulers only if JD is not explicit about K8s + +If the JD explicitly says “must have X in production,” do **not** substitute equivalents. 
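The output contract above can be checked mechanically before any status write. A minimal sketch, assuming plain dict records as in the JSON example (the `validate_classification` helper is illustrative, not part of the MCP server):

```python
ALLOWED_MACHINE_STATUSES = {"shortlist", "reviewed", "reject"}

def validate_classification(record: dict) -> list:
    """Return a list of contract violations for one classification record."""
    problems = []
    # id must be the integer job id from the DB, never a slug or string.
    if not isinstance(record.get("id"), int):
        problems.append("id must be an integer job id")
    # status is lowercase machine-facing; human labels never appear here.
    if record.get("status") not in ALLOWED_MACHINE_STATUSES:
        problems.append(
            f"status must be one of {sorted(ALLOWED_MACHINE_STATUSES)}, "
            f"got {record.get('status')!r}"
        )
    # reason is mandatory: a short, concrete justification tied to the gates.
    reason = record.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        problems.append("reason is mandatory and must be a non-empty string")
    return problems
```

Records with a non-empty problem list should be held back and re-emitted rather than silently coerced (for example, never lowercase a stray `Shortlist` on the consumer side).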
+ +--- + +## Batch Evaluation: Cognitive Biases & Calibration + +When evaluating multiple jobs in a batch, be aware of these cognitive risks: + +### Consistency Risks + +| Bias | Description | +|------|-------------| +| **Standards Drift** | Getting progressively more lenient after weak postings, or harsher after strong ones | +| **Comparison Bias** | Judging relative to batch ("better than last 5") instead of absolute rubric | +| **Fatigue Shortcuts** | Pattern matching on title/company alone; skimming JDs after 30+ jobs | + +### Distribution Calibration + +A healthy batch typically yields: + +| Category | Expected Range | +|----------|---------------| +| **Shortlist** | 15-25% | +| **Reviewed** | 20-30% | +| **Reject** | 45-65% | + +**Red flags:** + +- **>80% shortlist** → likely too lenient +- **>80% reject** → likely too harsh +- **<5% reviewed** → forcing binary decisions; embrace uncertainty diff --git a/skills/job-pipeline-intake/SKILL.md b/skills/job-pipeline-intake/SKILL.md deleted file mode 100644 index 8a1901d..0000000 --- a/skills/job-pipeline-intake/SKILL.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -name: job-pipeline-intake -description: "Use when running intake and triage in JobWorkFlow: scrape jobs, read the new queue, classify shortlist/reviewed/reject, apply atomic status updates, and initialize shortlist trackers." ---- - -# Skill: Job Pipeline Intake - -## Goal -Run the intake half of the pipeline with deterministic state handling: -1. Ingest jobs into DB as `new` -2. Read `new` queue in batches -3. Classify each job -4. Write one atomic status update batch -5. Initialize trackers for `shortlist` - -DB status is SSOT; tracker files are projection. - -## MCP Tools In Scope -- `scrape_jobs` -- `bulk_read_new_jobs` -- `bulk_update_job_status` -- `initialize_shortlist_trackers` - -Do not run `career_tailor` or `finalize_resume_batch` in this skill. 
- -## Inputs -- Optional triage policy from user (preferred) -- Optional ingestion parameters (`terms`, `location`, `hours_old`, `results_wanted`) - -If no policy is provided, produce recommendation-only output and skip status writes. - -## Workflow -1. Run `scrape_jobs` (ingestion only). -2. Run `bulk_read_new_jobs(limit=50)`. -3. Classify each item into one of: - - `shortlist` - - `reviewed` - - `reject` -4. If classification is final, call `bulk_update_job_status` once with all updates. -5. Run `initialize_shortlist_trackers(limit=50, force=false, dry_run=false)`. - -## Triage Rubric (Keep This Tight) -Prioritize roles matching evidence in `data/templates/full_resume.md`: -- Strong fit: Python backend, FastAPI, LLM/RAG, knowledge graph, data infra, platform -- Medium fit: general backend/data roles with partial overlap -- Low fit: frontend-only, clearly junior-misaligned, or hard requirements not met - -Decision rule: -- `shortlist`: strong fit and valid job details -- `reviewed`: maybe fit, needs manual review, or incomplete confidence -- `reject`: clear non-fit or duplicate/low-quality posting - -## Guardrails -- Use one atomic write via `bulk_update_job_status`; avoid fragmented updates. -- Never change DB status when policy/confidence is missing. -- Treat tracker creation as projection only; DB is authoritative. -- Continue on partial failures and report per-step errors. 
- -## Required Output Shape -- `run_id` (if available) -- `scrape_totals`: fetched/cleaned/inserted/duplicate -- `triage_totals`: shortlist/reviewed/reject -- `tracker_totals`: created/skipped/failed -- `errors_by_step` -- `next_actions` - -## Optional Scouting Mode (Human-in-the-loop) -If user explicitly asks for active scouting patrol: -- Use the same quality bar as triage -- De-duplicate before creating tracker work -- Feed only high-signal jobs into this intake workflow diff --git a/skills/job-pipeline-intake/agents/openai.yaml b/skills/job-pipeline-intake/agents/openai.yaml deleted file mode 100644 index 65772eb..0000000 --- a/skills/job-pipeline-intake/agents/openai.yaml +++ /dev/null @@ -1,4 +0,0 @@ -interface: - display_name: "Job Pipeline Intake" - short_description: "Run intake triage and tracker initialization" - default_prompt: "Use $job-pipeline-intake to run scrape, triage recommendations or updates, and shortlist tracker initialization in one pass." diff --git a/skills/resume-crafting-expertise/SKILL.md b/skills/resume-crafting-expertise/SKILL.md new file mode 100644 index 0000000..4c659ff --- /dev/null +++ b/skills/resume-crafting-expertise/SKILL.md @@ -0,0 +1,242 @@ +# Skill: Resume Crafting Expertise + +## Purpose + +This skill encodes domain knowledge for creating effective technical resumes tailored to Backend, ML, and AI Engineering roles. It provides **content quality principles** and **strategic guidance**, not LaTeX syntax or execution steps. 
+ +--- + +## Core Philosophy: Truth-Grounded Tailoring + +### The Golden Rule ✨ + +**Every word on the resume must be grounded in verifiable facts from this run's `ai_context.md` (`ai_context_path`).** + +- ✅ Reframe, re-emphasize, and reorganize true experiences +- ✅ Highlight different aspects based on target role +- ❌ Never fabricate projects, metrics, or technologies +- ❌ Never claim expertise in tools you haven't used + +### The Scope Boundary 🚧 + +**Template source of truth**: [`data/templates/resume_skeleton.tex`](../../data/templates/resume_skeleton.tex) + +**LLM may modify ONLY these fields (whitelist):** + +- Text inside `\resumeItem{...}` entries under: + - Project Experience + - Work Experience +- Placeholder content to replace: + - Any token matching `*-BULLET-POINT-*` in the active template + +**LLM must NOT modify (blacklist):** + +- Header/contact block (name, email, links, location, phone) +- Education section +- Technical Skills section +- Section titles/order +- Any LaTeX macros, package imports, spacing/styling commands +- `\resumeSubheading{...}` company/title/date/location metadata + +**Change granularity rule:** + +- Keep edits minimal and local: replace bullet content only. +- No structural edits unless user explicitly requests them. + +--- + +## Content Writing Principles + +### The Impact Formula + +Prefer this pattern when it helps clarity: **Action + Context + Impact + Tech Stack**. +Do not force all four elements into every bullet if it makes the sentence unnatural. + +Quick check: + +- ✅ "Built RAG pipeline for support chatbot; cut response time 4min -> 30sec at 50K+ queries/day using LangChain + Pinecone + GPT-4." +- ❌ "Developed chatbot using AI technologies." 
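The Impact Formula quick check above can be approximated with a first-pass lint before human review. The sketch below is an illustrative heuristic only: the function name, warning strings, and word lists are hypothetical and are not part of the MCP toolchain.

```python
import re

# Illustrative heuristic only (not part of the MCP pipeline): flag bullets
# that are missing the rough shape of Action + Context + Impact + Tech Stack.
WEAK_OPENERS = {"responsible", "worked", "helped", "involved", "participated"}

def bullet_warnings(bullet: str) -> list[str]:
    """Return rough quality warnings for a single resume bullet."""
    warnings = []
    words = bullet.split()
    first_word = words[0].lower().rstrip(",.") if words else ""
    if first_word in WEAK_OPENERS:
        warnings.append(f"weak opener: {first_word!r}")
    # Impact signal: any digit covers metrics and scope tokens (50K, 4min, 99.9%)
    if not re.search(r"\d", bullet):
        warnings.append("no quantifiable impact or scope indicator")
    # Tech-stack signal: a linking word before tools, or a '+'-joined stack
    if not re.search(r"\b(using|with|via)\b|\+", bullet):
        warnings.append("no explicit tech stack")
    return warnings

good = ("Built RAG pipeline for support chatbot; cut response time "
        "4min -> 30sec at 50K+ queries/day using LangChain + Pinecone + GPT-4.")
bad = "Developed chatbot using AI technologies."

print(bullet_warnings(good))  # []
print(bullet_warnings(bad))   # ['no quantifiable impact or scope indicator']
```

A clean result does not prove a bullet is good; it only catches bullets that obviously fail the formula, so keep the human quick check as the final word.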
+
+---
+
+### Metrics Matter 📊
+
+**Include quantifiable impact when available**: latency, throughput, cost, error rate, or accuracy (e.g., "cut p95 latency from 800ms to 200ms").
+
+**If no direct metrics are available, use scope indicators**: request scale (e.g., "50K+ queries/day"), data volume, number of services, or team size.
+
+---
+
+## Tailoring Strategy
+
+### Step 1: Resolve Runtime Inputs (Required)
+
+Primary runtime inputs come from MCP (`career_tailor`) result fields:
+
+- `resume_tex_path` (editing target)
+- `ai_context_path` (single-run truth source)
+
+Rules:
+
+- Read and use `ai_context_path` as the factual source for this run.
+- Edit only `resume_tex_path` within allowed bullet scope.
+- If either path is missing or unreadable, **skip this job**.
+- Record skip reason as:
+  - `missing_ai_context`
+  - `missing_resume_tex`
+
+### Step 2: Parse the Job Description
+
+Identify **3-5 key requirements** from the JD:
+
+- Required technologies (Python, FastAPI, LLMs, etc.)
+- Problem domains (distributed systems, ML pipelines, data infra)
+
+### Step 2.5: Build a JD Anchor Set (Required)
+
+From responsibilities + qualifications, extract **5-8 anchor phrases** and tag each:
+
+- `must_have`: core hiring signal
+- `supporting`: helpful but non-blocking signal
+
+Example anchor set for backend/GenAI SDE roles:
+
+- distributed systems
+- data pipelines/services
+- GenAI/LLM application
+- code quality/testing/reviews
+- operational excellence (monitoring/automation)
+
+### Step 3: Map Your Experiences
+
+From `ai_context_path`, find experiences that demonstrate those requirements:
+
+- Which projects used similar tech?
+- Which challenges match the problem domain?
+- Which roles show appropriate seniority level?
+
+### Step 3.2: Score and Select Evidence (Required)
+
+Before writing bullets, rank candidate evidence from `ai_context_path`:
+
+- `+3` direct match to a `must_have` anchor
+- `+2` quantifiable impact (latency, throughput, %, scale)
+- `+1` end-to-end ownership (designed + built + operated)
+- `-2` technically strong but weak JD relevance
+
+Selection rules:
+
+- Pick highest-ranked evidence first (do not start from section order).
+- Include evidence for top `must_have` anchors whenever the source facts exist. +- Do not omit direct-match evidence in favor of niche but less relevant technical details. + +### Step 3.5: Section-Scoped Grounding (Hard Constraint) + +Apply facts only to the correct section ownership: + +- `Project Experience` bullets must come from project facts in `ai_context_path`. +- `Work Experience` internship/job bullets must come from that work experience (or another real work role), not from unrelated projects. +- `Work Experience` research bullets must come from research facts, not from unrelated internships/projects. + +Do not move achievements across sections just to optimize keywords. Tailoring is allowed; cross-section fact reassignment is not. + +### Step 4: Adjust Emphasis + +**For high-relevance experiences:** + +- Lead with them (top bullets in each section) +- Add more technical detail +- Highlight metrics related to JD priorities + +**For medium-relevance experiences:** + +- Include but keep concise +- Frame in terms that bridge to target role +- Focus on transferable skills + +**For low-relevance experiences:** + +- Minimize or omit (if space-constrained) +- Reframe if possible (e.g., "frontend work" → "full-stack experience") + +### Step 4.5: Section Budget (Required) + +Use a relevance budget so one section does not dominate: + +- Project bullets: primary carrier of JD anchors +- Industry/internship bullets: secondary carrier (real production ownership) +- Research bullets: keep only what strengthens JD anchors (performance/reliability/observability) + +If template has fixed section slots, still apply this rule by: + +- prioritizing strongest anchor evidence at the top of each section +- avoiding low-relevance research detail when JD needs backend/system execution signals + +### Step 5: Keyword Alignment + +**Naturally incorporate JD keywords** without keyword stuffing: + +If JD mentions "RAG pipelines", and you built semantic search: + +- ✅ "Implemented RAG 
pipeline for semantic search using..."
+- ❌ Force "RAG" into every bullet
+
+For platform-specific keywords (example: AWS vs GCP vs Azure):
+
+- ✅ Whitelist-equivalent cloud mapping is allowed at the capability level:
+  - AWS <-> GCP <-> Azure for generic cloud requirements (compute, storage, IAM, monitoring, CI/CD).
+- ✅ Keep the actual platform truth explicit in bullets (e.g., "on GCP").
+- ❌ Do not claim production experience on a platform not present in `ai_context.md`.
+
+---
+
+## Supported Prompt Intents
+
+Recognize and execute these prompt intents within the current scope (bullet-level only):
+
+- Bullet rewrite to measurable accomplishments (results-oriented, strong verbs, metrics when available).
+- ATS keyword optimization based on the JD, with natural phrasing for human readability.
+- Work-history language alignment to target JD skills/qualifications (without fabricating experience).
+- Transferable-skills reframing for career transitions, limited to existing experience in `ai_context.md`.
+- Resume audit feedback focused on vagueness, wordiness, impact, and leadership/results signals in bullet content.
+- Hiring-manager-style critique focused on what to tighten, cut, or emphasize in bullets to improve interview likelihood.
+
+Out of scope for this skill:
+
+- Resume summary rewrite
+- Headline/subheadline writing
+- Layout/format redesign
+- Technical Skills section rewrite
+
+---
+
+## Quality Control Guardrails
+
+### Final Gate Checklist
+
+Before marking a resume as ready:
+
+- [ ] Truthfulness: every claim is grounded in this run's `ai_context.md`, with no fabricated projects/metrics/platform experience.
+- [ ] Section mapping: each bullet's fact source matches its section ownership (project vs work vs research).
+- [ ] Scope discipline: only allowed `\resumeItem{}` bullet content was edited; no section/header/macro/style changes.
+- [ ] No duplication: no repeated or near-duplicate bullets within the same `resume.tex`.
+- [ ] Placeholder cleanup: no placeholder tokens remain (`*-BULLET-POINT-*`, `TODO`, `[Description goes here]`). +- [ ] JD alignment and readability: top bullets align to key JD requirements, wording is specific and natural (no keyword stuffing). +- [ ] Anchor coverage: core `must_have` JD anchors are represented when source evidence exists. +- [ ] Evidence priority: highest-ranked source evidence (from Step 3.2) is included, not displaced by weaker-fit content. +- [ ] Lexical coverage: at least 4 JD anchor phrases are reflected naturally in bullet wording. + +### Machine-Check Commands (Required Before Compile) + +Use `resume_tex_path` from runtime inputs. + +```bash +# 1) Placeholder scan: must return no lines +grep -n -E 'BULLET-POINT|TODO|\[Description goes here\]' "$resume_tex_path" + +# 2) Duplicate bullet scan: must return no lines +sed -n 's/^[[:space:]]*\\resumeItem{\(.*\)}[[:space:]]*$/\1/p' "$resume_tex_path" \ + | tr '[:upper:]' '[:lower:]' \ + | sed -E 's/[[:space:]]+/ /g; s/^ //; s/ $//' \ + | sort | uniq -d +``` diff --git a/trackers/Job Application.md b/trackers/job-application.md similarity index 100% rename from trackers/Job Application.md rename to trackers/job-application.md diff --git a/uv.lock b/uv.lock index 19348ad..edc2bd7 100644 --- a/uv.lock +++ b/uv.lock @@ -413,6 +413,7 @@ dependencies = [ { name = "pandas" }, { name = "pydantic" }, { name = "python-dateutil" }, + { name = "python-dotenv" }, { name = "python-frontmatter" }, { name = "python-jobspy" }, { name = "pyyaml" }, @@ -450,6 +451,7 @@ requires-dist = [ { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0.0" }, { name = "pytest-asyncio", marker = "extra == 'dev'", specifier = ">=0.23.0" }, { name = "python-dateutil", specifier = ">=2.9.0" }, + { name = "python-dotenv", specifier = ">=1.0.0" }, { name = "python-frontmatter", specifier = ">=1.1.0" }, { name = "python-jobspy", specifier = ">=1.1.82" }, { name = "pyyaml", specifier = ">=6.0.3" },