feat(security): integrate llm-sanitizer for prompt-injection detection by warnes · Pull Request #100 · Warnes-Innovations/cv-builder

warnes · 2026-04-20T23:09:24Z

Summary

Replaces the bespoke 11-pattern substring list in cv_orchestrator.py with a centralised prompt_safety module backed by the llm-sanitizer library.

What changed

`scripts/utils/prompt_safety.py` (new)

Centralised security module exposing three public functions:

| Function | Purpose |
| --- | --- |
| `scan_text_for_injection(text, min_risk)` | Boolean check; union of llm-sanitizer rule scan + supplementary substring check for cv-builder-specific phrases in DOM text fragments |
| `sanitize_instruction_text(text)` | Strips injection content via `redact()` then word-boundary regex pass; returns `(cleaned_text, findings)` in legacy cv-builder dict format |
| `scan_for_safety_alert(text, sensitivity)` | Warn-only scanner for job-description ingestion; returns alert dict or `None` |

Detection coverage added over the old implementation:

✅ Zero-width characters (U+200B etc.)
✅ Base64-encoded payloads
✅ Homoglyph substitution attacks
✅ Data exfiltration patterns
✅ Role-play / jailbreak phrases
✅ XML/chat-delimiter system prompt markers

`scripts/utils/cv_orchestrator.py`

- Removes the `_LAYOUT_AGENT_INSTRUCTION_PATTERNS` constant
- `_sanitize_layout_instruction_text`: delegates to `sanitize_instruction_text()`; return shape unchanged
- `_sanitize_layout_context_html`, `_sanitize_layout_instruction_html`: replace `any(pattern in text.lower() for pattern in ...)` with `scan_text_for_injection()`; all BeautifulSoup DOM manipulation is unchanged
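
For context, the difference between the old substring check and a word-boundary regex pass can be sketched as follows. The phrase list is an assumption (the PR names "system prompt" and "agent instruction" but does not give the full set), and `has_injection_phrase` and `strip_phrases` are hypothetical names, not functions from this repository.

```python
import re

# Assumed phrases for illustration only; the real list is not shown in the PR.
PHRASES = ("system prompt", "agent instruction")


def has_injection_phrase(text: str) -> bool:
    """Old-style check: case-insensitive substring match."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in PHRASES)


# Word-boundary variant: matches only whole phrases, not parts of longer words.
_PHRASE_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in PHRASES) + r")\b",
    re.IGNORECASE,
)


def strip_phrases(text: str) -> str:
    """Remove whole-phrase matches and collapse leftover whitespace."""
    return " ".join(_PHRASE_RE.sub("", text).split())
```

The substring check flags "subsystem prompts" because it contains "system prompt", while the word-boundary pattern leaves that text alone.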

`scripts/routes/job_routes.py`

Three job-description ingestion routes (submit_job, fetch_job_url, load_job_file) now scan the received text via scan_for_safety_alert():

- Indicators are logged as warnings (not blocking, since job descriptions are user-selected content)
- A `safety_alert` dict is included in the JSON response when findings are present
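
The warn-only behaviour described above could look roughly like the sketch below. It is an assumption-laden illustration: `build_job_response` and `toy_scan` are hypothetical stand-ins, the response fields other than `safety_alert` are invented for the example, and the real routes call `scan_for_safety_alert()` instead of the toy scanner.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("job_routes_sketch")


def build_job_response(
    text: str,
    scan: Callable[[str], Optional[dict]],
) -> dict:
    """Build a JSON-ready dict; attach safety_alert only when the scan finds something."""
    response = {"ok": True, "length": len(text)}  # illustrative fields
    alert = scan(text)
    if alert is not None:
        # Warn but do not block: the user chose this content.
        logger.warning("Possible prompt-injection indicators: %s", alert)
        response["safety_alert"] = alert
    return response


def toy_scan(text: str) -> Optional[dict]:
    """Stand-in scanner for the example; flags one hard-coded phrase."""
    if "ignore previous instructions" in text.lower():
        return {"indicators": ["ignore previous instructions"]}
    return None
```

Clean input yields a response without the `safety_alert` key, so existing clients that do not know about the field are unaffected.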

`tests/test_prompt_safety.py` (new)

21 unit tests covering all three public functions, including:

- Detection of classic injection phrases
- Zero-width character detection (new vector)
- Clean text correctly returns no findings
- Fragment field truncation to ≤ 500 chars

Test results

- 343 existing Python tests: all pass
- 21 new tests in test_prompt_safety.py: all pass
- 22 layout instruction tests: all pass

Notes

llm-sanitizer must be installed in the cvgen conda environment before running the app or tests:

/usr/local/Caskroom/miniconda/base/envs/cvgen/bin/pip install -e /path/to/llm-sanitizer

The library is at Warnes-Innovations/llm-sanitizer (local sibling repo; AGPL-3.0).

Replace the bespoke 11-pattern substring list in cv_orchestrator.py with a centralised prompt_safety module backed by the llm-sanitizer library.
Changes
-------
scripts/utils/prompt_safety.py (new)
- Lazy Scanner singleton (thread-safe module-level initialisation).
- scan_text_for_injection(text, min_risk): union of llm-sanitizer rule scan and supplementary substring check; covers zero-width chars, base64 payloads, homoglyphs, data-exfil patterns AND cv-builder-specific phrases (system prompt, agent instruction, etc.) that must be detected in plain-text DOM fragments (comment/hidden-element text) where surrounding HTML syntax is absent.
- sanitize_instruction_text(text): strips injection content via llm-sanitizer redact() then a word-boundary regex pass; returns (cleaned_text, findings) in the legacy cv-builder dict format.
- scan_for_safety_alert(text, sensitivity): warn-only scanner for job-description ingestion; returns an alert dict or None.
scripts/utils/cv_orchestrator.py
- Remove _LAYOUT_AGENT_INSTRUCTION_PATTERNS constant.
- _sanitize_layout_instruction_text: replace manual regex loop with sanitize_instruction_text(); return shape unchanged.
- _sanitize_layout_context_html, _sanitize_layout_instruction_html: replace the substring check with scan_text_for_injection(); BeautifulSoup DOM manipulation unchanged.
scripts/routes/job_routes.py
- submit_job, fetch_job_url, load_job_file: scan with scan_for_safety_alert() once the job text is extracted; log a warning if indicators found; include safety_alert in the JSON response.
tests/test_prompt_safety.py (new)
- 21 unit tests covering scan_text_for_injection, sanitize_instruction_text, and scan_for_safety_alert, including a zero-width-char detection test.
All 343 existing Python tests pass; 21 new tests added and passing.

- add __env to repository ignore rules
- prevent accidental commits of local API key environment file
- keep environment guidance in workspace while excluding sensitive content

The llm-sanitizer package is installed locally via `pip install -e` but is not yet published to PyPI and cannot be added to scripts/requirements.txt. CI was failing with an ImportError at collection time because `scripts/utils/prompt_safety.py` imported directly from `llm_sanitizer.*`.
Changes:
- Wrap the three `llm_sanitizer` imports in a try/except ImportError block and set `_HAS_LLM_SANITIZER = True/False`.
- Guard the scanner-based passes in `scan_text_for_injection`, `sanitize_instruction_text`, and `scan_for_safety_alert` so they short-circuit gracefully when the library is absent.
- The fast substring pass in `scan_text_for_injection` always runs, preserving injection detection for the most common cv-builder cases without the library.
- `min_risk` parameter type changed from `RiskLevel` (import-time default) to `Any = None`, resolved to `RiskLevel.high` at call time when the library is available.
Local test: 306 passed (CI-equivalent command).

- Correct persona tally consistency in the rollup
- Add technical persona gap synthesis and updated totals
- Normalize markdown formatting for clean lint diagnostics

…date requirements
- Remove optional import logic and guards from prompt_safety.py
- Restore direct imports for RiskLevel, ScanResult, redact, Scanner
- Add llm-sanitizer>=0.1.0 to requirements.txt for CI and local installs
- Update UI browser restore test for new layout review state
- Minor workspace config tweak for uv
All Python tests pass (306/306) in cvgen environment.

…V tab, and first-run onboarding
- GAP-49 (spell-check confirm gate): add _confirmProceedToGenerate() modal before CV generation in both empty and full spell-check submit paths; shows current ATS score and staleness warning before proceeding
- GAP-30 (cover letter opening style): add formal/hook/narrative opening style selector to cover letter form; _OPENING_GUIDANCE dict in backend drives per-style LLM prompt; opening_style persisted in session state
- GAP-41 (Master CV tab pre-job): expose 'master' tab in STAGE_TABS.job so Master CV editor is accessible before job description is entered
- GAP-36 (first-run onboarding): session-free /api/setup/master-cv-status and /api/setup/create-master-cv endpoints; showOnboardingModal() / onboardingCreateEmptyProfile() JS functions; onboarding modal HTML in index.html; 7 new JS tests covering all onboarding paths
- fix(tests/ui): change Playwright goto wait_until from networkidle to load across all 8 conftest fixtures to eliminate a pre-existing flaky timeout; _wait_for_ui_ready() already ensures JS readiness

Enhance the onboarding modal (introduced in GAP-36) into a general welcome/orientation screen shown on every startup until explicitly dismissed. Both variants now share a complete workflow overview.
Changes:
- web/index.html: restructure onboarding modal body with shared 3-step workflow section (colored numbered badges: build master profile -> target job -> harvest improvements) and variant-specific CTA banners (green for present, amber for missing); "Don't show again" checkbox on present footer
- web/session-manager.js: add _setWelcomeSection(), maybeShowWelcomeModal() (checks localStorage dismissal flag, queries /api/setup/master-cv-status, shows appropriate section), and closeWelcomeModal() (hides overlay, optionally persists dismissal flag); update showOnboardingModal(); export 2 new symbols
- web/app.js: call maybeShowWelcomeModal() in init() after ensureSessionContext
- web/bundle.js: regenerated (2717.9 KB)
- tests/js/session-manager.test.js: update DOM fixtures; update showOnboardingModal test to assert section visibility; add 7 new tests covering maybeShowWelcomeModal and closeWelcomeModal; fix localStorage cleanup with try/catch
Closes GAP-37

master_data_routes.py omitted certifications from the /api/master-data/full response, causing the Certifications section in the Master CV editor to always render empty despite correct data in Master_CV_Data.json.
Changes:
- scripts/routes/master_data_routes.py: add "certifications": master.get('certifications', []) to master_data_full() response payload
Closes GAP-42

Replace the non-functional undoInstruction() stub with snapshot-based undo. Before each layout instruction is applied, the current HTML and instruction list are pushed onto an in-memory undo stack (capped at 20 entries). When Undo is pressed, the last snapshot is popped and restored: preview iframe, stateManager CV artifacts, and instruction history are all rolled back.
Changes:
- web/layout-instruction.js: add module-level _layoutUndoStack / _UNDO_STACK_MAX; push snapshot in submitLayoutInstruction() before API call; replace stub undoInstruction() with stack-pop restore logic
- web/bundle.js: regenerated (2719.1 KB)
- tests/js/layout-instruction.test.js: add submitLayoutInstruction to imports; rewrite undoInstruction describe block with 4 tests covering empty-stack no-op, snapshot restore of HTML and instructions, system message, and absent-layoutInstructions guard
Closes GAP-25
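
The snapshot-based undo described in this commit lives in JavaScript; the data structure itself can be illustrated in Python as a bounded stack. This is a sketch only: the `UndoStack` class is hypothetical, and dropping the oldest snapshot once the cap is reached is an assumed policy that the commit message does not spell out.

```python
from collections import deque
from typing import Optional

UNDO_STACK_MAX = 20  # cap stated in the commit message


class UndoStack:
    """Bounded stack of (html, instructions) snapshots."""

    def __init__(self, max_entries: int = UNDO_STACK_MAX) -> None:
        # A deque with maxlen discards the oldest entry when full (assumed policy).
        self._stack: deque = deque(maxlen=max_entries)

    def push(self, html: str, instructions: list) -> None:
        """Record a snapshot; the list is copied so later edits cannot alter it."""
        self._stack.append({"html": html, "instructions": list(instructions)})

    def pop(self) -> Optional[dict]:
        """Return the most recent snapshot, or None when there is nothing to undo."""
        return self._stack.pop() if self._stack else None

    def __len__(self) -> int:
        return len(self._stack)
```

Returning `None` on an empty stack corresponds to the "empty-stack no-op" case covered by the new tests.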

Resolves the 79-record ambiguous set from the initial /specstoryOrganize pass (55 unique files after cross-workspace duplicates were identified).
Changes:
- Deleted 21 empty/stub sessions (untitled sessions, "File editing session" placeholders, and near-empty sessions under 300B)
- Moved 8 personal-research sessions (diabetic neuropathy knowledge system work) to ~/CV/llm-history/ instead of a code repo
- Resolved 23 cross-workspace duplicate groups: kept the canonical copy in the correct repo, deleted or moved the duplicates accordingly
- Routed 3 single-copy files to their correct repos (vscode-config)
- Annotated all kept copies with specstory-relocated metadata

…ons"; GAP-29 venue warning
GAP-28: cv-template.html conditionally rendered "Publications" when no count suffix was needed. Heading now always reads "Selected Publications"; the count suffix (n) is appended only when some publications were omitted.
GAP-29: _format_publications() in cv_orchestrator.py never set venue_warning, so the .pub-venue-warn icon in the template was never triggered. Now sets venue_warning to a descriptive message for any entry missing both journal and booktitle fields; entries with a venue continue to receive an empty string.
Changes:
- templates/cv-template.html: always emit "Selected Publications"; move count suffix outside the conditional so it only appears when pubs were omitted
- scripts/utils/cv_orchestrator.py: add venue_warning field to every entry in _format_publications() based on presence of journal or booktitle
Closes GAP-28
Closes GAP-29
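
The GAP-29 rule can be sketched as a small pure function. This is an illustration under assumptions, not the code in `_format_publications()`: the function name `add_venue_warning` and the warning wording are invented, and treating an empty string the same as a missing field is also an assumption.

```python
def add_venue_warning(entry: dict) -> dict:
    """Return a copy of entry with venue_warning set from journal/booktitle presence."""
    # Assumption: an empty value counts as missing.
    has_venue = bool(entry.get("journal") or entry.get("booktitle"))
    result = dict(entry)
    # Entries with a venue get an empty string; the message text here is illustrative.
    result["venue_warning"] = "" if has_venue else "No journal or booktitle listed"
    return result
```

Setting the field on every entry, including those with a venue, lets the template test a single key without checking whether it exists.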

…ublications" for all
Update cv-template.html and requirements to reflect the correct heading rule:
- "Selected Publications" when a subset of accepted publications is shown
- "Publications" when all accepted publications are shown
- Count suffix (N) never appears in generated documents
Also fix stale evidence in review-status/hiring-manager.md where two rows still described venue_warning as unimplemented (fixed by GAP-29, ad9edf0).
Changes:
- templates/cv-template.html: conditional heading in both HTML section and ATS plain-text block
- tasks/user-story-hiring-manager.md: US-M7 item 5 and acceptance criteria updated to specify the subset/full distinction and prohibit count suffix
- tasks/gaps.md: GAP-24 description and GAP-28 status updated; count context removed from recommended resolution
- tasks/ui-review.md: GAP-28 finding marked CLOSED
- tasks/review-status/hiring-manager.md: heading and venue-warning rows updated to Pass/Partial with accurate evidence
Amends behavior introduced in ad9edf0.

…compute time
Previously, ats_score (from POST /api/cv/ats-score) and validation_results (from GET /api/ats-validate) were only written to metadata.json during POST /api/finalise. If finalise was never called, those values were lost.
Fix: add a _try_patch_metadata() helper to both generation_routes.py and review_routes.py. After each value is computed and stored in session state, the helper immediately patches it into the session's metadata.json file. The helper is fire-and-forget: exceptions are logged as warnings but never propagated, so no existing error paths change.
Affected routes:
- POST /api/cv/ats-score → patches {"ats_score": score} into metadata.json
- GET /api/ats-validate → patches {"validation_results": {...}} into metadata.json
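
A fire-and-forget patch helper of the kind described here might look like the following sketch. It is not the repository's `_try_patch_metadata()`: the name `try_patch_metadata`, its signature, the boolean return value, and starting from an empty dict when the file does not exist are all assumptions.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("metadata_sketch")


def try_patch_metadata(metadata_path: Path, patch: dict) -> bool:
    """Merge patch into the JSON file at metadata_path; log and swallow any error."""
    try:
        if metadata_path.exists():
            data = json.loads(metadata_path.read_text(encoding="utf-8"))
        else:
            data = {}  # assumption: a missing file is treated as empty metadata
        data.update(patch)
        metadata_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        return True
    except Exception as exc:
        # Fire-and-forget: the caller's response must not fail because of this write.
        logger.warning("Could not patch %s: %s", metadata_path, exc)
        return False
```

Because the helper never raises, a route can call it right after storing a value in session state without adding a new failure mode.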

showAlertModal and closeAlertModal were defined in both ui-core.js and ui-helpers.js. The ui-core.js versions targeted non-existent element IDs (alert-modal, alert-title, alert-message) and created a second hidden overlay dynamically, making them effectively dead code. The ui-helpers.js versions are canonical: they target the correct HTML elements (alert-modal-overlay, alert-modal-title, alert-modal-message) and integrate with the focus-trap helpers (trapFocus, restoreFocus, setInitialFocus) exported by ui-core.js.
Changes:
- Remove showAlertModal + closeAlertModal function bodies and export from ui-core.js
- Remove stale "also defined in ui-core.js" comment from ui-helpers.js
- Rebuild web/bundle.js

…cknowledgement
Require the user to expand the persuasion warnings panel and click "Acknowledged" before the "Submit All Decisions" button becomes enabled and before submitRewriteDecisions() will proceed.
Changes:
- updateRewriteTally(): disable submit button when persuasionWarningsAcknowledged is false (in addition to the existing pending > 0 check)
- Acknowledged button onclick: call updateRewriteTally() after setting the flag so the submit button re-evaluates immediately
- submitRewriteDecisions(): hard guard that shows an alert and returns early if warnings have not been acknowledged (defence in depth)
- Add setPersuasionWarningsAcknowledged() setter for testability
- Update rewrite-review.test.js: set acknowledged=true in the "enables submit button when no pending cards remain" test
- Rebuild web/bundle.js

… Review tab
Backend:
- POST /api/cover-letter/save now appends the generated cover letter filename to generated_files['files'] in session state so it is registered for download and visible to the frontend.
- POST /api/screening/save does the same for the screening responses DOCX.
Frontend (download-tab.js / bundle.js):
- _collectDownloadableFiles() recognises CoverLetter_* and Screening_Responses_* filename prefixes and renders them with appropriate descriptions instead of the generic DOCX label.
Both artefacts are now served by the existing /api/download/<filename> route via the generated_files['files'] list lookup.

…abels
SESSION_PHASE_LABELS in utils.js now maps every backend phase enum to a human-readable title-case string (e.g. 'rewrite_review' → 'Rewrite Review', 'init' → 'Getting Started') rather than lowercase abbreviations.
session-manager.js: loadSessionFile restore message now reads from SESSION_PHASE_LABELS via an inline lookup instead of embedding the raw enum value (e.g. '(rewrite_review)' → '(Rewrite Review)').
Tests updated to match:
- session-manager.test.js: restore-message expectation was already set to '(Rewrite Review)' and now passes.
- session-switcher.test.js: formatSessionPhaseLabel expectations updated from short abbreviated values to full canonical labels ('Rewrite Review', 'Layout Review').
1108/1108 JS tests pass.

On the first analyzeJob() call, append a system message informing the user that submitted content is sent to the configured LLM provider. The flag is persisted in localStorage (StorageKeys.LLM_DISCLOSURE_SHOWN) so the message shows only once per browser profile.
Changes:
- api-client.js: add LLM_DISCLOSURE_SHOWN key to StorageKeys.
- job-analysis.js: import StorageKeys; check+set flag at top of analyzeJob() before any fetch.
- tests/js/job-analysis.test.js: two new tests (shows disclosure on first call; suppresses it when flag already set).
- tests/js/api-client.test.js: update key-count assertion 5 → 6 and add assertion for new LLM_DISCLOSURE_SHOWN key.
1111/1111 JS tests pass.

…ter_data_routes._save_master
The _save_master helper in master_data_routes.py now mirrors the behaviour already present in web_app.py:
1. Creates a timestamped backup before writing (unchanged).
2. Calls validate_master_data() on the in-memory dict immediately after the write.
3. If validation fails, restores the backup and raises ValueError so the calling route can surface the error rather than silently persisting invalid data.
91 Python master-data tests pass.
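
The three steps above (backup, validate after write, restore on failure) can be sketched as follows. This is not the repository's `_save_master`: the function `save_with_rollback`, the backup file naming, the generic `validate` callable standing in for `validate_master_data()`, and deleting a first-time file that fails validation are all assumptions for illustration.

```python
import json
import shutil
import time
from pathlib import Path
from typing import Callable, Optional


def save_with_rollback(
    path: Path,
    data: dict,
    validate: Callable[[dict], bool],
) -> None:
    """Write data to path; restore the previous file and raise if validation fails."""
    backup: Optional[Path] = None
    if path.exists():
        # Step 1: timestamped backup before writing (naming scheme is assumed).
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = path.with_name(f"{path.name}.{stamp}.bak")
        shutil.copy2(path, backup)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    # Step 2: validate the in-memory dict immediately after the write.
    if not validate(data):
        # Step 3: roll back so invalid data is not left on disk.
        if backup is not None:
            shutil.copy2(backup, path)
        else:
            path.unlink()  # assumption: no prior file, so remove the invalid one
        raise ValueError("master data failed validation; previous state restored")
```

Raising `ValueError` after the restore lets the calling route return an error to the user while the file on disk stays in its last valid state.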

session-switcher-ui.js:
- Session row button text: 'Delete' → 'Move to Trash'
- Button title attribute: 'Delete session' → 'Move session to Trash'
- Saved-sessions section note: 'rename, or delete saved work' → 'rename, or move saved work to Trash'
Permanent-deletion labels in the Trash view ('Delete Forever', 'Permanently delete…') are intentionally unchanged, since they accurately describe an irreversible action on already-trashed items.
1111/1111 JS tests pass.

…trol
layout-instruction.js:
- Add `pxToPt(px)` helper (exported): converts CSS px to typographic pt using 96 px/in × 72 pt/in convention (1px = 0.75pt), rounded to 1dp
- Rename label from 'Base font size (px):' to 'Base font size:'; the unit now appears inline in the adjacent pt-display span
- Add `<span id="font-size-pt-display">` next to the input showing the current value as 'Npx (M.m pt)'
- Wire `input` event on base-font-size-input to update the span live
- On session-state restore, update the pt span alongside the input value
- Default display: 13 px (9.8 pt)
tests/js/layout-instruction.test.js:
- Import pxToPt
- Add 5-case `pxToPt` describe block covering default value, round numbers, minimum, and rounding-to-1dp behaviour
1116/1116 JS tests pass.
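
The conversion behind `pxToPt` is simple arithmetic: at 96 px per inch and 72 pt per inch, 1 px = 72/96 = 0.75 pt. The Python sketch below mirrors that formula for illustration; the real helper is JavaScript, the names `px_to_pt` and `format_font_size` are hypothetical, and rounding halves upward is an assumption about how the 1dp rounding is done.

```python
import math

PT_PER_PX = 72 / 96  # 0.75 pt per CSS px


def px_to_pt(px: float) -> float:
    """Convert CSS px to pt, rounded half-up to one decimal place (assumed rounding)."""
    return math.floor(px * PT_PER_PX * 10 + 0.5) / 10


def format_font_size(px: float) -> str:
    """Format a size in the style of the default display string."""
    return f"{px:g} px ({px_to_pt(px):.1f} pt)"
```

For the default of 13 px this gives 13 × 0.75 = 9.75, which rounds to 9.8 pt and matches the default display quoted in the commit message.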

master_data_routes.py (cover_letter_generate):
- Prompt instruction changed from '~300–400 words' to '~250–300 words'
The front-end validation range (250–400) is intentionally left unchanged in this change; see GAP-HM-04 for planned role-differentiated validation.
40/40 cover-letter Python tests pass.

cv-builder.code-workspace:
- Add ../linkedown as a new sibling-repo workspace folder entry

warnes added 23 commits April 20, 2026 19:08

chore(gitignore): ignore __env local secrets template

3b1b17f

docs(review): refresh 17-persona UI review rollup

53e131a

chore(workspace): add linkedown as workspace folder

a5d47c1

warnes merged commit 4d638d5 into devel Apr 22, 2026
8 of 11 checks passed

warnes merged 23 commits into devel from feat/llm-sanitizer-integration
