
feat(security): integrate llm-sanitizer for prompt-injection detection#100

Merged
warnes merged 23 commits into devel from
feat/llm-sanitizer-integration
Apr 22, 2026


@warnes
Contributor

@warnes warnes commented Apr 20, 2026

Summary

Replaces the bespoke 11-pattern substring list in cv_orchestrator.py with a centralised prompt_safety module backed by the llm-sanitizer library.

What changed

scripts/utils/prompt_safety.py (new)

Centralised security module exposing three public functions:

| Function | Purpose |
| --- | --- |
| scan_text_for_injection(text, min_risk) | Boolean check; union of llm-sanitizer rule scan + supplementary substring check for cv-builder-specific phrases in DOM text fragments |
| sanitize_instruction_text(text) | Strips injection content via redact() then word-boundary regex pass; returns (cleaned_text, findings) in legacy cv-builder dict format |
| scan_for_safety_alert(text, sensitivity) | Warn-only scanner for job-description ingestion; returns alert dict or None |

Detection coverage added over the old implementation:

  • ✅ Zero-width characters (U+200B etc.)
  • ✅ Base64-encoded payloads
  • ✅ Homoglyph substitution attacks
  • ✅ Data exfiltration patterns
  • ✅ Role-play / jailbreak phrases
  • ✅ XML/chat-delimiter system prompt markers
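As a rough illustration of two of the newly covered vectors, the sketch below flags zero-width characters and Base64-looking runs using only the standard library. It is not how llm-sanitizer implements its rules; the names `ZERO_WIDTH`, `has_zero_width` and `find_base64_payloads`, the character set, and the 24-character threshold are all assumptions made for this example.

```python
import base64
import binascii
import re

# A few common zero-width code points (assumed set; not exhaustive).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Runs of 24+ Base64 alphabet characters, optionally padded (arbitrary threshold).
_B64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def has_zero_width(text: str) -> bool:
    """Return True if any zero-width character appears in text."""
    return any(ch in ZERO_WIDTH for ch in text)


def find_base64_payloads(text: str) -> list[str]:
    """Return the decoded text of every run that is valid Base64 and valid UTF-8."""
    decoded = []
    for match in _B64_RUN.finditer(text):
        candidate = match.group(0)
        try:
            raw = base64.b64decode(candidate, validate=True)
            decoded.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not decodable as Base64 text; ignore this run
    return decoded
```

A decoded payload could then be passed through the same phrase checks as the visible text, which is the general idea behind scanning encoded content.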

scripts/utils/cv_orchestrator.py

  • Removes _LAYOUT_AGENT_INSTRUCTION_PATTERNS constant
  • _sanitize_layout_instruction_text: delegates to sanitize_instruction_text() — return shape unchanged
  • _sanitize_layout_context_html, _sanitize_layout_instruction_html: replaces any(pattern in text.lower() for pattern in ...) with scan_text_for_injection(); all BeautifulSoup DOM manipulation is unchanged
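To show why a word-boundary regex pass differs from the old `pattern in text.lower()` check, here is a minimal sketch. The phrase list, the function names, and the finding dict keys (`pattern`, `fragment`) are hypothetical; the actual legacy cv-builder dict format is not shown in this PR. The 500-character fragment cap mirrors the truncation covered by the new tests.

```python
import re

# Hypothetical phrase list; the real patterns live in prompt_safety.py.
PHRASES = ["system prompt", "agent instruction", "ignore previous instructions"]

_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in PHRASES) + r")\b",
    re.IGNORECASE,
)


def substring_check(text: str) -> bool:
    """Old-style check: plain substring match on lower-cased text."""
    lowered = text.lower()
    return any(p in lowered for p in PHRASES)


def strip_phrases(text: str) -> tuple[str, list[dict]]:
    """Remove whole-phrase matches and report each one as a finding dict."""
    findings = [
        {"pattern": m.group(0).lower(), "fragment": text[:500]}  # assumed keys
        for m in _PATTERN.finditer(text)
    ]
    cleaned = _PATTERN.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()  # tidy leftover gaps
    return cleaned, findings
```

With these definitions, `substring_check("subsystem prompts")` is true because the phrase occurs inside longer words, while `strip_phrases` leaves the same text untouched because no match starts and ends on a word boundary.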

scripts/routes/job_routes.py

Three job-description ingestion routes (submit_job, fetch_job_url, load_job_file) now scan the received text via scan_for_safety_alert():

  • Indicators are logged as warnings (not blocking — job descriptions are user-selected content)
  • safety_alert dict is included in the JSON response when findings are present
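The warn-only behaviour described above can be sketched without any web framework. Everything here is illustrative: `build_job_response`, `fake_scan`, and the response keys other than `safety_alert` are assumptions, and `fake_scan` merely stands in for `scan_for_safety_alert()`.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("job_routes_sketch")


def fake_scan(text: str) -> Optional[dict]:
    """Stand-in for scan_for_safety_alert(); flags one hard-coded phrase."""
    if "ignore previous instructions" in text.lower():
        return {"indicators": ["ignore previous instructions"]}
    return None


def build_job_response(
    job_text: str,
    scan: Callable[[str], Optional[dict]] = fake_scan,
) -> dict:
    """Build a JSON-serialisable response; attach safety_alert only on findings."""
    response = {"ok": True, "job_text": job_text}
    alert = scan(job_text)
    if alert is not None:
        # Warn-only: the request still succeeds and the text is still accepted.
        logger.warning("Possible prompt-injection indicators: %s", alert)
        response["safety_alert"] = alert
    return response
```

Because the alert is only logged and attached, the caller decides how to present it; the ingestion itself is never blocked.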

tests/test_prompt_safety.py (new)

21 unit tests covering all three public functions, including:

  • Detection of classic injection phrases
  • Zero-width character detection (new vector)
  • Clean text correctly returns no findings
  • Fragment field truncation to ≤ 500 chars

Test results

343 existing Python tests: all pass
21 new tests in test_prompt_safety.py: all pass
22 layout instruction tests: all pass

Notes

llm-sanitizer must be installed in the cvgen conda environment before running the app or tests:

/usr/local/Caskroom/miniconda/base/envs/cvgen/bin/pip install -e /path/to/llm-sanitizer

The library is at Warnes-Innovations/llm-sanitizer (local sibling repo; AGPL-3.0).

warnes added 23 commits April 20, 2026 19:08
Replace the bespoke 11-pattern substring list in cv_orchestrator.py with
a centralised prompt_safety module backed by the llm-sanitizer library.

Changes
-------
scripts/utils/prompt_safety.py (new)
  - Lazy Scanner singleton (thread-safe module-level initialisation).
  - scan_text_for_injection(text, min_risk) — union of llm-sanitizer rule
    scan and supplementary substring check; covers zero-width chars, base64
    payloads, homoglyphs, data-exfil patterns AND cv-builder-specific phrases
    (system prompt, agent instruction, etc.) that must be detected in plain-
    text DOM fragments (comment/hidden-element text) where surrounding HTML
    syntax is absent.
  - sanitize_instruction_text(text) — strips injection content via
    llm-sanitizer redact() then a word-boundary regex pass; returns
    (cleaned_text, findings) in the legacy cv-builder dict format.
  - scan_for_safety_alert(text, sensitivity): warn-only scanner for
    job-description ingestion; returns an alert dict or None.
scripts/utils/cv_orchestrator.py
  - Remove _LAYOUT_AGENT_INSTRUCTION_PATTERNS constant.
  - _sanitize_layout_instruction_text: replace manual regex loop with
    sanitize_instruction_text(); return shape unchanged.
  - _sanitize_layout_context_html, _sanitize_layout_instruction_html:
    replace the substring-pattern check with scan_text_for_injection();
    BeautifulSoup DOM manipulation unchanged.
scripts/routes/job_routes.py
  - submit_job, fetch_job_url, load_job_file: call scan_for_safety_alert()
    after the job text is extracted; log a warning if indicators found; include
    safety_alert in the JSON response.
tests/test_prompt_safety.py (new)
  - 21 unit tests covering scan_text_for_injection, sanitize_instruction_text
    and scan_for_safety_alert, including a zero-width-char detection test.

All 343 existing Python tests pass; 21 new tests added and passing.
- add __env to repository ignore rules

- prevent accidental commits of local API key environment file

- keep environment guidance in workspace while excluding sensitive content
The llm-sanitizer package is installed locally via `pip install -e` but is
not yet published to PyPI and cannot be added to scripts/requirements.txt.
CI was failing with an ImportError at collection time because
`scripts/utils/prompt_safety.py` imported directly from `llm_sanitizer.*`.

Changes:
- Wrap the three `llm_sanitizer` imports in a try/except ImportError block
  and set `_HAS_LLM_SANITIZER = True/False`.
- Guard the scanner-based passes in `scan_text_for_injection`,
  `sanitize_instruction_text`, and `scan_for_safety_alert` so they
  short-circuit gracefully when the library is absent.
- The fast substring pass in `scan_text_for_injection` always runs,
  preserving injection detection for the most common cv-builder cases
  without the library.
- `min_risk` parameter type changed from `RiskLevel` (import-time default)
  to `Any = None`, resolved to `RiskLevel.high` at call time when the
  library is available.

Local test: 306 passed (CI-equivalent command).
- Correct persona tally consistency in the rollup

- Add technical persona gap synthesis and updated totals

- Normalize markdown formatting for clean lint diagnostics
…date requirements

- Remove optional import logic and guards from prompt_safety.py
- Restore direct imports for RiskLevel, ScanResult, redact, Scanner
- Add llm-sanitizer>=0.1.0 to requirements.txt for CI and local installs
- Update UI browser restore test for new layout review state
- Minor workspace config tweak for uv

All Python tests pass (306/306) in cvgen environment.
…V tab, and first-run onboarding

- GAP-49 (spell-check confirm gate): add _confirmProceedToGenerate() modal
  before CV generation in both empty and full spell-check submit paths;
  shows current ATS score and staleness warning before proceeding
- GAP-30 (cover letter opening style): add formal/hook/narrative opening
  style selector to cover letter form; _OPENING_GUIDANCE dict in backend
  drives per-style LLM prompt; opening_style persisted in session state
- GAP-41 (Master CV tab pre-job): expose 'master' tab in STAGE_TABS.job
  so Master CV editor is accessible before job description is entered
- GAP-36 (first-run onboarding): session-free /api/setup/master-cv-status
  and /api/setup/create-master-cv endpoints; showOnboardingModal() /
  onboardingCreateEmptyProfile() JS functions; onboarding modal HTML in
  index.html; 7 new JS tests covering all onboarding paths
- fix(tests/ui): change Playwright goto wait_until from networkidle to
  load across all 8 conftest fixtures to eliminate a pre-existing flaky
  timeout; _wait_for_ui_ready() already ensures JS readiness
Enhance the onboarding modal (introduced in GAP-36) into a general
welcome/orientation screen shown on every startup until explicitly dismissed.
Both variants now share a complete workflow overview.

Changes:
- web/index.html: restructure onboarding modal body with shared 3-step
  workflow section (colored numbered badges: build master profile -> target
  job -> harvest improvements) and variant-specific CTA banners (green for
  present, amber for missing); "Don't show again" checkbox on present footer
- web/session-manager.js: add _setWelcomeSection(), maybeShowWelcomeModal()
  (checks localStorage dismissal flag, queries /api/setup/master-cv-status,
  shows appropriate section), and closeWelcomeModal() (hides overlay,
  optionally persists dismissal flag); update showOnboardingModal(); export
  2 new symbols
- web/app.js: call maybeShowWelcomeModal() in init() after ensureSessionContext
- web/bundle.js: regenerated (2717.9 KB)
- tests/js/session-manager.test.js: update DOM fixtures; update
  showOnboardingModal test to assert section visibility; add 7 new tests
  covering maybeShowWelcomeModal and closeWelcomeModal; fix localStorage
  cleanup with try/catch

Closes GAP-37
master_data_routes.py omitted certifications from the /api/master-data/full
response, causing the Certifications section in the Master CV editor to
always render empty despite correct data in Master_CV_Data.json.

Changes:
- scripts/routes/master_data_routes.py: add
  "certifications": master.get('certifications', []) to master_data_full()
  response payload

Closes GAP-42
Replace the non-functional undoInstruction() stub with snapshot-based undo.
Before each layout instruction is applied, the current HTML and instruction
list are pushed onto an in-memory undo stack (capped at 20 entries). When
Undo is pressed, the last snapshot is popped and restored: preview iframe,
stateManager CV artifacts, and instruction history are all rolled back.

Changes:
- web/layout-instruction.js: add module-level _layoutUndoStack / _UNDO_STACK_MAX;
  push snapshot in submitLayoutInstruction() before API call; replace stub
  undoInstruction() with stack-pop restore logic
- web/bundle.js: regenerated (2719.1 KB)
- tests/js/layout-instruction.test.js: add submitLayoutInstruction to imports;
  rewrite undoInstruction describe block with 4 tests covering empty-stack
  no-op, snapshot restore of HTML and instructions, system message, and
  absent-layoutInstructions guard
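The snapshot-based undo lives in web/layout-instruction.js; the following is only a language-neutral sketch of the idea, written in Python. The names are hypothetical, and dropping the oldest snapshot when the cap is reached is an assumed eviction policy that the commit message does not spell out.

```python
from collections import deque

UNDO_STACK_MAX = 20
# deque(maxlen=...) discards the oldest entry once the cap is reached (assumption).
undo_stack = deque(maxlen=UNDO_STACK_MAX)


def push_snapshot(html: str, instructions: list[str]) -> None:
    """Record state before an instruction is applied; copy the list defensively."""
    undo_stack.append({"html": html, "instructions": list(instructions)})


def undo():
    """Pop and return the latest snapshot, or None when the stack is empty."""
    if not undo_stack:
        return None  # empty-stack no-op
    return undo_stack.pop()
```

The caller would use the returned snapshot to restore the preview, the stored CV artifacts, and the instruction history together.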

Closes GAP-25
Resolves the 79-record ambiguous set from the initial /specstoryOrganize
pass (55 unique files after cross-workspace duplicates were identified).

Changes:
- Deleted 21 empty/stub sessions (untitled sessions, "File editing
  session" placeholders, and near-empty sessions under 300B)
- Moved 8 personal-research sessions (diabetic neuropathy knowledge
  system work) to ~/CV/llm-history/ instead of a code repo
- Resolved 23 cross-workspace duplicate groups: kept the canonical copy
  in the correct repo, deleted or moved the duplicates accordingly
- Routed 3 single-copy files to their correct repos (vscode-config)
- Annotated all kept copies with specstory-relocated metadata
…ons"; GAP-29 venue warning

GAP-28: cv-template.html conditionally rendered "Publications" when no count
suffix was needed. Heading now always reads "Selected Publications"; the count
suffix (n) is appended only when some publications were omitted.

GAP-29: _format_publications() in cv_orchestrator.py never set venue_warning,
so the .pub-venue-warn icon in the template was never triggered. Now sets
venue_warning to a descriptive message for any entry missing both journal and
booktitle fields; entries with a venue continue to receive an empty string.

Changes:
- templates/cv-template.html: always emit "Selected Publications"; move count
  suffix outside the conditional so it only appears when pubs were omitted
- scripts/utils/cv_orchestrator.py: add venue_warning field to every entry in
  _format_publications() based on presence of journal or booktitle
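The GAP-29 rule can be sketched as below. The function name and the warning text are hypothetical; only the condition (both `journal` and `booktitle` missing) and the empty string for entries with a venue come from the description above.

```python
def add_venue_warnings(publications: list[dict]) -> list[dict]:
    """Return copies of the entries with a venue_warning field set on each."""
    result = []
    for pub in publications:
        has_venue = bool(pub.get("journal") or pub.get("booktitle"))
        # Illustrative message; the real wording is not shown in this PR.
        warning = "" if has_venue else "No journal or booktitle recorded"
        result.append({**pub, "venue_warning": warning})
    return result
```

Setting the field on every entry means the template can test `venue_warning` without checking whether the key exists.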

Closes GAP-28
Closes GAP-29
…ublications" for all

Update cv-template.html and requirements to reflect the correct heading rule:
- "Selected Publications" when a subset of accepted publications is shown
- "Publications" when all accepted publications are shown
- Count suffix (N) never appears in generated documents
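Expressed as a function, the heading rule above might look like this. The template implements it with its own conditional; `publications_heading` and its parameters are hypothetical names used only to make the rule concrete.

```python
def publications_heading(shown: int, accepted: int) -> str:
    """Pick the section heading; no count suffix is ever appended."""
    if shown < accepted:
        return "Selected Publications"  # only a subset is displayed
    return "Publications"               # every accepted publication is displayed
```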

Also fix stale evidence in review-status/hiring-manager.md where two rows
still described venue_warning as unimplemented (fixed by GAP-29, ad9edf0).

Changes:
- templates/cv-template.html: conditional heading in both HTML section and
  ATS plain-text block
- tasks/user-story-hiring-manager.md: US-M7 item 5 and acceptance criteria
  updated to specify the subset/full distinction and prohibit count suffix
- tasks/gaps.md: GAP-24 description and GAP-28 status updated; count context
  removed from recommended resolution
- tasks/ui-review.md: GAP-28 finding marked CLOSED
- tasks/review-status/hiring-manager.md: heading and venue-warning rows
  updated to Pass/Partial with accurate evidence

Amends behavior introduced in ad9edf0.
…compute time

Previously, ats_score (from POST /api/cv/ats-score) and validation_results
(from GET /api/ats-validate) were only written to metadata.json during
POST /api/finalise. If finalise was never called, those values were lost.

Fix: add a _try_patch_metadata() helper to both generation_routes.py and
review_routes.py. After each value is computed and stored in session state,
the helper immediately patches it into the session's metadata.json file.
The helper is fire-and-forget — exceptions are logged as warnings but never
propagated, so no existing error paths change.

Affected routes:
- POST /api/cv/ats-score  → patches {"ats_score": score} into metadata.json
- GET  /api/ats-validate  → patches {"validation_results": {...}} into metadata.json
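A fire-and-forget patch helper of the kind described here could be sketched as follows. This is an assumption-laden illustration: the signature, the logger name, and the choice to start from an empty dict when metadata.json does not yet exist are not taken from the actual `_try_patch_metadata()`.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("metadata_patch_sketch")


def try_patch_metadata(session_dir: Path, patch: dict) -> None:
    """Merge patch into session_dir/metadata.json; log and swallow any failure."""
    try:
        path = Path(session_dir) / "metadata.json"
        data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        data.update(patch)
        path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    except Exception as exc:  # never propagate: existing error paths stay unchanged
        logger.warning("Could not patch metadata.json: %s", exc)
```

A route would call it right after storing the value in session state, for example `try_patch_metadata(session_dir, {"ats_score": score})`.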
showAlertModal and closeAlertModal were defined in both ui-core.js and
ui-helpers.js. The ui-core.js versions targeted non-existent element IDs
(alert-modal, alert-title, alert-message) and created a second hidden
overlay dynamically, making them effectively dead code.

The ui-helpers.js versions are canonical: they target the correct HTML
elements (alert-modal-overlay, alert-modal-title, alert-modal-message)
and integrate with the focus-trap helpers (trapFocus, restoreFocus,
setInitialFocus) exported by ui-core.js.

Changes:
- Remove showAlertModal + closeAlertModal function bodies and export
  from ui-core.js
- Remove stale "also defined in ui-core.js" comment from ui-helpers.js
- Rebuild web/bundle.js
…cknowledgement

Require the user to expand the persuasion warnings panel and click
"Acknowledged" before the "Submit All Decisions" button becomes enabled
and before submitRewriteDecisions() will proceed.

Changes:
- updateRewriteTally(): disable submit button when persuasionWarningsAcknowledged
  is false (in addition to the existing pending > 0 check)
- Acknowledged button onclick: call updateRewriteTally() after setting the
  flag so the submit button re-evaluates immediately
- submitRewriteDecisions(): hard guard — shows an alert and returns early if
  warnings have not been acknowledged (defence in depth)
- Add setPersuasionWarningsAcknowledged() setter for testability
- Update rewrite-review.test.js: set acknowledged=true in the
  "enables submit button when no pending cards remain" test
- Rebuild web/bundle.js
… Review tab

Backend:
- POST /api/cover-letter/save now appends the generated cover letter
  filename to generated_files['files'] in session state so it is
  registered for download and visible to the frontend.
- POST /api/screening/save does the same for the screening responses DOCX.

Frontend (download-tab.js / bundle.js):
- _collectDownloadableFiles() recognises CoverLetter_* and
  Screening_Responses_* filename prefixes and renders them with
  appropriate descriptions instead of the generic DOCX label.

Both artefacts are now served by the existing /api/download/<filename>
route via the generated_files['files'] list lookup.
…abels

SESSION_PHASE_LABELS in utils.js now maps every backend phase enum to a
human-readable title-case string (e.g. 'rewrite_review' → 'Rewrite Review',
'init' → 'Getting Started') rather than lowercase abbreviations.

session-manager.js: loadSessionFile restore message now reads from
SESSION_PHASE_LABELS via an inline lookup instead of embedding the raw
enum value (e.g. '(rewrite_review)' → '(Rewrite Review)').

Tests updated to match:
- session-manager.test.js: restore-message expectation already set to
  '(Rewrite Review)' — now passes.
- session-switcher.test.js: formatSessionPhaseLabel expectations updated
  from short abbreviated values to full canonical labels ('Rewrite Review',
  'Layout Review').

1108/1108 JS tests pass.
On the first analyzeJob() call, append a system message informing the
user that submitted content is sent to the configured LLM provider.
The flag is persisted in localStorage (StorageKeys.LLM_DISCLOSURE_SHOWN)
so the message shows only once per browser profile.

Changes:
- api-client.js: add LLM_DISCLOSURE_SHOWN key to StorageKeys.
- job-analysis.js: import StorageKeys; check+set flag at top of
  analyzeJob() before any fetch.
- tests/js/job-analysis.test.js: two new tests — shows disclosure on
  first call; suppresses it when flag already set.
- tests/js/api-client.test.js: update key-count assertion 5 → 6 and add
  assertion for new LLM_DISCLOSURE_SHOWN key.

1111/1111 JS tests pass.
…ter_data_routes._save_master

The _save_master helper in master_data_routes.py now mirrors the behaviour
already present in web_app.py:

1. Creates a timestamped backup before writing (unchanged).
2. Calls validate_master_data() on the in-memory dict immediately after
   the write.
3. If validation fails, restores the backup and raises ValueError so the
   calling route can surface the error rather than silently persisting
   invalid data.
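The three steps above can be sketched as a backup, write, validate, restore sequence. The code below is a simplified stand-in for `_save_master`: the backup naming scheme, the validator's return contract (a list of error strings), and the `personal_info` check are all assumptions.

```python
import json
import shutil
import time
from pathlib import Path


def validate_master_data(data: dict) -> list[str]:
    """Stand-in validator: returns a list of error strings (assumed contract)."""
    return [] if isinstance(data.get("personal_info"), dict) else ["personal_info missing"]


def save_master(path: Path, data: dict) -> None:
    """Write data to path, rolling back to the previous file if validation fails."""
    backup = path.with_name(f"{path.stem}.{time.strftime('%Y%m%d_%H%M%S')}.bak")
    had_original = path.exists()
    if had_original:
        shutil.copy2(path, backup)                  # 1. timestamped backup
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    errors = validate_master_data(data)             # 2. validate after the write
    if errors:
        if had_original:
            shutil.copy2(backup, path)              # 3. restore previous contents
        raise ValueError("; ".join(errors))
```

Raising `ValueError` lets the calling route report the problem instead of silently keeping invalid data on disk.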

91 Python master-data tests pass.
session-switcher-ui.js:
- Session row button text: 'Delete' → 'Move to Trash'
- Button title attribute: 'Delete session' → 'Move session to Trash'
- Saved-sessions section note: 'rename, or delete saved work' →
  'rename, or move saved work to Trash'

Permanent-deletion labels in the Trash view ('Delete Forever',
'Permanently delete…') are intentionally unchanged — they accurately
describe an irreversible action on already-trashed items.

1111/1111 JS tests pass.
…trol

layout-instruction.js:
- Add `pxToPt(px)` helper (exported): converts CSS px to typographic pt
  using 96 px/in × 72 pt/in convention (1px = 0.75pt), rounded to 1dp
- Rename label from 'Base font size (px):' to 'Base font size:' — unit
  now appears inline in the adjacent pt-display span
- Add `<span id="font-size-pt-display">` next to the input showing the
  current value as 'Npx (M.m pt)'
- Wire `input` event on base-font-size-input to update the span live
- On session-state restore, update the pt span alongside the input value
- Default display: 13 px (9.8 pt)
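The conversion arithmetic is easy to check by hand: 13 px × 0.75 = 9.75 pt, which displays as 9.8 pt. The helper itself is JavaScript; the Python sketch below mirrors the arithmetic only, and uses round-half-up as an assumed rounding mode (Python's built-in `round` rounds halves to even, which would not always agree).

```python
from decimal import Decimal, ROUND_HALF_UP


def px_to_pt(px: float) -> float:
    """Convert CSS px to pt (1 px = 0.75 pt), rounded to one decimal place."""
    pt = Decimal(str(px)) * Decimal("0.75")
    return float(pt.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


def font_size_label(px: float) -> str:
    """Format the inline display, e.g. for the default base font size."""
    return f"{px} px ({px_to_pt(px):.1f} pt)"
```

For the default of 13 px, `font_size_label(13)` produces the same text as the default display listed above.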

tests/js/layout-instruction.test.js:
- Import pxToPt
- Add 5-case `pxToPt` describe block covering default value, round
  numbers, minimum, and rounding-to-1dp behaviour

1116/1116 JS tests pass.
master_data_routes.py (cover_letter_generate):
- Prompt instruction changed from '~300–400 words' to '~250–300 words'

The front-end validation range (250–400) is intentionally left unchanged
in this change — see GAP-HM-04 for planned role-differentiated validation.

40/40 cover-letter Python tests pass.
cv-builder.code-workspace:
- Add ../linkedown as a new sibling-repo workspace folder entry
@warnes warnes merged commit 4d638d5 into devel Apr 22, 2026
8 of 11 checks passed
