Skip to content

[STORY] Dependency Map Repair Job Feedback #352

@jsbattig

Description

@jsbattig

Part of: Dependency Map reliability improvements

[Conversation Reference: "User reported that clicking Repair on the dependency map page gives zero visual feedback -- no progress bar, no activity journal, no Running badge. Full Analysis and Delta Refresh both show live activity journal with progress, but Repair bypasses all three feedback mechanisms. User also confirmed that the background job tracker must be used, and that Claude should report to the journal (which happens automatically once the journal is initialized, since the prompt appendix is already wired but journal_path resolves to None when journal is not initialized)."]

Story Overview

As a CIDX server admin
I want to see real-time progress feedback when running dependency map Repair
So that I know the repair is running, what phase it's in, and when it completes -- just like Full Analysis and Delta Refresh already show.

User Value: Admin gets immediate visual confirmation that repair is working, can monitor which phase is active, and sees Claude's exploration activity in real-time -- eliminating the current "silent black hole" experience.

Root Cause Analysis

Three distinct gaps cause zero feedback during repair:

  1. Journal not initialized -- Repair never calls _activity_journal.init(), so all journal.log() calls are no-ops (swallowed by if not self._active guard). This also means journal_path is None, so the Claude CLI prompt appendix with journal instructions is skipped.
  2. No job tracker / tracking backend registration -- Repair runs in a raw threading.Thread without registering with _job_tracker or updating the tracking backend status, so progress percentage is always 0% and the "Running" badge never appears.
  3. UI not wired -- The Repair button click has no JS handler to reset/show the activity journal panel (unlike Full Analysis/Delta which clear entries, reset offset, and show the panel).

[Conversation Reference: "User said 'you need to make sure also that the background job tracker is used also' and 'you need to make sure also you nudge claude to also report to the journal. I saw it working properly in a full job recently.' Investigation confirmed the journal prompt appendix is already wired in run_pass_2_per_domain() but only activates when journal_path is not None -- which requires journal initialization."]

Implementation Status

  • Initialize activity journal before repair starts (enables Python-side logging + Claude prompt journal instructions)
  • Update tracking backend status to "running" before repair starts, "completed"/"failed" on finish (drives the "Running" badge with blinking green dot)
  • Register repair with job tracker (operation_type="dependency_map_repair") and report progress % at each of the 5 phases
  • Update _get_progress_from_service() in routes to include "dependency_map_repair" operation type
  • Wire Repair button JS to reset/show activity journal panel (same pattern as Full Analysis/Delta)
  • Clean up journal on repair completion
  • Unit tests (0/0 passing)
  • E2E manual testing on staging

Completion: 0/8 tasks complete (0%)

Acceptance Criteria

AC1: Running State Badge

Given the admin is on the Dependency Map page
When they click the Repair button
Then the job status section shows "Running" with the blinking green dot
And the status auto-refreshes every 5 seconds
And the Running state persists until repair completes

Technical Requirements:

  • Update tracking backend status to "running" before repair thread starts
  • Update tracking backend status to "completed" or "failed" when repair finishes
  • Repair must use the same tracking mechanism as Full Analysis/Delta

AC2: Activity Journal Panel Appears

Given the admin clicks the Repair button
When repair starts executing
Then the Activity Journal panel becomes visible
And previous journal entries are cleared
And the offset is reset to 0
And the panel polls for new entries every 3 seconds

Technical Requirements:

  • Add JS handler for Repair button in dependency_map.html that resets journal panel (same pattern as Full Analysis/Delta trigger handler)
  • Initialize _activity_journal with appropriate path before repair executor runs
  • Journal path must exist and be writable so journal.log() calls succeed

AC3: Progress Bar Updates

Given repair is running
When each repair phase completes (Phase 1-5)
Then the progress bar and percentage update accordingly
And progress info text shows the current phase name

Technical Requirements:

  • Register repair job with _job_tracker (operation_type="dependency_map_repair")
  • Call _job_tracker.update_status() with progress milestones at each phase boundary
  • Update _get_progress_from_service() to recognize "dependency_map_repair" alongside "dependency_map_full" and "dependency_map_delta"
  • Suggested milestones: Phase 1 (10-60%), Phase 2 (65%), Phase 3 (70%), Phase 4 (80%), Phase 5 (90%), Complete (100%)

AC4: Claude Reports to Journal

Given repair Phase 1 re-analyzes broken domains via Claude CLI
When Claude receives the Pass 2 prompt
Then the prompt includes activity journal instructions (via _build_activity_journal_appendix)
And Claude appends exploration entries to the journal file
And those entries appear in the Activity Journal panel in real-time

Technical Requirements:

  • This works automatically once journal is initialized -- journal_path will be non-None, so _build_activity_journal_appendix() is included in the Pass 2 prompt
  • Verify the journal path passed through the _build_domain_analyzer() closure is the initialized journal path

AC5: Completion State

Given repair finishes (success or failure)
When the final phase completes
Then the progress bar shows 100%
And the tracking backend status updates to "completed" or "failed"
And the job status badge returns to the health state (Healthy/Needs Repair)
And the activity journal panel hides after 8 seconds (existing behavior)

Technical Requirements:

  • Ensure tracking backend status is updated in finally block (handles both success and failure)
  • Ensure journal is cleaned up (finalized) on completion
  • The existing JS auto-hide logic (X-Journal-Active=0 + 8-second timeout) handles panel dismissal

Key Files

File Change
src/code_indexer/server/web/dependency_map_routes.py Initialize journal, register job tracker, update tracking backend, update progress lookup
src/code_indexer/server/web/templates/dependency_map.html Add JS handler for Repair button to reset/show journal panel
src/code_indexer/server/services/dep_map_repair_executor.py Accept progress callback, report progress at phase boundaries

Technical Context

Existing Patterns to Follow

Full Analysis feedback pattern (in dependency_map_service.py):

  1. _activity_journal.init(staging_dir) -- initializes journal with path
  2. _job_tracker.register_job(...) -- registers with operation_type
  3. _job_tracker.update_status(job_id, progress=N, progress_info="...") -- milestones
  4. Tracking backend updated to "running" / "completed" / "failed"

JS trigger handler pattern (in dependency_map.html):

  • On trigger button click: clear #depmap-activity-entries, reset offset to 0, reinstate hx-trigger="every 3s", call htmx.process(), show panel, reset progress bar to 0%

Journal Initialization

The _activity_journal.init(path) method sets _active=True and creates the _activity.md file at the given path. For repair, an appropriate path would be a temporary directory (similar to how delta uses ~/.tmp/depmap-delta-journal/).

Tracking Backend vs Job Tracker

  • Tracking backend (SQLite): Stores status ("running"/"completed"/"failed") -- drives the "Running" badge in depmap_job_status.html
  • Job tracker: Stores operation_type, progress %, progress_info -- drives the progress bar via _get_progress_from_service() and X-Journal-Progress headers

Both must be updated for complete feedback.

Testing Requirements

Unit Test Coverage

  • Test journal is initialized before repair starts
  • Test tracking backend status updates to "running" then "completed"/"failed"
  • Test job tracker registration with operation_type="dependency_map_repair"
  • Test _get_progress_from_service() recognizes repair operation type
  • Test progress milestones are reported at each phase
  • Test journal cleanup on completion

E2E Manual Testing

  • Deploy to staging, ensure at least 1 anomaly exists
  • Click Repair, observe Running badge appears
  • Observe Activity Journal panel with entries
  • Observe progress bar advancing through phases
  • Observe Claude exploration entries in journal (Phase 1 only -- when broken domains exist)
  • Wait for completion, verify badge returns to health state
  • Verify journal file exists on disk with entries via SSH

Error Handling

  • If journal initialization fails, repair should still proceed (log warning, but don't block repair)
  • If job tracker registration fails, repair should still proceed (degrade gracefully to no-progress feedback)
  • Tracking backend status MUST be updated in finally block to prevent stuck "Running" state

Definition of Done

  • All acceptance criteria satisfied with evidence
  • Unit tests passing
  • fast-automation.sh passes
  • Code review approved
  • Manual E2E testing on staging server
  • No lint/type errors
  • Working software deployed and verified

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions