Description
Part of: Dependency Map reliability improvements
[Conversation Reference: "User reported that clicking Repair on the dependency map page gives zero visual feedback -- no progress bar, no activity journal, no Running badge. Full Analysis and Delta Refresh both show live activity journal with progress, but Repair bypasses all three feedback mechanisms. User also confirmed that the background job tracker must be used, and that Claude should report to the journal (which happens automatically once the journal is initialized, since the prompt appendix is already wired but journal_path resolves to None when journal is not initialized)."]
Story Overview
As a CIDX server admin
I want to see real-time progress feedback when running dependency map Repair
So that I know the repair is running, what phase it's in, and when it completes -- just like Full Analysis and Delta Refresh already show.
User Value: Admin gets immediate visual confirmation that repair is working, can monitor which phase is active, and sees Claude's exploration activity in real-time -- eliminating the current "silent black hole" experience.
Root Cause Analysis
Three distinct gaps cause zero feedback during repair:
- Journal not initialized -- Repair never calls `_activity_journal.init()`, so all `journal.log()` calls are no-ops (swallowed by the `if not self._active` guard). This also means `journal_path` is `None`, so the Claude CLI prompt appendix with journal instructions is skipped.
- No job tracker / tracking backend registration -- Repair runs in a raw `threading.Thread` without registering with `_job_tracker` or updating the tracking backend status, so progress percentage is always 0% and the "Running" badge never appears.
- UI not wired -- The Repair button click has no JS handler to reset/show the activity journal panel (unlike Full Analysis/Delta which clear entries, reset offset, and show the panel).
[Conversation Reference: "User said 'you need to make sure also that the background job tracker is used also' and 'you need to make sure also you nudge claude to also report to the journal. I saw it working properly in a full job recently.' Investigation confirmed the journal prompt appendix is already wired in run_pass_2_per_domain() but only activates when journal_path is not None -- which requires journal initialization."]
Implementation Status
- Initialize activity journal before repair starts (enables Python-side logging + Claude prompt journal instructions)
- Update tracking backend status to "running" before repair starts, "completed"/"failed" on finish (drives the "Running" badge with blinking green dot)
- Register repair with job tracker (operation_type="dependency_map_repair") and report progress % at each of the 5 phases
- Update `_get_progress_from_service()` in routes to include "dependency_map_repair" operation type
- Wire Repair button JS to reset/show activity journal panel (same pattern as Full Analysis/Delta)
- Clean up journal on repair completion
- Unit tests (0/0 passing)
- E2E manual testing on staging
Completion: 0/8 tasks complete (0%)
Acceptance Criteria
AC1: Running State Badge
Given the admin is on the Dependency Map page
When they click the Repair button
Then the job status section shows "Running" with the blinking green dot
And the status auto-refreshes every 5 seconds
And the Running state persists until repair completes
Technical Requirements:
- Update tracking backend status to "running" before repair thread starts
- Update tracking backend status to "completed" or "failed" when repair finishes
- Repair must use the same tracking mechanism as Full Analysis/Delta
AC2: Activity Journal Panel Appears
Given the admin clicks the Repair button
When repair starts executing
Then the Activity Journal panel becomes visible
And previous journal entries are cleared
And the offset is reset to 0
And the panel polls for new entries every 3 seconds
Technical Requirements:
- Add JS handler for Repair button in `dependency_map.html` that resets journal panel (same pattern as Full Analysis/Delta trigger handler)
- Initialize `_activity_journal` with appropriate path before repair executor runs
- Journal path must exist and be writable so `journal.log()` calls succeed
AC3: Progress Bar Updates
Given repair is running
When each repair phase completes (Phase 1-5)
Then the progress bar and percentage update accordingly
And progress info text shows the current phase name
Technical Requirements:
- Register repair job with `_job_tracker` (operation_type="dependency_map_repair")
- Call `_job_tracker.update_status()` with progress milestones at each phase boundary
- Update `_get_progress_from_service()` to recognize "dependency_map_repair" alongside "dependency_map_full" and "dependency_map_delta"
- Suggested milestones: Phase 1 (10-60%), Phase 2 (65%), Phase 3 (70%), Phase 4 (80%), Phase 5 (90%), Complete (100%)
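The milestone scheme above can be sketched as follows. This is only an illustration under assumptions: `REPAIR_MILESTONES`, `phase_1_progress()`, and `is_depmap_operation()` are hypothetical names, and interpolating Phase 1 by the number of domains re-analyzed is one possible reading of the "10-60%" range, not something the story prescribes.

```python
# Hypothetical milestone table taken from the suggested percentages.
# Phase 1 covers a range (10-60%); later phases are single checkpoints.
REPAIR_MILESTONES = {
    "phase_1_start": 10,
    "phase_1_end": 60,
    "phase_2": 65,
    "phase_3": 70,
    "phase_4": 80,
    "phase_5": 90,
    "complete": 100,
}

# Operation types the progress lookup would need to recognize.
DEPMAP_OPERATION_TYPES = (
    "dependency_map_full",
    "dependency_map_delta",
    "dependency_map_repair",
)


def phase_1_progress(done: int, total: int) -> int:
    """Interpolate Phase 1 progress between its start and end milestones.

    Assumption: progress is proportional to domains re-analyzed so far.
    """
    start = REPAIR_MILESTONES["phase_1_start"]
    end = REPAIR_MILESTONES["phase_1_end"]
    if total <= 0:
        # Nothing to re-analyze, so Phase 1 is effectively finished.
        return end
    return start + (end - start) * min(done, total) // total


def is_depmap_operation(operation_type: str) -> bool:
    """Return True for any dependency map job, including repair."""
    return operation_type in DEPMAP_OPERATION_TYPES
```

A repair executor that accepts a progress callback could pass these values to `_job_tracker.update_status()` at each phase boundary, together with the phase name as `progress_info`.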
AC4: Claude Reports to Journal
Given repair Phase 1 re-analyzes broken domains via Claude CLI
When Claude receives the Pass 2 prompt
Then the prompt includes activity journal instructions (via _build_activity_journal_appendix)
And Claude appends exploration entries to the journal file
And those entries appear in the Activity Journal panel in real-time
Technical Requirements:
- This works automatically once journal is initialized -- `journal_path` will be non-None, so `_build_activity_journal_appendix()` is included in the Pass 2 prompt
- Verify the journal path passed through the `_build_domain_analyzer()` closure is the initialized journal path
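The dependency on `journal_path` described above can be illustrated with a small sketch. Both functions are hypothetical stand-ins: the real `_build_activity_journal_appendix()` and Pass 2 prompt builder are not shown in this story, and the appendix wording here is invented for illustration.

```python
from pathlib import Path
from typing import Optional


def build_journal_appendix(journal_path: Path) -> str:
    """Stand-in for _build_activity_journal_appendix(); wording is invented."""
    return f"\n\nAppend a short note on each exploration step to: {journal_path}\n"


def build_pass_2_prompt(base_prompt: str, journal_path: Optional[Path]) -> str:
    """Attach journal instructions only when a journal path is available."""
    if journal_path is None:
        # Journal never initialized: the appendix is silently skipped,
        # which is the behavior Repair exhibits today.
        return base_prompt
    return base_prompt + build_journal_appendix(journal_path)
```

With this shape, initializing the journal before the repair executor runs is enough to switch the appendix on; no change to the prompt code itself is needed.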
AC5: Completion State
Given repair finishes (success or failure)
When the final phase completes
Then the progress bar shows 100%
And the tracking backend status updates to "completed" or "failed"
And the job status badge returns to the health state (Healthy/Needs Repair)
And the activity journal panel hides after 8 seconds (existing behavior)
Technical Requirements:
- Ensure tracking backend status is updated in `finally` block (handles both success and failure)
- Ensure journal is cleaned up (finalized) on completion
- The existing JS auto-hide logic (X-Journal-Active=0 + 8-second timeout) handles panel dismissal
Key Files
| File | Change |
|---|---|
| `src/code_indexer/server/web/dependency_map_routes.py` | Initialize journal, register job tracker, update tracking backend, update progress lookup |
| `src/code_indexer/server/web/templates/dependency_map.html` | Add JS handler for Repair button to reset/show journal panel |
| `src/code_indexer/server/services/dep_map_repair_executor.py` | Accept progress callback, report progress at phase boundaries |
Technical Context
Existing Patterns to Follow
Full Analysis feedback pattern (in `dependency_map_service.py`):
- `_activity_journal.init(staging_dir)` -- initializes journal with path
- `_job_tracker.register_job(...)` -- registers with operation_type
- `_job_tracker.update_status(job_id, progress=N, progress_info="...")` -- milestones
- Tracking backend updated to "running" / "completed" / "failed"
JS trigger handler pattern (in `dependency_map.html`):
- On trigger button click: clear `#depmap-activity-entries`, reset offset to 0, reinstate `hx-trigger="every 3s"`, call `htmx.process()`, show panel, reset progress bar to 0%
Journal Initialization
The _activity_journal.init(path) method sets _active=True and creates the _activity.md file at the given path. For repair, an appropriate path would be a temporary directory (similar to how delta uses ~/.tmp/depmap-delta-journal/).
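A minimal sketch of the behavior described above, assuming a journal object shaped roughly like the real one. `ActivityJournalSketch` and the `depmap-repair-journal-` directory prefix are hypothetical; only the `_active` guard and the `_activity.md` file name come from this story.

```python
import tempfile
from pathlib import Path
from typing import Optional


class ActivityJournalSketch:
    """Hypothetical stand-in for the real activity journal."""

    def __init__(self) -> None:
        self._active = False
        self._path: Optional[Path] = None

    @property
    def journal_path(self) -> Optional[Path]:
        # Stays None until init() has been called.
        return self._path

    def init(self, directory: Path) -> Path:
        """Create _activity.md in the given directory and activate logging."""
        directory.mkdir(parents=True, exist_ok=True)
        self._path = directory / "_activity.md"
        self._path.write_text("", encoding="utf-8")
        self._active = True
        return self._path

    def log(self, message: str) -> None:
        """Append one entry; a no-op while the journal is inactive."""
        if not self._active or self._path is None:
            return
        with self._path.open("a", encoding="utf-8") as fh:
            fh.write(f"- {message}\n")


journal = ActivityJournalSketch()
journal.log("dropped")  # swallowed: init() has not run yet

# Assumed location: a fresh temporary directory dedicated to repair.
repair_dir = Path(tempfile.mkdtemp(prefix="depmap-repair-journal-"))
journal.init(repair_dir)
journal.log("Phase 1: re-analyzing broken domains")
```

The first `log()` call shows the current Repair behavior: nothing is written and `journal_path` is still `None`. After `init()`, entries land in `_activity.md`, which is the file the panel would poll.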
Tracking Backend vs Job Tracker
- Tracking backend (SQLite): Stores status ("running"/"completed"/"failed") -- drives the "Running" badge in `depmap_job_status.html`
- Job tracker: Stores operation_type, progress %, progress_info -- drives the progress bar via `_get_progress_from_service()` and `X-Journal-Progress` headers
Both must be updated for complete feedback.
Testing Requirements
Unit Test Coverage
- Test journal is initialized before repair starts
- Test tracking backend status updates to "running" then "completed"/"failed"
- Test job tracker registration with operation_type="dependency_map_repair"
- Test `_get_progress_from_service()` recognizes repair operation type
- Test progress milestones are reported at each phase
- Test journal cleanup on completion
E2E Manual Testing
- Deploy to staging, ensure at least 1 anomaly exists
- Click Repair, observe Running badge appears
- Observe Activity Journal panel with entries
- Observe progress bar advancing through phases
- Observe Claude exploration entries in journal (Phase 1 only -- when broken domains exist)
- Wait for completion, verify badge returns to health state
- Verify journal file exists on disk with entries via SSH
Error Handling
- If journal initialization fails, repair should still proceed (log warning, but don't block repair)
- If job tracker registration fails, repair should still proceed (degrade gracefully to no-progress feedback)
- Tracking backend status MUST be updated in `finally` block to prevent stuck "Running" state
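The three rules above can be combined in one wrapper, sketched below. Every name here is hypothetical: the callables stand in for the real journal, job tracker, tracking backend, and repair executor, whose actual signatures are not given in this story.

```python
import logging
from typing import Callable

logger = logging.getLogger("depmap.repair")


def best_effort(step: str, fn: Callable[[], None]) -> bool:
    """Run an optional setup step; log a warning instead of raising."""
    try:
        fn()
        return True
    except Exception:
        logger.warning("%s failed; continuing repair without it", step, exc_info=True)
        return False


def run_repair_with_feedback(
    init_journal: Callable[[], None],
    register_job: Callable[[], None],
    set_status: Callable[[str], None],
    execute_repair: Callable[[], None],
) -> str:
    """Run repair so that the status never stays at "running"."""
    final_status = "failed"  # pessimistic default until repair succeeds
    set_status("running")
    try:
        # Feedback plumbing is optional: failures here must not block repair.
        best_effort("journal initialization", init_journal)
        best_effort("job tracker registration", register_job)
        execute_repair()
        final_status = "completed"
    finally:
        # Runs on success and on any exception raised by execute_repair().
        set_status(final_status)
    return final_status
```

Defaulting `final_status` to "failed" means an exception anywhere inside the `try` body still produces a terminal status, so the badge cannot remain stuck on "Running".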
Definition of Done
- All acceptance criteria satisfied with evidence
- Unit tests passing
- fast-automation.sh passes
- Code review approved
- Manual E2E testing on staging server
- No lint/type errors
- Working software deployed and verified