[STORY] Debounced cidx-meta Refresh on Batch Repository Registration

# Story: Debounced cidx-meta Refresh on Batch Repository Registration

**As a** CIDX server administrator
**I want to** have all batch-registered repositories become searchable via cidx-meta after auto-discovery completes
**So that** newly discovered repos are reliably indexed without manual intervention, even when many repos are registered in rapid succession

---

## Problem Statement

When multiple repositories are registered in rapid succession (e.g., via the auto-discovery feature), each registration triggers `on_repo_added()` in `meta_description_hook.py` (line 178), which calls `_refresh_scheduler.trigger_refresh_for_repo("cidx-meta-global")`. The first call succeeds, but all subsequent calls during the same refresh cycle raise `DuplicateJobError` from `background_jobs.py` (line 268) because a `global_repo_refresh` job is already running for `cidx-meta-global`.

The current code catches this exception and logs a warning (line 183) but takes no further action. The `.md` description files are created on disk for all repos, but the cidx-meta index reflects only the state at the time of the first (successful) refresh. There is no retry or catch-up mechanism, so descriptions for the rejected repos stay unindexed until something else happens to trigger a reindex.

**Conversation reference**: User observed the warning `Failed to trigger cidx-meta refresh for [repo]: A 'global_repo_refresh' job is already running for repository 'cidx-meta-global'` after registering multiple repos via auto-discovery.
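
The failure mode can be reproduced in isolation. The sketch below is an illustration only: `FakeRefreshScheduler` and `register_batch` are hypothetical stand-ins (they are not part of the codebase), and `DuplicateJobError` is redefined locally so the snippet runs on its own. It mimics the current behavior of warning and moving on.

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("meta_description_hook_sketch")


class DuplicateJobError(Exception):
    """Local stand-in for the exception defined in background_jobs.py."""


class FakeRefreshScheduler:
    """Hypothetical scheduler: accepts one refresh, rejects others while it runs."""

    def __init__(self):
        self.running = False
        self.accepted = 0

    def trigger_refresh_for_repo(self, alias):
        if self.running:
            raise DuplicateJobError(
                f"A 'global_repo_refresh' job is already running for repository '{alias}'"
            )
        self.running = True
        self.accepted += 1


def register_batch(scheduler, repo_names):
    """Mimic the current tail of on_repo_added(): warn on failure, no retry."""
    skipped = []
    for repo_name in repo_names:
        try:
            scheduler.trigger_refresh_for_repo("cidx-meta-global")
        except Exception as exc:
            logger.warning(
                "Failed to trigger cidx-meta refresh for %s: %s", repo_name, exc
            )
            skipped.append(repo_name)
    return skipped


scheduler = FakeRefreshScheduler()
skipped = register_batch(scheduler, ["repo-a", "repo-b", "repo-c"])
print(scheduler.accepted, skipped)  # prints: 1 ['repo-b', 'repo-c']
```

Only the first registration results in a refresh; the other two are logged and never revisited, which is the gap this story closes.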

---

## Implementation Status

- [ ] `CidxMetaRefreshDebouncer` class with dirty flag and timer thread (`meta_description_hook.py`)
- [ ] Debounce timer logic: set dirty, reset timer on each signal, fire single refresh on expiry
- [ ] Integration into `on_repo_added()`: catch `DuplicateJobError` and signal debouncer instead of just warning
- [ ] Integration into `on_repo_removed()`: signal debouncer on `DuplicateJobError` (same pattern)
- [ ] Module-level lifecycle: `set_refresh_scheduler()` initializes debouncer, shutdown method for clean stop
- [ ] Thread safety: lock protecting dirty flag and timer reset operations
- [ ] Unit tests for debouncer logic (timer reset, coalescing, single-fire behavior)
- [ ] Unit tests for `on_repo_added()` / `on_repo_removed()` integration with debouncer
- [ ] Integration test simulating batch registration with DuplicateJobError scenario
- [ ] E2E manual testing: batch register repos, verify all descriptions become searchable

**Completion:** 0/10 tasks complete (0%)

---

## Algorithm

```
CidxMetaRefreshDebouncer:
  _dirty = False
  _timer = None
  _lock = threading.Lock()
  _debounce_seconds = 30  (configurable, default 30s)
  _refresh_scheduler = RefreshScheduler reference
  _shutdown = False

  signal_dirty():
    WITH _lock:
      _dirty = True
      IF _timer is not None:
        _timer.cancel()
      IF NOT _shutdown:
        _timer = threading.Timer(_debounce_seconds, _on_timer_expired)
        _timer.daemon = True
        _timer.start()
      LOG debug "cidx-meta marked dirty, debounce timer (re)started"

  _on_timer_expired():
    WITH _lock:
      IF NOT _dirty OR _shutdown:
        RETURN
      _dirty = False
      _timer = None
    # Outside lock: trigger the actual refresh
    TRY:
      _refresh_scheduler.trigger_refresh_for_repo("cidx-meta-global")
      LOG info "Debounced cidx-meta refresh triggered successfully"
    EXCEPT DuplicateJobError:
      # Still running from a previous cycle — re-mark dirty and retry later
      WITH _lock:
        _dirty = True
        IF NOT _shutdown:
          _timer = threading.Timer(_debounce_seconds, _on_timer_expired)
          _timer.daemon = True
          _timer.start()
      LOG info "cidx-meta refresh still running, will retry after debounce"
    EXCEPT Exception as e:
      LOG warning "Debounced cidx-meta refresh failed: {e}"

  shutdown():
    WITH _lock:
      _shutdown = True
      IF _timer is not None:
        _timer.cancel()
        _timer = None

on_repo_added (modified flow, lines 177-183):
  # After creating .md file...
  IF _refresh_scheduler is not None:
    TRY:
      _refresh_scheduler.trigger_refresh_for_repo("cidx-meta-global")
      LOG info "Triggered cidx-meta refresh after adding {repo_name}"
    EXCEPT DuplicateJobError:
      # Refresh already running — signal debouncer for deferred retry
      IF _debouncer is not None:
        _debouncer.signal_dirty()
        LOG info "cidx-meta refresh deferred (debounced) for {repo_name}"
      ELSE:
        LOG warning "cidx-meta refresh skipped for {repo_name}: no debouncer"
    EXCEPT Exception as e:
      LOG warning "Failed to trigger cidx-meta refresh for {repo_name}: {e}"

on_repo_removed (modified flow, lines 211-221):
  # Same pattern as on_repo_added — catch DuplicateJobError, signal debouncer
```
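
The pseudocode above could translate to Python roughly as follows. This is a sketch under stated assumptions, not the final implementation: `DuplicateJobError` is redefined locally so the snippet is self-contained (the real one lives in `background_jobs.py`), the constructor signature is assumed, and `_start_timer_locked` is a hypothetical helper introduced only to avoid repeating the timer setup.

```python
import logging
import threading

logger = logging.getLogger(__name__)


class DuplicateJobError(Exception):
    """Local stand-in for the exception defined in background_jobs.py."""


class CidxMetaRefreshDebouncer:
    """Coalesce many 'dirty' signals into one deferred cidx-meta refresh."""

    def __init__(self, refresh_scheduler, debounce_seconds=30.0):
        self._refresh_scheduler = refresh_scheduler
        self._debounce_seconds = debounce_seconds
        self._lock = threading.Lock()
        self._dirty = False
        self._timer = None
        self._shutdown = False

    def _start_timer_locked(self):
        # Hypothetical helper; the caller must already hold self._lock.
        self._timer = threading.Timer(self._debounce_seconds, self._on_timer_expired)
        self._timer.daemon = True
        self._timer.start()

    def signal_dirty(self):
        with self._lock:
            self._dirty = True
            # Cancel any pending timer so the interval restarts from now.
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            if not self._shutdown:
                self._start_timer_locked()
        logger.debug("cidx-meta marked dirty, debounce timer (re)started")

    def _on_timer_expired(self):
        with self._lock:
            if not self._dirty or self._shutdown:
                return
            self._dirty = False
            self._timer = None
        # Outside the lock: the refresh call may take a while.
        try:
            self._refresh_scheduler.trigger_refresh_for_repo("cidx-meta-global")
            logger.info("Debounced cidx-meta refresh triggered successfully")
        except DuplicateJobError:
            # Previous refresh still running: re-mark dirty and retry later.
            with self._lock:
                self._dirty = True
                if not self._shutdown:
                    self._start_timer_locked()
            logger.info("cidx-meta refresh still running, will retry after debounce")
        except Exception as exc:
            logger.warning("Debounced cidx-meta refresh failed: %s", exc)

    def shutdown(self):
        with self._lock:
            self._shutdown = True
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

Passing a small `debounce_seconds` (for example `0.05`) and a scheduler double that records calls makes the coalescing behavior observable in tests without waiting 30 seconds.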

---

## Acceptance Criteria

```gherkin
Scenario: Single repo registration triggers immediate refresh
  Given the CIDX server is running with no active cidx-meta refresh jobs
  When a single golden repository is registered via auto-discovery
  Then on_repo_added creates the .md description file in cidx-meta
  And trigger_refresh_for_repo succeeds immediately (no DuplicateJobError)
  And the new repo description becomes searchable via cidx-meta query

Scenario: Batch registration coalesces into one deferred refresh
  Given the CIDX server is running
  And a cidx-meta refresh job is already running from the first registration
  When 5 additional repositories are registered in rapid succession
  Then each on_repo_added creates its .md description file on disk
  And each DuplicateJobError is caught and signals the debouncer
  And the debouncer resets its timer on each signal (coalescing)
  And exactly one deferred refresh fires after the debounce interval expires
  And all 6 repo descriptions (first + 5 deferred) are searchable after the deferred refresh completes

Scenario: Debounce timer resets on each new registration
  Given the debouncer has been signaled and a timer is running
  When another repository is registered before the timer expires
  Then the timer is cancelled and restarted with the full debounce interval
  And only one refresh fires after the final registration plus debounce interval

Scenario: Deferred refresh retries if still blocked
  Given the debounce timer has expired and the debouncer attempts a refresh
  When the refresh attempt also raises DuplicateJobError (previous refresh still running)
  Then the debouncer re-marks itself dirty
  And starts another debounce timer for a subsequent retry
  And eventually succeeds when the running refresh completes

Scenario: Server shutdown cancels debounce timer cleanly
  Given the debouncer has a pending timer
  When the server is shutting down
  Then the debounce timer is cancelled
  And no refresh is attempted after shutdown begins
  And no threads are left running
```

---

## Testing Requirements

### Unit Tests
- `CidxMetaRefreshDebouncer.signal_dirty()`: verify dirty flag set, timer started
- `CidxMetaRefreshDebouncer.signal_dirty()` called multiple times: verify timer resets (only one timer active)
- `CidxMetaRefreshDebouncer._on_timer_expired()`: verify dirty flag cleared, `trigger_refresh_for_repo` called exactly once
- `CidxMetaRefreshDebouncer._on_timer_expired()` with DuplicateJobError: verify re-marks dirty and schedules retry
- `CidxMetaRefreshDebouncer.shutdown()`: verify timer cancelled, no further signals processed
- `on_repo_added()` with DuplicateJobError: verify debouncer.signal_dirty() called instead of just warning
- `on_repo_removed()` with DuplicateJobError: verify debouncer.signal_dirty() called
- Thread safety: concurrent `signal_dirty()` calls do not corrupt state
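
The hook-integration cases above can be checked without timers by mocking both collaborators. The sketch below is illustrative only: `trigger_or_defer` is a hypothetical helper that mirrors the modified tail of `on_repo_added()`, its string return values exist purely to make assertions easy, and `DuplicateJobError` is redefined locally so the snippet runs standalone.

```python
import logging
from unittest.mock import Mock

logger = logging.getLogger(__name__)


class DuplicateJobError(Exception):
    """Local stand-in for the exception defined in background_jobs.py."""


def trigger_or_defer(refresh_scheduler, debouncer, repo_name):
    """Hypothetical helper mirroring the modified tail of on_repo_added()."""
    if refresh_scheduler is None:
        return "no-scheduler"
    try:
        refresh_scheduler.trigger_refresh_for_repo("cidx-meta-global")
        return "triggered"
    except DuplicateJobError:
        if debouncer is not None:
            debouncer.signal_dirty()
            return "deferred"
        logger.warning("cidx-meta refresh skipped for %s: no debouncer", repo_name)
        return "skipped"
    except Exception as exc:
        logger.warning("Failed to trigger cidx-meta refresh for %s: %s", repo_name, exc)
        return "failed"


def test_duplicate_job_signals_debouncer():
    scheduler = Mock()
    scheduler.trigger_refresh_for_repo.side_effect = DuplicateJobError("already running")
    debouncer = Mock()

    assert trigger_or_defer(scheduler, debouncer, "repo-a") == "deferred"
    debouncer.signal_dirty.assert_called_once_with()


def test_successful_trigger_does_not_touch_debouncer():
    scheduler = Mock()
    debouncer = Mock()

    assert trigger_or_defer(scheduler, debouncer, "repo-a") == "triggered"
    scheduler.trigger_refresh_for_repo.assert_called_once_with("cidx-meta-global")
    debouncer.signal_dirty.assert_not_called()


def test_unexpected_error_is_not_debounced():
    scheduler = Mock()
    scheduler.trigger_refresh_for_repo.side_effect = RuntimeError("boom")
    debouncer = Mock()

    assert trigger_or_defer(scheduler, debouncer, "repo-a") == "failed"
    debouncer.signal_dirty.assert_not_called()
```

These functions follow pytest naming, so they would be collected automatically if placed in a test module.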

### Integration Tests
- Simulate batch of 5 `on_repo_added()` calls where first succeeds and rest get DuplicateJobError
- Verify exactly one deferred refresh is triggered after debounce interval
- Verify all .md files are present on disk after the batch
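
One way to simulate the batch scenario is sketched below. Everything in it is an assumption made for illustration: `BusyThenFreeScheduler` is a hypothetical fake, `MiniDebouncer` is a deliberately condensed stand-in for `CidxMetaRefreshDebouncer` (no retry, no shutdown handling), and the loop body stands in for the refresh-triggering tail of `on_repo_added()`. It does not create `.md` files, so the on-disk check still needs the real hook.

```python
import threading
import time


class DuplicateJobError(Exception):
    """Local stand-in for the exception defined in background_jobs.py."""


class BusyThenFreeScheduler:
    """Hypothetical fake: rejects triggers while `busy` is True."""

    def __init__(self):
        self.busy = False
        self.accepted = 0

    def trigger_refresh_for_repo(self, alias):
        if self.busy:
            raise DuplicateJobError(alias)
        self.accepted += 1


class MiniDebouncer:
    """Condensed stand-in for the debouncer: reset the timer on every signal."""

    def __init__(self, scheduler, debounce_seconds):
        self._scheduler = scheduler
        self._debounce_seconds = debounce_seconds
        self._lock = threading.Lock()
        self._timer = None
        self.signals = 0

    def signal_dirty(self):
        with self._lock:
            self.signals += 1
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._debounce_seconds, self._fire)
            self._timer.daemon = True
            self._timer.start()

    def _fire(self):
        self._scheduler.trigger_refresh_for_repo("cidx-meta-global")


def simulate_batch(repo_count=5, debounce_seconds=0.1):
    """Return (accepted refreshes, debouncer signals) for one simulated batch."""
    scheduler = BusyThenFreeScheduler()
    debouncer = MiniDebouncer(scheduler, debounce_seconds)
    for _ in range(repo_count):
        try:
            scheduler.trigger_refresh_for_repo("cidx-meta-global")
            scheduler.busy = True  # the first refresh is now "running"
        except DuplicateJobError:
            debouncer.signal_dirty()
    scheduler.busy = False  # the first refresh finishes
    time.sleep(debounce_seconds * 5)  # give the debounce timer time to expire
    return scheduler.accepted, debouncer.signals


accepted, signals = simulate_batch()
print(accepted, signals)  # prints: 2 4
```

Two accepted refreshes means one immediate refresh plus exactly one deferred refresh, even though four registrations were rejected.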

### E2E Manual Testing
- Start local CIDX server at localhost:8000
- Register multiple repos using auto-discovery or sequential add_golden_repo calls
- Observe logs: first refresh succeeds, subsequent ones are debounced (not just warned)
- Wait for debounce interval to expire
- Query cidx-meta for descriptions of all registered repos
- Verify all repos are discoverable

---

## Key Files

| File | Relevance |
|------|-----------|
| `src/code_indexer/global_repos/meta_description_hook.py` | Primary change: add CidxMetaRefreshDebouncer, modify on_repo_added/on_repo_removed |
| `src/code_indexer/server/repositories/background_jobs.py` | Read-only: DuplicateJobError (lines 40-50), conflict detection (lines 164-189) |
| `src/code_indexer/global_repos/refresh_scheduler.py` | Read-only: trigger_refresh_for_repo (lines 592-631), _submit_refresh_job (lines 803-838) |
| `tests/unit/global_repos/test_meta_description_hook.py` | New/modified: unit tests for debouncer and integration |

---

## Definition of Done

- [ ] All acceptance criteria satisfied
- [ ] >90% unit test coverage for new debouncer class and modified hook functions
- [ ] Integration tests passing (batch registration scenario)
- [ ] E2E manual testing completed by Claude Code (batch register + verify searchability)
- [ ] Code review approved (tdd-engineer + code-reviewer workflow)
- [ ] No lint/type errors (ruff, black, mypy clean)
- [ ] fast-automation.sh passes with zero failures
- [ ] No regressions to existing single-repo registration flow
- [ ] Working software deployable to staging/production

