[codex] Reclaim abandoned queue reservations#1846

Open
RoyZhao1991 wants to merge 1 commit into orchestration-agent:main from RoyZhao1991:codex/reclaim-abandoned-reservations-1835

Conversation

@RoyZhao1991

Summary

  • adds reservation metadata to dequeued scheduler jobs, including worker owner, expiry, and a reservation token
  • idempotently reclaims reservations abandoned through worker disconnect or expiry, returning each task to its original queue without changing its task id
  • rejects stale completion/failure attempts that present an old reservation token after a task is redelivered
  • records audit events for reservations, reclaims, rejections, completions, and retries without including task payload data
  • exports AgentStatus and uses a re-entrant metrics lock so the current full suite passes on main
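
For reviewers, here is a minimal sketch of the reservation lifecycle described above: reserve on dequeue, reclaim idempotently, and reject completions that carry a stale token. It is not the code in this PR. `SketchScheduler`, `Task`, `Reservation`, the method names, and the `ttl` parameter are all hypothetical names chosen for illustration, and the real scheduler may structure this differently.

```python
import secrets
import time
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reservation:
    worker_id: str      # worker that currently owns the task
    token: str          # fresh random token per delivery
    expires_at: float   # deadline after which the task may be reclaimed


@dataclass
class Task:
    task_id: str
    queue: str
    reservation: Optional[Reservation] = None


class SketchScheduler:
    """Hypothetical in-memory scheduler; illustrative only."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self.queues = {}     # queue name -> deque of pending tasks
        self.reserved = {}   # task_id -> reserved task
        self.audit = []      # (event, task_id) pairs; never the payload

    def enqueue(self, task):
        self.queues.setdefault(task.queue, deque()).append(task)

    def dequeue(self, queue, worker_id, now=None):
        now = time.monotonic() if now is None else now
        pending = self.queues.get(queue)
        if not pending:
            return None
        task = pending.popleft()
        task.reservation = Reservation(worker_id, secrets.token_hex(8), now + self.ttl)
        self.reserved[task.task_id] = task
        self.audit.append(("reserved", task.task_id))
        return task

    def reclaim(self, task_id):
        # Idempotent: a second reclaim of the same task is a no-op.
        task = self.reserved.pop(task_id, None)
        if task is None:
            return False
        task.reservation = None
        # Same task object and task id go back to the original queue.
        self.queues.setdefault(task.queue, deque()).append(task)
        self.audit.append(("reclaimed", task_id))
        return True

    def reclaim_expired(self, now=None):
        now = time.monotonic() if now is None else now
        # Collect ids first so the dict is not mutated while iterating.
        expired = [tid for tid, t in self.reserved.items()
                   if t.reservation.expires_at <= now]
        return sum(self.reclaim(tid) for tid in expired)

    def reclaim_worker(self, worker_id):
        owned = [tid for tid, t in self.reserved.items()
                 if t.reservation.worker_id == worker_id]
        return sum(self.reclaim(tid) for tid in owned)

    def complete(self, task_id, token):
        task = self.reserved.get(task_id)
        # A token from an earlier delivery no longer matches after redelivery.
        if task is None or task.reservation.token != token:
            self.audit.append(("rejected", task_id))
            return False
        del self.reserved[task_id]
        self.audit.append(("completed", task_id))
        return True
```

In this sketch, a worker that reserved a task, lost it to `reclaim_expired`, and later calls `complete` with its old token gets `False`, while the worker holding the current reservation succeeds. Passing `now` explicitly keeps the expiry logic deterministic in tests.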

Validation

  • uv run pytest tests/test_scheduler.py -q
  • uv run pytest -q
  • uv run flake8 src/orchestrator/scheduler.py tests/test_scheduler.py src/agent/__init__.py src/common/metrics.py
  • uv run python -m py_compile src/orchestrator/scheduler.py tests/test_scheduler.py src/agent/__init__.py src/common/metrics.py
  • git diff --check

Fixes #1835

Development

Successfully merging this pull request may close these issues.

[ Bounty $5k ] [ Queue ] Reclaim abandoned reserved jobs — worker disconnects
