feat: isolate async handler execution on dedicated worker event loop #273

Open

siwachabhi wants to merge 1 commit into main from feat/worker-loop-async-isolation

Conversation

@siwachabhi (Contributor)
Summary

  • Introduces a dedicated persistent worker event loop in a background daemon thread to isolate all async handler execution from the main uvicorn event loop
  • Async handlers containing blocking calls (e.g. time.sleep, synchronous HTTP) previously froze /ping health checks, causing container termination — this fix ensures /ping always responds promptly
  • Implements three-way handler dispatch: async generators bridged via queue.Queue, regular async via run_coroutine_threadsafe, sync via run_in_threadpool
  • Propagates contextvars across event loop boundaries using copy_context() + Django asgiref _restore_context pattern (Python 3.10+ compatible)

Test plan

  • 18 new tests in TestWorkerLoopInvocation covering:
    • Async handler runs on worker thread (not main)
    • Sync handler runs in thread pool (worker loop NOT created)
    • Blocking async handler does NOT block /ping
    • Context propagation to async and sync handlers
    • Async generator bridging and streaming
    • Fire-and-forget create_task survives handler return
    • Exception propagation from worker loop
    • Lazy initialization and loop reuse
    • Concurrent async invocations
    • functools.partial and callable class dispatch
    • HEALTHY_BUSY ping status during background tasks
    • End-to-end HTTP tests for sync streaming, async, and async gen streaming
  • Updated existing test_async_handler_runs_on_worker_loop to assert worker thread isolation
  • All 105 runtime tests pass
  • Full test suite: 950 passed, 0 failures
  • Pre-commit hooks (ruff lint, ruff format, trailing whitespace) all pass

Async handlers that contain blocking calls (e.g. time.sleep, synchronous
HTTP requests) previously ran on the main uvicorn event loop, freezing
/ping health checks and causing container termination. This introduces a
dedicated persistent worker event loop in a background thread that
isolates all handler execution from the main loop.

Three-way handler dispatch:
- Async generators: bridged to sync generators via queue.Queue on worker loop
- Regular async: run_coroutine_threadsafe + wrap_future on worker loop
- Sync: run_in_threadpool (starlette thread pool)
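The three-way split above reduces to a classification step; this simplified sketch shows only the classification (the queue bridge and `run_in_threadpool` calls themselves are elided, and `dispatch_kind` is a hypothetical name):

```python
import asyncio
import inspect


def dispatch_kind(handler) -> str:
    """Classify a handler for dispatch. Check async-gen first: it is not a
    coroutine function, so the order of checks matters."""
    if inspect.isasyncgenfunction(handler):
        return "async-gen"  # bridged to a sync generator via queue.Queue
    if asyncio.iscoroutinefunction(handler):
        return "async"      # run_coroutine_threadsafe onto the worker loop
    return "sync"           # starlette run_in_threadpool
```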

Context propagation uses contextvars.copy_context() with the Django
asgiref _restore_context pattern for Python 3.10+ compatibility.
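The propagation pattern can be sketched as: run the handler inside a copied context, then write any changed vars back into the caller's context. This is a simplified standalone version of the asgiref-style `_restore_context` helper, not the PR's actual code:

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)


def _restore_context(context: contextvars.Context) -> None:
    """Copy vars changed inside `context` back into the current context."""
    for cvar in context:
        try:
            if cvar.get() != context[cvar]:
                cvar.set(context[cvar])
        except LookupError:
            cvar.set(context[cvar])


def handler() -> str:
    request_id.set("abc-123")  # mutation happens inside the copied context
    return request_id.get()


ctx = contextvars.copy_context()  # snapshot the caller's context
result = ctx.run(handler)         # run the handler inside the copy
_restore_context(ctx)             # propagate changes back to the caller
```

Without the restore step, any `ContextVar.set` done by the handler would be silently lost once execution crosses back over the event-loop boundary.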
@siwachabhi requested a review from a team on February 21, 2026 at 10:04

@jariy17 jariy17 (Contributor) left a comment

def _is_async_callable(obj: Any) -> bool:
    """Check if obj is async-callable, unwrapping functools.partial."""
    while isinstance(obj, functools.partial):
        obj = obj.func
    return asyncio.iscoroutinefunction(obj) or (
        callable(obj) and asyncio.iscoroutinefunction(obj.__call__)
    )
nit: This only unwraps functools.partial but doesn't follow __wrapped__ chains set by functools.wraps. A decorated async function would be silently misclassified as sync and dispatched to the thread pool instead of the worker loop:

def my_decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@my_decorator
async def handler(payload):
    return {"ok": True}

# _is_async_callable(handler) → False (checks wrapper, a regular def)

Unverified suggestion: use inspect.unwrap() to follow __wrapped__ chains. It guards against cycles internally (bounded by the recursion limit) and raises ValueError when it detects a wrapper loop:

def _unwrap(obj: Any) -> Any:
    """Unwrap functools.partial layers and decorator chains."""
    while isinstance(obj, functools.partial):
        obj = obj.func
    try:
        obj = inspect.unwrap(obj)
    except ValueError:
        pass  # wrapper cycle detected, use as-is
    return obj


def _is_async_callable(obj: Any) -> bool:
    obj = _unwrap(obj)
    return asyncio.iscoroutinefunction(obj) or (
        callable(obj) and asyncio.iscoroutinefunction(obj.__call__)
    )


def _is_async_gen_callable(obj: Any) -> bool:
    obj = _unwrap(obj)
    return inspect.isasyncgenfunction(obj) or (
        callable(obj) and inspect.isasyncgenfunction(obj.__call__)
    )

Contributor
The sync generator returned by this method is iterated by Starlette via iterate_in_threadpool, which holds a thread for the entire duration of the stream. The thread spends most of its time blocked on q.get() waiting for the next chunk from the worker loop.

Under concurrent streaming workloads (e.g., LLM streaming responses each taking ~20 seconds), this can exhaust the default thread pool (~40 threads). Once the pool is exhausted, new requests queue waiting for a thread to free up, even though the occupied threads are mostly idle, blocked on q.get().
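For reference, a minimal sketch of the kind of queue.Queue bridge being discussed (names and structure are assumptions, not the PR's actual code). The `q.get()` call is where the thread-pool thread sits idle for the life of the stream:

```python
import asyncio
import queue

_SENTINEL = object()


def bridge_async_gen(agen, loop: asyncio.AbstractEventLoop):
    """Drive `agen` on `loop` and expose its chunks as a sync generator."""
    q: queue.Queue = queue.Queue()  # unbounded: puts never block the loop

    async def _pump() -> None:
        try:
            async for chunk in agen:
                q.put(chunk)
        finally:
            q.put(_SENTINEL)  # always signal end-of-stream

    asyncio.run_coroutine_threadsafe(_pump(), loop)

    def _consume():
        while True:
            item = q.get()  # the thread-pool thread idles here between chunks
            if item is _SENTINEL:
                return
            yield item

    return _consume()
```

Each active stream pins one consumer thread, which is why pool size caps concurrent streams even when the threads are doing no work.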

@siwachabhi (Contributor, Author)

This one is expected, though; customers have at most 2 concurrent connections to runtime servers.


3 participants