Commit 2a009e8
authored
Add subinterpreter parallelism with OWN_GIL support (#11)
* Add worker thread pool for high-throughput Python operations
Implement a general-purpose worker thread pool that eliminates per-request
GIL acquisition overhead. Each worker holds the GIL (or has its own
subinterpreter with OWN_GIL on Python 3.12+) and processes requests from
a shared MPSC queue.
Key features:
- Sync API: call, apply, eval, exec, asgi_run, wsgi_run
- Async API: all *_async variants returning request_id for non-blocking calls
- await/1,2 for waiting on async results
- Per-worker module caching to avoid reimport overhead
- Support for FREE_THREADED (3.13+), SUBINTERP (3.12+), and FALLBACK modes
* Fix eval locals_term initialization and add benchmark results
- Fix potential crash when locals_term is uninitialized (check for 0)
- Add benchmark results directory with baseline comparisons
Known issue: ~0.5-1% of concurrent sync calls may timeout under high
load (100+ concurrent callers). Async API unaffected.
* Fix two race conditions in worker pool
1. Use-after-free on request_id: Save request_id BEFORE enqueueing
the request to the worker pool. Once enqueued, a worker can
process and free the request at any time. Accessing req->request_id
after py_pool_enqueue() is undefined behavior.
2. Double-free of msg_env: After a successful enif_send(), the message
environment is consumed/invalidated by the Erlang runtime. We must
set req->msg_env = NULL to prevent py_pool_request_free() from
calling enif_free_env() on an already-freed environment.
These bugs caused ~0.5-1% of concurrent calls to timeout under high load
because request IDs could be corrupted, leading to message/response
mismatch.
Also adds debug counters (responses_sent, responses_failed) to pool stats
for monitoring send success rate.
* Fix worker pool ASGI to use hornbeam run_asgi interface
Changed py_pool_process_asgi to call run_asgi(module_name, callable_name,
scope, body) instead of run(app, scope, body), matching hornbeam's
hornbeam_asgi_runner interface.
Also updated extract_asgi_response to handle both dict and tuple return
formats, supporting hornbeam's dict-based response.
* Add py_resource_pool and subinterpreter support with mutex locking
- Add compile-time detection of PyInterpreterConfig_OWN_GIL (Python 3.12+)
- Add mutex to py_subinterp_worker_t for thread-safe parallel access
- Add nif_subinterp_asgi_run for ASGI on subinterpreters
- Add py_resource_pool module with lock-free round-robin scheduling
- Benchmark shows 8-10x improvement with subinterpreters enabled
* Implement process-per-context architecture with reentrant callbacks
Replace worker pool with process-per-context model where each Python context
is owned by a dedicated Erlang process. Enables reentrant callbacks via
suspension-based mechanism without deadlock.
- Add py_context.erl with recursive receive pattern for inline callback handling
- Add py_context_router.erl for scheduler-affinity based routing
- Add nif_context_resume for Python replay with cached callback results
- Support sequential callbacks via callback_results array accumulation
- Remove old pool modules (py_pool, py_worker, py_worker_pool, etc.)
* Fix timeout handling and add contexts_started helper
- Pass timeout parameter through py:eval/3 and do_call/5
- Add py:contexts_started/0 and py_context_router:is_started/0
- Fix test_timeout to use time.sleep for reliable delay
- Fix thread callback suite to check existing contexts
* Fix thread worker handlers not re-registering after app restart
When the application restarts, py_thread_handler registers as the new
coordinator, but existing thread workers in the NIF-level pool still
had has_handler=true from the previous run. This caused them to skip
spawning new handler processes and write to dead pipes.
Reset has_handler=false on all existing workers when a new coordinator
is registered.
* Fix subinterpreter cleanup and thread worker re-registration
Two fixes:
1. suspended_context_state_destructor: For subinterpreters with OWN_GIL,
use PyThreadState_Swap to switch to the correct interpreter before
releasing Python objects. PyGILState_Ensure only works for the main
interpreter and causes memory corruption with subinterpreter objects.
2. thread_worker_set_coordinator: Reset has_handler=false on all existing
workers when a new coordinator registers (e.g., after app restart).
Old workers kept has_handler=true but their handler processes were dead.
* Unify erlang Python module with callback and event loop API
- Rename priv/erlang/ to priv/_erlang_impl/ to avoid C module shadowing
- Add _extend_erlang_module() helper in py_callback.c to re-export
Python package functions (run, new_event_loop, EventLoopPolicy, etc.)
- Update py_event_loop.erl to call extension during initialization
- Delete buggy erlang_asyncio.py (blocking sleep replaced by proper
asyncio.sleep backed by Erlang timers via call_later)
- Add test infrastructure in priv/tests/ for event loop integration
The unified erlang module now provides uvloop-compatible API:
- erlang.run(coro) - run async code with Erlang event loop
- erlang.new_event_loop() - create ErlangEventLoop instance
- erlang.install() - install ErlangEventLoopPolicy (deprecated 3.12+)
- erlang.call() / erlang.async_call() - call Erlang functions
- asyncio.sleep() works via Erlang timers
* Fix tests to use erlang.run() instead of removed erlang_asyncio module
- Update py_erlang_sleep_SUITE to use erlang.run() with standard asyncio
instead of the removed erlang_asyncio module
- Skip py_asyncio_compat_SUITE: tests create standalone ErlangEventLoop
instances via erlang.new_event_loop() and call loop.run_forever().
Timer scheduling for standalone loops needs work - timers fire
immediately instead of after the scheduled delay.
* Fix timer scheduling for standalone ErlangEventLoop instances
- Add isolated parameter to ErlangEventLoop.__init__() that creates
a per-loop capsule via _loop_new() for proper event routing
- Update all loop methods (call_at, _run_once, stop, close, add_reader,
remove_reader, add_writer, remove_writer) to use per-loop capsule APIs
when running as isolated instance
- new_event_loop() now passes isolated=True by default
- Fix run_forever() to honor stop() called before run_forever() by not
resetting _stopping flag at start
- Simplify async_test_runner to run tests synchronously without
erlang.run() wrapper, avoiding nested event loop issues
- Add timeout fallback to test_add_remove_writer to prevent hanging
- Remove skip from py_asyncio_compat_SUITE to enable tests
Test results: 46 tests run, 42 passed, 4 failures (edge cases)
* Replace async worker pthread backend with event loop model
The pthread+usleep polling async workers have been replaced with an
event-driven model using py_event_loop and enif_select:
- Add _run_and_send wrapper in Python for result delivery via erlang.send()
- Add nif_event_loop_run_async NIF for direct coroutine submission
- Add py_event_loop:run_async/2 Erlang API
- Add py_event_loop_pool.erl for managing event loop-based async execution
- Rewrite py_async_pool.erl to delegate to event_loop_pool
- Update supervisor tree to include py_event_loop_pool
- Remove py_async_worker.erl and py_async_worker_sup.erl
- Stub deprecated async_worker NIFs to return errors
- Remove async_event_loop_thread and async_future_callback C code
Performance improvements:
- Latency: ~10-20ms polling -> <1ms (enif_select)
- CPU idle: 100 wakeups/sec -> Zero
- Threads: N pthreads -> 0 extra threads
API unchanged: py:async_call/3,4 and py:await/1,2 work the same.
* Remove global state from py_event_loop.c for per-interpreter isolation
Replace global variables with module state structure stored in the
Python module, enabling proper per-interpreter/per-context event
loop isolation.
Changes:
- Add py_event_loop_module_state_t struct containing event_loop,
shared_router, shared_router_valid, and isolation_mode
- Update PyModuleDef to allocate module state (m_size)
- Update get_interpreter_event_loop() to read from module state
- Update set_interpreter_event_loop() to write to module state
- Update nif_set_python_event_loop() to use module state
- Update nif_set_isolation_mode() to use module state
- Update nif_set_shared_router() to use module state
- Update py_get_isolation_mode() to read from module state
- Update py_loop_new() to read shared_router from module state
- Update event_loop_destructor() to clear module state
- Update create_default_event_loop() to use module state
- Remove g_python_event_loop, g_shared_router, g_shared_router_valid,
and g_isolation_mode global variables
* Fix py_asyncio_compat_SUITE tests and consolidate erlang module
- Remove erlang_loop.py, use _erlang_impl as the single implementation
- Add get_event_loop_policy() export to _erlang_impl and erlang module
- Fix signal tests: ErlangEventLoop has limited signal support (SIGINT,
SIGTERM, SIGHUP only), other signals raise ValueError
- Skip subprocess tests for Erlang (not yet implemented)
- Update all imports to use erlang module (public API) with _erlang_impl
as internal fallback
- Update docs and examples to use erlang module imports
* Fix unawaited coroutine warnings in tests
- test_run_until_complete_nested_raises: Use asyncio.sleep(0.1) to ensure
timer path (not fast path), properly close coroutine in finally block
- test_run_until_complete_on_closed_raises: Store coroutine in variable
and close it in finally block
- tearDown: Cancel pending tasks and shutdown async generators before
closing loop to prevent resource leaks
- Add test_asyncio_sleep_zero_fast_path: Verify sleep(0) uses fast path
- test_add_remove_writer: Use socketpair for reliable write readiness
* Fix FD stealing and UDP connected socket issues
- Share fd_resource per fd to prevent enif_select stealing errors
- Add NIF functions for fd resource management
- Use send() instead of sendto() for connected UDP sockets
- Fix TCP EOF handling to call connection_lost properly
* Fix context test expectations for Python contextvars behavior
await coro() runs in shared context (changes visible to caller),
while create_task(coro()) runs in copied context (changes isolated).
Updated test_context_in_task and test_multiple_context_vars to
reflect correct Python behavior.
* Remove subprocess support from ErlangEventLoop
Subprocess is not supported because Python's subprocess module uses
fork() which corrupts the Erlang VM when called from within the NIF.
Users should use Erlang ports directly via erlang.call() instead,
which provides superior subprocess management with built-in
supervision, monitoring, and fault tolerance.
Changes:
- Replace _subprocess.py with NotImplementedError stub and docs
- Remove subprocess event handling from _loop.py
- Remove subprocess functions from py_event_loop.c
- Update tests to verify NotImplementedError is raised
- Set HAS_SUBPROCESS_SUPPORT = False in test base
* Add ETF encoding for pids/refs and fix executor/socket tests
ETF encoding for pids and references:
- Add decode_etf_string() helper in py_callback.c to convert
__etf__:base64 encoded strings back to Erlang terms
- Add ETF encoding in term_to_python_repr for pids and refs
in py_context.erl and py_thread_handler.erl
Test fixes:
- Skip ProcessPoolExecutor test inside Erlang NIF (fork issues)
- Use 'spawn' multiprocessing context instead of 'fork'
- Accept OSError in addition to TimeoutError for connect timeout test
Cleanup:
- Remove obsolete multi_loop test files
* Add erlang.reactor module for fd-based protocol handling
Implement low-level fd-based API where Erlang handles I/O scheduling
via enif_select and Python handles protocol logic.
- Add priv/_erlang_impl/_reactor.py with Protocol base class and registry
- Add src/py_reactor_context.erl for Erlang reactor context process
- Expose erlang.reactor via sys.modules for 'import erlang.reactor' syntax
- Add test suite (py_reactor_SUITE.erl) with 6 tests
- Add Python tests (py_test_reactor.py) with 3 tests
- Add examples/reactor_echo.erl as usage example
Works with any fd - TCP, UDP, Unix sockets, pipes, etc.
* Add audit hook sandbox and remove signal support
- Add _sandbox.py with Python audit hooks (PEP 578) to block dangerous
operations: fork, exec, spawn, subprocess, os.system, os.popen
- Install sandbox automatically when running inside Erlang VM
- Remove signal handling support (not applicable in Erlang context)
- Update policy to always return ErlangEventLoop
- Fix ExecutionMode test to check correct enum values
- Remove signal tests and AIO subprocess tests from test suite
* Update CHANGELOG for unreleased changes since 1.8.1
* Add security and reactor documentation, update asyncio docs
New documentation:
- docs/security.md: Document audit hook sandbox, blocked operations
(fork, exec, subprocess), and Erlang port alternatives
- docs/reactor.md: Document erlang.reactor module for FD-based
protocol handling with Protocol base class and examples
Updated documentation:
- docs/asyncio.md: Update for unified erlang module, mark
erlang.install() as deprecated in 3.12+, add Limitations section
for subprocess/signal handling, add ExecutionMode documentation
- docs/getting-started.md: Add Security Considerations section,
update asyncio section to use erlang.run()
- README.md: Add security sandbox to features, add doc links
Also fixed edoc errors in source files:
- src/py_nif.erl: Fix angle bracket syntax in reactor function docs
- src/py_context_router.erl: Replace markdown code blocks with <pre>
* Rename call_async to cast and add benchmark
API change: py:call_async/3,4 renamed to py:cast/3,4 following
gen_server convention (call=sync, cast=async).
Add benchmark_compare.erl for comparing performance between versions.
Current version shows ~2-3x improvement over v1.8.1:
- Sync calls: 0.011ms -> 0.004ms (2.9x faster)
- Cast single: 0.011ms -> 0.004ms (2.8x faster)
- Throughput: ~90K -> ~250K calls/sec
* Add migration guide for v1.8.x to v2.0
Covers:
- py:call_async -> py:cast rename
- py:bind/unbind removal (use py_context_router)
- py:ctx_* removal (use py_context directly)
- erlang_asyncio -> erlang module consolidation
- Subprocess removal (use Erlang ports)
- Signal handler removal (use Erlang level)
- New features: context router, reactor, erlang.send()
- Performance comparison table
* Add subinterpreter event loop isolation
Each subinterpreter context now gets its own event worker for asyncio
support. This ensures asyncio.sleep() and timers work correctly in
subinterpreter contexts.
Changes:
- Add nif_context_get_event_loop/1 NIF to retrieve event loop reference
- Create dedicated event worker per subinterpreter context in py_context
- Extend erlang module with run/new_event_loop in each subinterpreter
- Handle EXIT signals properly (shutdown from supervisor vs normal exits)
- Initialize event loop for worker pool subinterpreters
Worker mode contexts (Python < 3.12) continue to use the shared router.
* Skip tests incompatible with subinterpreters
test_memory_stats and test_reload use modules (tracemalloc) that don't
support Python subinterpreters. Skip these tests when running with
subinterpreter support enabled (Python 3.12+).
* Add OWN_GIL subinterpreter support for true Python parallelism
Subinterpreters with PyInterpreterConfig_OWN_GIL run in dedicated threads,
each with its own GIL, enabling true parallel Python execution on Python 3.12+.
Key changes:
- Thread pool manages subinterpreter lifecycle and context switching
- Atomic state machine for thread-safe subinterpreter state management
- Support blocking callbacks in thread-model subinterpreters
- ProcessError exception class lookup for correct identity in subinterpreters
- Test adjustments for subinterpreter path isolation and error messages
* Fix ASan/TSan LD_PRELOAD in CI
* Simplify CI: ASan only, fix LD_PRELOAD for all steps
* Remove debug counters step from ASan builds
* Document OWN_GIL subinterpreter parallelism
* Fix py_async_e2e_SUITE for subinterpreters
- Use explicit context for all tests
- Relax timing constraint (0.3s instead of 0.15s) for CI
- Add diagnostic messages to assertions1 parent 5cf256c commit 2a009e8
File tree
106 files changed
+25448
-5589
lines changed- .github/workflows
- benchmark_results
- c_src
- docs
- examples
- priv
- _erlang_impl
- tests
- scripts
- src
- test
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
106 files changed
+25448
-5589
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
186 | 186 | | |
187 | 187 | | |
188 | 188 | | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
189 | 241 | | |
190 | 242 | | |
191 | 243 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
7 | 28 | | |
8 | 29 | | |
9 | 30 | | |
| |||
16 | 37 | | |
17 | 38 | | |
18 | 39 | | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
19 | 59 | | |
20 | 60 | | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
21 | 84 | | |
22 | 85 | | |
23 | 86 | | |
24 | 87 | | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
25 | 145 | | |
26 | 146 | | |
27 | 147 | | |
| |||
102 | 222 | | |
103 | 223 | | |
104 | 224 | | |
105 | | - | |
| 225 | + | |
106 | 226 | | |
107 | 227 | | |
108 | 228 | | |
109 | 229 | | |
110 | | - | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
115 | 234 | | |
116 | 235 | | |
117 | 236 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| |||
66 | 67 | | |
67 | 68 | | |
68 | 69 | | |
69 | | - | |
| 70 | + | |
70 | 71 | | |
71 | 72 | | |
72 | 73 | | |
| |||
443 | 444 | | |
444 | 445 | | |
445 | 446 | | |
446 | | - | |
| 447 | + | |
447 | 448 | | |
448 | 449 | | |
449 | 450 | | |
| |||
573 | 574 | | |
574 | 575 | | |
575 | 576 | | |
| 577 | + | |
| 578 | + | |
576 | 579 | | |
577 | 580 | | |
578 | 581 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
0 commit comments