- **Context thread affinity** - Contexts in MULTI_EXECUTOR mode are now assigned a fixed executor thread at creation. All operations (`call`, `eval`, `exec`) from the same context run on the same OS thread, preventing thread state corruption in libraries like numpy and PyTorch that have thread-local state.
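The affinity rule can be sketched in plain Python (the `Context` class here is illustrative, not the library's implementation): each context owns a single-thread executor, so every operation submitted through that context lands on one fixed OS thread.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

class Context:
    """Sketch of per-context thread affinity: one fixed worker thread."""
    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=1)  # the fixed thread

    def run(self, fn, *args):
        # Every call, eval, or exec for this context funnels through
        # the same executor, hence the same OS thread.
        return self._executor.submit(fn, *args).result()

ctx = Context()
tids = {ctx.run(threading.get_ident) for _ in range(10)}
assert len(tids) == 1  # all operations ran on the same thread
```

Libraries with thread-local state (numpy's error state, PyTorch's autograd mode) see a consistent thread this way, which is the property the release note describes.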
- **`py:execution_mode/0` now returns actual mode** - Returns `worker` (default), `owngil`, `free_threaded`, or `multi_executor` based on the actual configuration instead of Python capability. Previously returned `subinterp` even when using worker mode.
- **Removed obsolete subinterp test references** - Test suites updated to reflect the removal of subinterpreter mode. Tests now use `worker` or `owngil` modes.
- **Executor affinity for numpy/torch** - Workers are now assigned a fixed executor thread at creation. All calls from the same worker go to the same executor, preventing thread state corruption in libraries like numpy and PyTorch that have thread-local state. Fixes segfaults when using sentence-transformers or other ML libraries.
- **ASGI/WSGI Support** - The `py_asgi` and `py_wsgi` modules have been removed:
  - `py_asgi:run/4,5` - ASGI application runner
  - `py_wsgi:run/3,4` - WSGI application runner
  - For web framework integration, use `py:call` with event loop contexts or the Channel API
  - See Migration Guide for alternatives
- **SharedDict** - Process-scoped shared dictionaries for cross-process state
  - `py:shared_dict_new/0` - Create a new SharedDict
  - `py:shared_dict_get/2,3` - Get value with optional default
  - `py:shared_dict_set/3` - Set key-value pair
  - `py:shared_dict_del/2` - Delete a key
  - `py:shared_dict_keys/1` - List all keys
  - `py:shared_dict_destroy/1` - Explicit cleanup
  - Python access via `erlang.SharedDict` with dict-like interface
  - Mutex-protected for concurrent access (~300k ops/sec)
  - Pickle serialization for complex types
  - See SharedDict documentation for details
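A minimal sketch of the mutex-protected interface described above, in plain Python. The real SharedDict lives in the NIF and is shared across processes; this only models the dict-like API and the locking discipline, and all names here are illustrative.

```python
import threading

class SharedDict:
    """Dict guarded by a mutex so concurrent readers/writers stay consistent."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)

    def keys(self):
        with self._lock:
            return list(self._data)

d = SharedDict()
d.set("hits", 1)
assert d.get("hits") == 1
assert d.get("missing", 0) == 0  # optional default, as in shared_dict_get/3
```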
- **OWN_GIL Mode** - True parallel Python execution with Python 3.14+ subinterpreters
  - Each subinterpreter runs with its own GIL (`Py_GIL_OWN`) in a dedicated thread
  - Full isolation between interpreters (separate namespaces, modules, state)
  - `py_context:start_link(N, owngil)` to create OWN_GIL contexts
  - Enables true parallelism for CPU-bound Python workloads
  - See OWN_GIL Internals for architecture details
- **Process-Bound Python Environments** - Per-Erlang-process Python namespaces
  - Each Erlang process gets isolated Python globals/locals
  - State persists across calls within the same process
  - Automatic cleanup when the Erlang process terminates
  - See Process-Bound Environments for details
- **Event Loop Pool** - Process affinity for parallel async execution
  - `py_event_loop_pool` distributes async tasks across multiple event loops
  - Scheduler-affinity routing for cache-friendly execution
  - Supports worker, subinterp, and owngil modes
- **ByteChannel API** - Raw byte streaming without term serialization
  - `py_byte_channel:new/0,1` - Create byte channels
  - `py_byte_channel:send/2` - Send raw bytes
  - `py_byte_channel:recv/1,2` - Receive bytes
  - Python `ByteChannel` class with sync/async iteration
  - Ideal for HTTP bodies, file streaming, binary protocols
- **PyBuffer API** - Zero-copy buffer for streaming input
  - `py_buffer:new/0,1` - Create buffers with optional max size
  - `py_buffer:write/2` - Write data to buffer
  - Python `PyBuffer` class with file-like interface (`read`, `readline`, `readlines`)
  - Non-blocking reads for async I/O patterns
  - See Buffer API for details
- **True streaming API** - New `py:stream_start/3,4` and `py:stream_cancel/1` functions for event-driven streaming from Python generators. Unlike `py:stream/3,4`, which collects all values at once, `stream_start` sends `{py_stream, Ref, {data, Value}}` messages as values are yielded. Supports both sync and async generators. Useful for LLM token streaming, real-time data feeds, and processing large sequences incrementally.
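The difference between the two delivery models can be sketched in plain Python (the generator and consumers here are illustrative, not the library's code): a collect-all consumer sees nothing until the generator is exhausted, while an event-driven consumer handles each value as it is yielded.

```python
import asyncio

async def token_stream():
    # Stand-in for a Python async generator streamed to Erlang, e.g. LLM tokens.
    for token in ["Hello", " ", "world"]:
        await asyncio.sleep(0)  # yield control, as a real source would
        yield token

async def collect_all():
    # py:stream/3,4-style: results only available once the generator finishes.
    return [t async for t in token_stream()]

async def stream_incrementally(on_value):
    # py:stream_start/3,4-style: each value delivered as it is yielded,
    # analogous to sending a {py_stream, Ref, {data, Value}} message.
    async for t in token_stream():
        on_value(t)

received = []
asyncio.run(stream_incrementally(received.append))
assert received == ["Hello", " ", "world"]
assert asyncio.run(collect_all()) == received
```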
- **`erlang.whereis(name)`** - Look up registered Erlang process PIDs from Python
  - Returns an `erlang.Pid` object or `None` if not registered
  - Enables Python code to discover and message named processes
- **`erlang.schedule_inline(callback)`** - Inline continuation scheduling
  - Releases the dirty scheduler and continues with the callback in the same context
  - Preserves globals/locals across the continuation
  - Useful for cooperative long-running tasks
- **`py:spawn_call/3,4,5`** - Fire-and-forget with result delivery
  - Executes the Python call asynchronously
  - Sends `{py_result, Ref, Result}` to the caller when complete
  - Non-blocking alternative to `py:call` for async patterns
- **Explicit bytes conversion** - `{bytes, Binary}` tuple for round-trip safety
  - Erlang binaries convert to Python `str` by default
  - Use `{bytes, Binary}` to force the Python `bytes` type
  - Ensures correct handling for binary protocols
- **Import caching API** - Lazy module import with caching
  - `py:import/1,2` - Import and cache modules
  - `py:add_import/1,2` - Register imports applied to all contexts
  - `py:add_path/1` - Add to `sys.path` across all contexts
  - Per-interpreter caching with generation tracking
- **Per-interpreter preload code** - Execute code in new interpreters
  - Configure via `{erlang_python, [{preload_code, <<"import mylib">>}]}`
  - Code runs with inherited globals from the main interpreter
  - Useful for initializing common imports/state
- **Channel notification for create_task** - Fixed async channel receive hanging when using `py_event_loop:create_task`. `event_loop_add_pending()` now sends `task_ready` to the worker, not just `pthread_cond_signal`. Also fixed Python 3.9 compatibility in ByteChannel (`Optional[bytes]` instead of `bytes | None`).
Channel waiter race condition - Fixed
waiter_existserrors during fast async iteration. Waiter state is now cleared before releasing mutex, preventing race where callback fires beforechannel_sendclearshas_waiter -
Event Loop Isolation and Resource Safety - Three fixes for event loop and atom handling
- Single-loop-per-interpreter enforcement - Prevents multiple
ErlangEventLoopinstances from causing event confusion. Added_has_loop_ref()check that detects running loops; attempting to create a second loop while one is running raisesRuntimeError - Atom creation safety - Added Python-level caching with configurable limit (10000 default,
ERLANG_PYTHON_MAX_ATOMSenv var) to prevent BEAM atom table exhaustion from untrusted code. Theerlang.atom()API now goes through the cached wrapper; internal_atom()NIF still available - Global capsule resource leak - Added
global_loop_capsule_destructorthat properly callsenif_release_resource()when capsule is garbage collected. Previously NULL destructor caused reference leaks on eachErlangEventLoopcreation
- Single-loop-per-interpreter enforcement - Prevents multiple
- **Python 3.14 venv activation** - Fixed `.pth` file processing in subinterpreters. Python 3.14's stricter module isolation prevented `sys._venv_site_packages` from persisting across eval/exec calls. The site-packages path is now embedded directly in the exec code string.
OWN_GIL Safety Fixes - Critical fixes for OWN_GIL subinterpreter mode
- Mutex leak in erlang module -
async_futures_mutexnow always destroyed inerlang_module_free()regardless ofpipe_initializedflag - ABBA deadlock prevention - Fixed lock ordering in
event_loop_down()andevent_loop_destructor()to acquire GIL beforenamespaces_mutex, matching the normal execution path and preventing deadlocks - Dangling env pointer detection - Added
interp_idvalidation inowngil_execute_*_with_env()functions to detect and reject env resources created by a different interpreter, returning{error, env_wrong_interpreter} - OWN_GIL callback documentation - Documented that
erlang.call()from OWN_GIL contexts usesthread_worker_call()rather than suspension/resume protocol; re-entrant calls to the same OWN_GIL context are not supported
- Mutex leak in erlang module -
- **`py:cast` is now fire-and-forget** - `py:cast/3,4,5` no longer returns a reference. For async calls with result delivery, use the new `py:spawn_call/3,4,5` instead.
- **OWN_GIL requires Python 3.14+** - The OWN_GIL subinterpreter mode requires Python 3.14 or later due to C extension compatibility issues in earlier versions. Use `worker` or `subinterp` modes for Python 3.12-3.13.
- **Removed auto-started io pool** - The io pool is no longer started automatically at application startup, reducing memory usage. Users who need a dedicated I/O pool can create one manually via `py_context_router:start_pool(io, 10, worker)`. The configuration options `io_pool_size` and `io_pool_mode` have been removed.
- **Removed py_event_router** - Removed the legacy `py_event_router` module. `py_event_worker` now handles all event loop functionality, including FD events, timers, and task processing, consolidating event handling into a single worker process. The `py_nif:set_shared_router/1` function has been removed.
- **Config-based initialization** - Import and path configuration via the application environment
  - Configure imports: `{erlang_python, [{imports, [{json, dumps}]}]}`
  - Configure paths: `{erlang_python, [{paths, ["/path/to/modules"]}]}`
  - Applied immediately to all running interpreters
  - See Imports documentation for details
- **Direct NIF channel operations** - Channel send/receive bypass `erlang.call()` overhead for up to 1760x speedup in raw throughput benchmarks
- **nif_process_ready_tasks optimization** - ~15% improvement in async task processing
  - Replace `asyncio.iscoroutine()` with the `PyCoro_CheckExact` C API
  - Use stack buffers for module/func strings
  - Cache the `asyncio.events` module
  - Pool `ErlNifEnv` allocations with mutex protection
- **Async Task API** - uvloop-inspired task submission from Erlang
  - `py_event_loop:run/3,4` - Blocking run of async Python functions
  - `py_event_loop:create_task/3,4` - Non-blocking task submission with reference
  - `py_event_loop:await/1,2` - Wait for task result with timeout
  - `py_event_loop:spawn_task/3,4` - Fire-and-forget task execution
  - Thread-safe submission via `enif_send` (works from dirty schedulers)
  - Message-based result delivery via `{async_result, Ref, Result}`
  - See Async Task API docs for details
- **`erlang.spawn_task(coro)`** - Spawn async tasks from both sync and async contexts
  - Works in sync code called by Erlang (where `asyncio.get_running_loop()` fails)
  - Returns an `asyncio.Task` for optional await/cancel (fire-and-forget pattern)
  - Automatically wakes up the event loop in sync context
- **Explicit Scheduling API** - Control dirty scheduler release from Python
  - `erlang.schedule(callback, *args)` - Release scheduler, continue via Erlang callback
  - `erlang.schedule_py(module, func, args, kwargs)` - Release scheduler, continue in Python
  - `erlang.consume_time_slice(percent)` - Check if the NIF time slice is exhausted
  - `ScheduleMarker` type for cooperative long-running tasks
  - See Scheduling API docs
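The cooperative pattern these primitives enable can be sketched in plain Python: a long-running task checks its time budget as it works and yields with its progress when the slice is spent, so the caller can reschedule it. The `process` function, `TIME_SLICE_MS` value, and resume protocol below are illustrative, not the library's API.

```python
import time

TIME_SLICE_MS = 1.0  # illustrative budget, analogous to a NIF time slice

def process(items, resume_at=0):
    """Work until the slice is exhausted; report progress so the caller
    can reschedule (the role erlang.schedule/consume_time_slice play)."""
    deadline = time.monotonic() + TIME_SLICE_MS / 1000
    done = []
    i = resume_at
    while i < len(items):
        if time.monotonic() >= deadline:
            return ("yield", i, done)   # slice spent: hand back control
        done.append(items[i] * 2)
        i += 1
    return ("complete", i, done)

# Drive to completion, rescheduling whenever the task yields.
items, results, at = list(range(1000)), [], 0
while True:
    status, at, chunk = process(items, at)
    results.extend(chunk)
    if status == "complete":
        break
assert results == [x * 2 for x in items]
```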
- **Distributed Python Execution** - Documentation and Docker demo
  - Run Python across Erlang nodes using `rpc:call`
  - Docker Compose setup for testing distributed patterns
  - See Distributed Execution docs
- **Event Loop Performance Optimizations**
  - Growable pending queue with capacity doubling (256 to 16384)
  - Snapshot-detach pattern to reduce mutex contention
  - Callable cache (64 slots) avoids PyImport/GetAttr per task
  - Task wakeup coalescing with atomic flag
  - Drain-until-empty loop for faster task processing
- **Event loop fixes**
  - `ensure_venv` now always installs dependencies, even if the venv exists
  - `erlang.sleep()` timing in sync context
  - `time()` returns a fresh value when the loop is not running
  - Handle pooling bugs in ErlangEventLoop
  - Task wakeup race causing batch task stalls
- **Virtual Environment Management** - Automatic venv creation and activation
  - `py:ensure_venv/2,3` - Create venv if missing, then activate
  - Automatically detects the Python executable
  - Supports pip install of dependencies
- **File Descriptor Duplication** - Safe socket handoff from Erlang to Python
  - `py:dup_fd/1` - Duplicate an fd for independent ownership
  - Prevents double-close issues when passing sockets to the Python reactor
- **Custom Pool Support** - Create pools on demand for CPU-bound and I/O-bound operations
  - `default` pool - Automatically started, sized to the number of schedulers
  - `py_context_router:start_pool/2,3` - Start named pools programmatically
  - `py_context_router:stop_pool/1` - Stop a named pool
  - `py_context_router:pool_started/1` - Check if a pool is running
  - `py_context_router:get_context(Pool)` - Get a context from a named pool
  - `py_context_router:num_contexts(Pool)` - Get pool size
  - `py_context_router:contexts(Pool)` - Get all contexts in a pool
  - `py_context_router:lookup_pool(Module, Func)` - Query pool routing
  - `py:call(PoolName, Module, Func, Args)` - Execute on a specific pool
  - Registration-based routing (no call-site changes needed):
    - `py:register_pool(io, requests)` - Route all `requests.*` calls to the io pool
    - `py:register_pool(io, {aiohttp, get})` - Route a specific function to the io pool
    - `py:unregister_pool(Module)` - Remove a module registration
    - `py:unregister_pool({Module, Func})` - Remove a function registration
  - Automatic routing: `py:call(requests, get, [Url])` goes to the io pool when registered
  - Backward compatible: existing code using `py:call/3,4,5` works unchanged
  - New test suite: `test/py_pool_SUITE.erl`
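Registration-based routing boils down to a small lookup table with precedence: a function-level registration wins over a module-level one, and unregistered calls fall back to the default pool. A Python sketch of that logic (names mirror the Erlang API but are illustrative, and the precedence order is an assumption consistent with the entries above):

```python
routes = {}

def register_pool(pool, target):
    # target is a module name or a (module, func) pair, as in py:register_pool
    routes[target] = pool

def lookup_pool(module, func):
    # Function-level registration first, then module-level, then default.
    return routes.get((module, func)) or routes.get(module, "default")

register_pool("io", "requests")            # route all requests.* calls
register_pool("io", ("aiohttp", "get"))    # route one specific function

assert lookup_pool("requests", "get") == "io"
assert lookup_pool("aiohttp", "get") == "io"
assert lookup_pool("aiohttp", "post") == "default"
assert lookup_pool("math", "sqrt") == "default"
```

This is why existing `py:call/3,4,5` call sites keep working unchanged: routing is resolved from the table at dispatch time, not at the call site.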
- **Channel API** - Bidirectional message passing between Erlang and Python
  - `py_channel:new/0,1` - Create channels with optional backpressure (max_size)
  - `py_channel:send/2` - Send Erlang terms to Python (returns `busy` on backpressure)
  - `py_channel:close/1` - Close channel, signals `StopIteration` to Python
  - Python `Channel` class with sync and async interfaces:
    - `channel.receive()` - Blocking receive (suspends Python, yields to Erlang)
    - `channel.try_receive()` - Non-blocking receive
    - `await channel.async_receive()` - Asyncio-compatible receive
    - `for msg in channel:` - Sync iteration
    - `async for msg in channel:` - Async iteration
  - `erlang.channel.reply(pid, term)` - Send messages to Erlang processes
  - Zero-copy IOQueue buffering via `enif_ioq`
  - 8x faster than Reactor for small messages, 2x faster for 16KB messages
- **OWN_GIL Subinterpreter Thread Pool** - True parallelism with Python 3.12+ subinterpreters
  - Each subinterpreter runs in its own thread with its own GIL (`Py_GIL_OWN`)
  - Thread pool manages N subinterpreters for parallel Python execution
  - `py:context(N)` returns the Nth context PID for explicit context selection
  - `py_context_router` provides scheduler-affinity routing for automatic distribution
  - Cast operations are 25-30% faster compared to worker mode
  - Full isolation between subinterpreters (separate namespaces, modules, state)
  - New C files: `py_subinterp_pool.c`, `py_subinterp_pool.h`
- **`erlang.reactor` module** - FD-based protocol handling for building custom servers
  - `reactor.Protocol` - Base class for implementing protocols
  - `reactor.serve(sock, protocol_factory)` - Serve connections using a protocol
  - `reactor.run_fd(fd, protocol_factory)` - Handle a single FD with a protocol
  - Integrates with Erlang's `enif_select` for efficient I/O multiplexing
  - Zero-copy buffer management for high-throughput scenarios
  - Supports SHARED_GIL subinterpreters via `py_reactor_context`
  - Each reactor context has an isolated protocol factory when using `mode=subinterp`
- **ETF encoding for PIDs and References** - Full Erlang term format support
  - Erlang PIDs encode/decode properly in ETF binary format
  - Erlang References encode/decode properly in ETF binary format
  - Enables proper serialization for distributed Erlang communication
- **PID serialization** - Erlang PIDs now convert to `erlang.Pid` objects in Python and back to real PIDs when returned to Erlang. Previously, PIDs fell through to `None` (Erlang→Python) or a string representation (Python→Erlang).
- **`erlang.send(pid, term)`** - Fire-and-forget message passing from Python to Erlang processes. Uses `enif_send()` directly with no suspension or blocking. Raises `erlang.ProcessError` if the target process is dead.
- **`erlang.ProcessError`** - New exception for dead/unreachable process errors. A subclass of `Exception`, so it's catchable with `except Exception` or `except erlang.ProcessError`.
- **Audit hook sandbox** - Block dangerous operations when running inside the Erlang VM
  - Uses Python's `sys.addaudithook()` (PEP 578) for low-level blocking
  - Blocks: `os.fork`, `os.system`, `os.popen`, `os.exec*`, `os.spawn*`, `subprocess.Popen`
  - Raises `RuntimeError` with a clear message about using Erlang ports instead
  - Automatically installed when the `py_event_loop` NIF is available
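The PEP 578 mechanism can be demonstrated with standard Python alone. This is a simplified sketch of the idea, not the library's installed hook (the blocked-event set and message are illustrative); note that audit hooks cannot be removed once added, so the hook stays in force for the life of the interpreter.

```python
import sys

BLOCKED = {"os.system", "os.fork", "os.popen", "subprocess.Popen"}

def sandbox_hook(event, args):
    # Audit hooks see every auditable runtime event; raising here aborts
    # the operation before it executes.
    if event in BLOCKED:
        raise RuntimeError(
            f"{event} is blocked inside the Erlang VM; use Erlang ports instead"
        )

sys.addaudithook(sandbox_hook)

import os
try:
    os.system("echo should not run")  # raises before spawning a shell
except RuntimeError as e:
    print("blocked:", e)
```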
- **Process-per-context architecture** - Each Python context runs in a dedicated process
  - `py_context_process` - Gen_server managing a single Python context
  - `py_context_sup` - Supervisor for context processes
  - `py_context_router` - Routes calls to the appropriate context process
  - Improved isolation between contexts
  - Better crash recovery and resource management
- **Worker thread pool** - High-throughput Python operations
  - Configurable pool size for parallel execution
  - Efficient work distribution across threads
- **`py:contexts_started/0`** - Helper to check if contexts are ready
- **`py:call_async` renamed to `py:cast`** - Follows the gen_server convention where `call` is synchronous and `cast` is asynchronous. The semantics are identical; only the name changed.
- **Unified `erlang` Python module** - Consolidated callback and event loop APIs
  - `erlang.run(coro)` - Run a coroutine with ErlangEventLoop (like uvloop.run)
  - `erlang.new_event_loop()` - Create a new ErlangEventLoop instance
  - `erlang.install()` - Install ErlangEventLoopPolicy (deprecated in 3.12+)
  - `erlang.EventLoopPolicy` - Alias for ErlangEventLoopPolicy
  - Removed the separate `erlang_asyncio` module - all functionality is now in `erlang`
- **Async worker backend replaced with event loop model** - The pthread+usleep polling async workers have been replaced with an event-driven model using `py_event_loop` and `enif_select`:
  - Removed `py_async_worker.erl` and `py_async_worker_sup.erl`
  - Removed the `py_async_worker_t` and `async_pending_t` structs from C code
  - Deprecated the `async_worker_new`, `async_call`, `async_gather`, `async_stream` NIFs
  - Added `py_event_loop_pool.erl` for managing event loop-based async execution
  - Added `py_event_loop:run_async/2` for submitting coroutines to event loops
  - Added the `nif_event_loop_run_async` NIF for direct coroutine submission
  - Added a `_run_and_send` wrapper in Python for result delivery via `erlang.send()`
  - Internal change: the `py:async_call/3,4` and `py:await/1,2` API is unchanged
- **`SuspensionRequired` base class** - Now inherits from `BaseException` instead of `Exception`. This prevents ASGI/WSGI middleware `except Exception` handlers from intercepting the suspension control flow used by `erlang.call()`.
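Why the base class matters can be shown with standard Python: control-flow signals derived from `BaseException` (like `KeyboardInterrupt` and `SystemExit`) pass straight through broad `except Exception` handlers. The classes below are stand-ins, not the library's implementation.

```python
class SuspensionRequired(BaseException):
    """Stand-in for the library's control-flow signal."""

def middleware(app):
    # Broad handler of the kind common in ASGI/WSGI middleware stacks.
    try:
        return app()
    except Exception:
        return "handled by middleware"

def app():
    raise SuspensionRequired()

try:
    middleware(app)
    caught_by_middleware = True
except SuspensionRequired:
    caught_by_middleware = False  # the signal escaped the broad handler

assert caught_by_middleware is False
```

Had `SuspensionRequired` subclassed `Exception`, the middleware would have swallowed it and the suspension protocol would silently break.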
- **Per-interpreter isolation in py_event_loop.c** - Removed global state for proper subinterpreter support. Each interpreter now has isolated event loop state.
- **ErlangEventLoopPolicy always returns ErlangEventLoop** - Previously it returned ErlangEventLoop only for the main thread; behavior is now consistent across all threads.
- **`py_asgi` module** - Deprecated in favor of the Channel API (`py_channel`) or Reactor API (`erlang.reactor`). The module still works but will be removed in a future release.
- **`py_wsgi` module** - Deprecated in favor of the Channel API (`py_channel`) or Reactor API (`erlang.reactor`). The module still works but will be removed in a future release.
- **Context affinity functions** - Removed `py:bind`, `py:unbind`, `py:is_bound`, `py:with_context`, and the `py:ctx_*` functions. The new `py_context_router` provides automatic scheduler-affinity routing. For explicit context control, use `py_context_router:bind_context/1` and `py_context:call/5`.
Signal handling support - Removed
add_signal_handler/remove_signal_handlerfrom ErlangEventLoop. Signal handling should be done at the Erlang VM level. Methods now raiseNotImplementedErrorwith guidance. -
Subprocess support - ErlangEventLoop raises
NotImplementedErrorforsubprocess_shellandsubprocess_exec. Use Erlang ports (open_port/2) for subprocess management instead.
- **`py_reactor_context` now extends the erlang module in subinterpreters** - Previously, `py_reactor_context` with `mode=subinterp` would fail to import `erlang.reactor` because the erlang module extension was not applied. It now calls `py_context:extend_erlang_module_in_context/1` after context creation.
- **FD stealing and UDP connected socket issues** - Fixed file descriptor handling for UDP sockets in connected mode
- **Context test expectations** - Updated tests for Python contextvars behavior
- **Unawaited coroutine warnings** - Fixed warnings in the test suite
- **Timer scheduling for standalone ErlangEventLoop** - Fixed timer callbacks not firing for loops created outside the main event loop infrastructure
- **Subinterpreter cleanup and thread worker re-registration** - Fixed cleanup issues when subinterpreters are destroyed and recreated
- **ProcessError exception class identity in subinterpreters** - Fixed an exception class mismatch when raising `erlang.ProcessError` in subinterpreter contexts. The exception class is now looked up from the current interpreter's `erlang` module at runtime instead of using a global variable.
- **Thread worker handlers not re-registering after app restart** - Workers now properly re-register when the application restarts
- **Timeout handling** - Improved timeout handling across the codebase
- **Eval locals_term initialization** - Fixed an uninitialized variable in eval
- **Two race conditions in worker pool** - Fixed concurrent access issues
- **`activate_venv/1` now processes `.pth` files** - Uses `site.addsitedir()` instead of `sys.path.insert()` so that editable installs (uv, pip -e, poetry) work correctly. New paths are moved to the front of `sys.path` for proper priority.
- **`deactivate_venv/0` now restores `sys.path`** - The previous implementation used `py:eval` with semicolon-separated statements, which silently failed (eval only accepts expressions). Switched to `py:exec` for correct statement execution.
- Async coroutine latency reduced from ~10-20ms to <1ms - The event loop model eliminates pthread polling overhead
- Zero CPU usage when idle - Event-driven instead of usleep-based polling
- No extra threads - Coroutines run on the existing event loop infrastructure
- **ASGI scope caching bug** - The HTTP method was not treated as a dynamic field in the scope template cache. This caused incorrect method values when the same path was accessed with different HTTP methods (e.g., GET /path followed by POST /path would return `method="GET"` for both requests).
- **ASGI NIF Optimizations** - Six optimizations for high-performance ASGI request handling
  - Direct Response Tuple Extraction - Extract `(status, headers, body)` directly without generic conversion
  - Pre-Interned Header Names - 16 common HTTP headers cached as PyBytes objects
  - Cached Status Code Integers - 14 common HTTP status codes cached as PyLong objects
  - Zero-Copy Request Body - Large bodies (≥1KB) use the buffer protocol for zero-copy access
  - Scope Template Caching - Thread-local cache of 64 scope templates keyed by path hash
  - Lazy Header Conversion - Headers converted on demand for requests with ≥4 headers
- **erlang_asyncio Module** - Asyncio-compatible primitives using Erlang's native scheduler
  - `erlang_asyncio.sleep(delay, result=None)` - Sleep using Erlang's `erlang:send_after/3`
  - `erlang_asyncio.run(coro)` - Run a coroutine with ErlangEventLoop
  - `erlang_asyncio.gather(*coros)` - Run coroutines concurrently
  - `erlang_asyncio.wait_for(coro, timeout)` - Wait with timeout
  - `erlang_asyncio.wait(fs, timeout, return_when)` - Wait for multiple futures
  - `erlang_asyncio.create_task(coro)` - Create a background task
  - `erlang_asyncio.ensure_future(coro)` - Wrap a coroutine in a Future
  - `erlang_asyncio.shield(arg)` - Protect from cancellation
  - `erlang_asyncio.timeout` - Context manager for timeouts
  - Event loop functions: `get_event_loop()`, `new_event_loop()`, `set_event_loop()`, `get_running_loop()`
  - Re-exports: `TimeoutError`, `CancelledError`, `ALL_COMPLETED`, `FIRST_COMPLETED`, `FIRST_EXCEPTION`
- **Erlang Sleep NIF** - Synchronous sleep primitive for Python
  - `py_event_loop._erlang_sleep(delay_ms)` - Sleep using an Erlang timer
  - Releases the GIL during sleep; no Python event loop overhead
  - Uses pthread condition variables for efficient blocking
  - `py_nif:dispatch_sleep_complete/2` - NIF to signal sleep completion
- **Scalable I/O Model** - Worker-per-context architecture
  - `py_event_worker` - Dedicated worker process per Python context
  - Combined FD event dispatch and reselect via the `handle_fd_event_and_reselect` NIF
  - Sleep tracking with a `sleeps` map in worker state
- **New Test Suite** - `test/py_erlang_sleep_SUITE.erl` with 8 tests
  - `test_erlang_sleep_available` - Verify the NIF is exposed
  - `test_erlang_sleep_basic` - Basic functionality
  - `test_erlang_sleep_zero` - Zero delay returns immediately
  - `test_erlang_sleep_accuracy` - Timing accuracy
  - `test_erlang_asyncio_module` - Module functions present
  - `test_erlang_asyncio_gather` - Concurrent execution
  - `test_erlang_asyncio_wait_for` - Timeout support
  - `test_erlang_asyncio_create_task` - Background tasks
- **ASGI marshalling optimizations** - 40-60% improvement for typical ASGI workloads
  - Direct response extraction: 5-10% improvement
  - Pre-interned headers: 3-5% improvement
  - Cached status codes: 1-2% improvement
  - Zero-copy body buffers: 10-15% for large bodies (≥1KB)
  - Scope template caching: 15-20% for repeated paths
  - Lazy header conversion: 5-10% for apps accessing few headers
- Eliminates event loop overhead for sleep operations (~0.5-1ms saved per call)
- Sub-millisecond timer precision via BEAM scheduler (vs 10ms asyncio polling)
- Zero CPU when idle - event-driven, no polling
- **Hex package missing priv directory** - Added explicit `files` configuration to include `priv/erlang_loop.py` and other necessary files in the hex.pm package
- **Shared Router Architecture for Event Loops**
  - A single `py_event_router` process handles all event loops
  - Timer and FD messages include loop identity for correct dispatch
  - Eliminates the need for per-loop router processes
  - Handle-based Python C API using PyCapsule for loop references
- **Per-Loop Capsule Architecture** - Each `ErlangEventLoop` instance has its own isolated capsule
  - Dedicated pending queue per loop for proper event routing
  - Full asyncio support (timers, FD operations) with correct loop isolation
  - Safe for multi-threaded Python applications where each thread needs its own loop
  - See `docs/asyncio.md` for usage and architecture details
- **ASGI headers now correctly use bytes instead of str** - Fixed an ASGI spec compliance issue where headers were being converted to Python `str` objects instead of `bytes`. The ASGI specification requires headers to be `list[tuple[bytes, bytes]]`. This was causing authentication failures and form parsing issues with frameworks like Starlette and FastAPI, which search for headers using bytes keys (e.g., `b"content-type"`).
  - Added explicit header handling in `asgi_scope_from_map()` to bypass generic conversion
  - Headers are now correctly converted using `PyBytes_FromStringAndSize()`
  - Supports both list `[name, value]` and tuple `{name, value}` header formats from Erlang
  - Fixes GitHub issue #1
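The failure mode is easy to reproduce in plain Python: frameworks compare header names against bytes keys, and `b"content-type" == "content-type"` is simply `False`, so `str` headers are silently never found. The `find_header` helper below is a simplified stand-in for how ASGI frameworks scan the scope.

```python
def find_header(scope, name: bytes):
    """Scan scope['headers'] for a bytes header name, as ASGI frameworks do."""
    for k, v in scope["headers"]:
        if k == name:
            return v
    return None

# Non-compliant scope: str headers (the pre-fix behavior)
str_scope = {"headers": [("content-type", "application/json")]}
# Compliant scope per the ASGI spec: list[tuple[bytes, bytes]]
bytes_scope = {"headers": [(b"content-type", b"application/json")]}

assert find_header(str_scope, b"content-type") is None          # lookup misses
assert find_header(bytes_scope, b"content-type") == b"application/json"
```

That silent miss is exactly why authentication (missing `authorization`) and form parsing (missing `content-type`) broke rather than raising an error.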
- **Python Logging Integration** - Forward Python's `logging` module to Erlang's `logger`
  - `py:configure_logging/0,1` - Set up Python logging to forward to Erlang
  - `erlang.ErlangHandler` - Python logging handler that sends to Erlang
  - `erlang.setup_logging(level, format)` - Configure logging from Python
  - Fire-and-forget architecture using `enif_send()` for non-blocking messaging
  - Level filtering at the NIF level for performance (skips message creation for filtered logs)
  - Log metadata includes module, line number, and function name
  - Thread-safe - works from any Python thread
- **Distributed Tracing** - Collect trace spans from Python code
  - `py:enable_tracing/0`, `py:disable_tracing/0` - Enable/disable span collection
  - `py:get_traces/0` - Retrieve collected spans
  - `py:clear_traces/0` - Clear collected spans
  - `erlang.Span(name, **attrs)` - Context manager for creating spans
  - `erlang.trace(name)` - Decorator for tracing functions
  - Span events via `span.event(name, **attrs)`
  - Automatic parent/child span linking via thread-local storage
  - Error status capture with exception details
  - Duration tracking in microseconds
- **New Erlang modules**
  - `py_logger` - gen_server receiving log messages from Python workers
  - `py_tracer` - gen_server collecting and managing trace spans
- **New C source**
  - `c_src/py_logging.c` - NIF implementations for logging and tracing
- **Documentation and examples**
  - `docs/logging.md` - Logging and tracing documentation
  - `examples/logging_example.erl` - Working escript example
  - Updated `docs/getting-started.md` with a logging/tracing section
- **New test suite**
  - `test/py_logging_SUITE.erl` - 9 tests for logging and tracing
- `ATOM_NIL` for Elixir `nil` compatibility in type conversions
- **Type conversion optimizations** - Faster Python ↔ Erlang marshalling
  - Use `enif_is_identical` for atom comparison instead of `strcmp`
  - Use `PyLong_AsLongLongAndOverflow` to avoid exception machinery
  - Cache the `numpy.ndarray` type at init for fast isinstance checks
  - Stack-allocate small tuples/maps (≤16 elements) to avoid heap allocation
  - Use `enif_make_map_from_arrays` for O(n) map building vs O(n²) puts
  - Reorder type checks for web workloads (strings/dicts first)
  - UTF-8 decode with bytes fallback for invalid sequences
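The last item's decode policy is straightforward to express in Python. This is a sketch of the behavior described above (the function name is illustrative; the real conversion happens in the NIF): try UTF-8 first, and surface invalid sequences as raw bytes instead of raising.

```python
def binary_to_python(data: bytes):
    """Decode an Erlang binary as UTF-8, falling back to bytes on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data  # invalid sequences come through as bytes, not an error

assert binary_to_python(b"hello") == "hello"
assert binary_to_python(b"\xff\xfe") == b"\xff\xfe"
```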
- **Fire-and-forget NIF architecture** - Log and trace calls never block Python execution
  - Uses `enif_send()` to dispatch messages asynchronously to Erlang processes
  - Python code continues immediately after sending; no round-trip wait
- **NIF-level log filtering** - Messages below the threshold are discarded before term creation
  - Volatile bool flags for O(1) receiver availability checks
  - Level threshold stored in a C global; no Erlang callback needed
- **Minimal term allocation** - Direct Erlang term building without intermediate structures
  - Timestamps captured at the NIF level using `enif_monotonic_time()`
- **Python 3.12+ event loop thread isolation** - Fixed asyncio timeouts on Python 3.12+
  - `ErlangEventLoop` is now only used for the main thread; worker threads get `SelectorEventLoop`
  - Async worker threads bypass the policy to create `SelectorEventLoop` directly
  - Per-call `ErlNifEnv` for thread-safe timer scheduling in free-threaded mode
  - Fail-fast error handling in `erlang_loop.py` instead of silent hangs
  - Added `gil_acquire()`/`gil_release()` helpers to avoid GIL double-acquisition
- **`py_asgi` module** - Optimized ASGI request handling with:
  - Pre-interned Python string keys (15+ ASGI scope keys)
  - Cached constant values (http type, HTTP versions, methods, schemes)
  - Thread-local response pooling (16 slots per thread, 4KB initial buffer)
  - Direct NIF path bypassing generic `py:call()`
  - ~60-80% throughput improvement over `py:call()`
  - Configurable runner module via the `runner` option
  - Sub-interpreter and free-threading (Python 3.13+) support
- **`py_wsgi` module** - Optimized WSGI request handling with:
  - Pre-interned WSGI environ keys
  - Direct NIF path for marshalling
  - ~60-80% throughput improvement over `py:call()`
  - Sub-interpreter and free-threading support
- **Web frameworks documentation** - New documentation at `docs/web-frameworks.md`
- **Erlang-native asyncio event loop** - Custom asyncio event loop backed by Erlang's scheduler
  - `ErlangEventLoop` class in `priv/erlang_loop.py`
  - Sub-millisecond latency via Erlang's `enif_select` (vs 10ms polling)
  - Zero CPU usage when idle - no busy-waiting or polling overhead
  - Full GIL release during waits for better concurrency
  - Native Erlang scheduler integration for I/O events
  - Event loop policy via `get_event_loop_policy()`
- **TCP support for asyncio event loop**
  - `create_connection()` - TCP client connections
  - `create_server()` - TCP server with accept loop
  - `_ErlangSocketTransport` - Non-blocking socket transport with write buffering
  - `_ErlangServer` - TCP server with `serve_forever()` support
- **UDP/datagram support for asyncio event loop**
  - `create_datagram_endpoint()` - Create UDP endpoints with full parameter support
  - `_ErlangDatagramTransport` - Datagram transport implementation
  - Parameters: `local_addr`, `remote_addr`, `reuse_address`, `reuse_port`, `allow_broadcast`
  - `DatagramProtocol` callbacks: `datagram_received()`, `error_received()`
  - Support for both connected and unconnected UDP
  - New NIF helpers: `create_test_udp_socket`, `sendto_test_udp`, `recvfrom_test_udp`, `set_udp_broadcast`
  - New test suite: `test/py_udp_e2e_SUITE.erl`
- **Asyncio event loop documentation**
  - New documentation: `docs/asyncio.md`
  - Updated `docs/getting-started.md` with a link to the asyncio documentation
- Event loop optimizations
  - Fixed `run_until_complete` callback removal bug (was using two different lambda references)
  - Cached the `ast.literal_eval` lookup at module initialization (avoids an import per callback)
  - O(1) timer cancellation via a handle-to-callback_id reverse map (was O(n) iteration)
  - Detach the pending queue under the mutex, build Erlang terms outside the lock (reduced contention)
  - O(1) duplicate event detection using a hash set (was an O(n) linear scan)
  - Added a `PERF_BUILD` cmake option for aggressive optimizations (-O3, LTO, -march=native)
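The reverse-map idea behind the O(1) timer cancellation can be sketched in a few lines (all names here are illustrative, not the actual C data structures):

```python
# Sketch: instead of an O(n) scan over all timers to find a handle,
# keep a handle -> callback_id map so cancellation is two dict deletes.
class Handle:
    """Stand-in for the loop's timer handle object."""

timers = {}    # callback_id -> Handle
reverse = {}   # id(Handle) -> callback_id

def schedule(callback_id):
    h = Handle()
    timers[callback_id] = h
    reverse[id(h)] = callback_id
    return h

def cancel(handle):
    callback_id = reverse.pop(id(handle), None)  # O(1) lookup, no iteration
    if callback_id is not None:
        del timers[callback_id]
    return callback_id

h = schedule(42)
cid = cancel(h)
print(cid, timers)  # 42 {}
```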
- torch/PyTorch introspection compatibility - Fixed `AttributeError: 'erlang.Function' object has no attribute 'endswith'` when importing torch or sentence_transformers in contexts where erlang_python callbacks are registered.
  - Root cause: torch does dynamic introspection during import, iterating through Python's namespace and calling `.endswith()` on objects. The `erlang` module's `__getattr__` was returning `ErlangFunction` wrappers for any attribute access.
  - Solution: Added a C-side callback name registry. `__getattr__` now returns `ErlangFunction` wrappers only for actually registered callbacks; unregistered attributes raise `AttributeError` (normal Python behavior).
  - New test: `test_callback_name_registry` in `py_reentrant_SUITE.erl`
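The fix can be illustrated with a minimal pure-Python sketch (the `_registered` set and wrapper class are stand-ins; the real registry lives on the C side):

```python
# Sketch of the callback-name-registry fix: __getattr__ returns a wrapper
# only for registered callbacks and raises AttributeError otherwise, so
# introspecting code (like torch's import-time scan) sees normal behavior.
_registered = {"my_func"}  # hypothetical registry of registered callbacks

class ErlangFunction:
    def __init__(self, name):
        self.name = name

def __getattr__(name):  # module-level __getattr__, as in the erlang module
    if name in _registered:
        return ErlangFunction(name)
    raise AttributeError(name)

f = __getattr__("my_func")
raised = False
try:
    __getattr__("endswith")  # what torch-style introspection probes for
except AttributeError:
    raised = True  # hasattr()/getattr() probes now behave normally

print(f.name, raised)  # my_func True
```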
- Hex.pm packaging - Added a `files` section to app.src to include the build scripts (`do_cmake.sh`, `do_build.sh`) and other necessary files in the hex.pm package
- Asyncio Support - New `erlang.async_call()` for asyncio-compatible callbacks
  - `await erlang.async_call('func', arg1, arg2)` - Call Erlang from async Python code
  - Integrates with the asyncio event loop via `add_reader()`
  - No exceptions raised for control flow (unlike `erlang.call()`)
  - Releases the dirty NIF thread while waiting (non-blocking)
  - Works with FastAPI, Starlette, aiohttp, and other ASGI frameworks
  - Supports concurrent calls via `asyncio.gather()`
  - New test: `test_async_call` in `py_reentrant_SUITE.erl`
  - New test module: `test/py_test_async.py`
  - Updated documentation: `docs/threading.md` - Added an Asyncio Support section
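The concurrency shape looks like the sketch below, where a local coroutine stands in for `erlang.async_call()` (the real binding awaits the Erlang side instead of sleeping):

```python
import asyncio

# Stand-in sketch: async_call here is a local coroutine modeling
# erlang.async_call()'s awaitable behavior, not the real binding.
async def async_call(func, *args):
    await asyncio.sleep(0.01)  # models waiting on the Erlang side
    return (func, args)

async def main():
    # Concurrent calls via asyncio.gather(); results keep argument order
    results = await asyncio.gather(
        async_call("add", 1, 2),
        async_call("mul", 3, 4),
    )
    return [name for name, _ in results]

names = asyncio.run(main())
print(names)  # ['add', 'mul']
```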
- Flag-based callback detection in the replay path - Fixed `SuspensionRequired` exceptions leaking when ASGI middleware catches and re-raises exceptions. The replay path in `nif_resume_callback_dirty` now uses flag-based detection (checking `tl_pending_callback`) instead of exception-type detection.
- C code optimizations and refactoring
  - Thread safety fixes: Used `pthread_once` for async callback initialization; fixed a mutex held during Python calls in the async event loop thread
  - Timeout handling: Added `read_with_timeout()` and `read_length_prefixed_data()` helpers with proper timeouts on all blocking pipe reads (30s for callbacks, 10s for spawns)
  - Code deduplication: Merged `create_suspended_state()` and `create_suspended_state_from_existing()` into a unified `create_suspended_state_ex()`; extracted `build_pending_callback_exc_args()` and `build_suspended_result()` helpers
  - Performance: Optimized list conversion using `enif_make_list_cell()` to build lists directly without a temporary array allocation
  - Removed the unused `make_suspended_term()` function
- Context Affinity - Bind Erlang processes to dedicated Python workers for state persistence
  - `py:bind()`/`py:unbind()` - Bind the current process to a worker, preserving Python state
  - `py:bind(new)` - Create explicit context handles for multiple contexts per process
  - `py:with_context(Fun)` - Scoped helper with automatic bind/unbind
  - Context-aware functions: `py:ctx_call/4-6`, `py:ctx_eval/2-4`, `py:ctx_exec/2`
  - Automatic cleanup via process monitors when bound processes die
  - O(1) ETS-based binding lookup for minimal overhead
  - New test suite: `test/py_context_SUITE.erl`
- Python Thread Support - Any spawned Python thread can now call `erlang.call()` without blocking
  - Supports `threading.Thread`, `concurrent.futures.ThreadPoolExecutor`, and any other Python threads
  - Each spawned thread lazily acquires a dedicated "thread worker" channel
  - One lightweight Erlang process per Python thread handles callbacks
  - Automatic cleanup when a Python thread exits via a `pthread_key_t` destructor
  - New module: `py_thread_handler.erl` - Coordinator and per-thread handlers
  - New C file: `py_thread_worker.c` - Thread worker pool management
  - New test suite: `test/py_thread_callback_SUITE.erl`
  - New documentation: `docs/threading.md` - Threading support guide
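The per-thread channel acquisition can be sketched with Python thread-local storage (the real implementation uses a `pthread_key_t` destructor in the NIF; `get_channel` and the channel strings are illustrative):

```python
import itertools
import threading

# Sketch of per-thread channel affinity: each thread lazily gets its own
# channel on first use, and later uses from the same thread reuse it.
_tls = threading.local()
_next_id = itertools.count()
_lock = threading.Lock()

def get_channel():
    """Lazily assign this thread its own channel on first use."""
    if not hasattr(_tls, "channel"):
        with _lock:
            _tls.channel = f"chan-{next(_next_id)}"
    return _tls.channel

seen = []
def worker():
    first, second = get_channel(), get_channel()
    seen.append((first, second))  # list.append is thread-safe in CPython

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Stable within a thread, distinct across threads
print(all(a == b for a, b in seen), len({a for a, _ in seen}))
```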
- Reentrant Callbacks - Python→Erlang→Python callback chains without deadlocks
  - Exception-based suspension mechanism interrupts Python execution cleanly
  - Callbacks execute in separate processes to prevent worker pool exhaustion
  - Supports arbitrarily deep nesting (tested up to 10+ levels)
  - Transparent to users - `erlang.call()` works the same, just without deadlocks
  - New test suite: `test/py_reentrant_SUITE.erl`
  - New examples: `examples/reentrant_demo.erl` and `examples/reentrant_demo.py`
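The suspend-then-replay idea can be modeled in pure Python (all names below are illustrative; the real mechanism lives in the NIF and the suspension exception is raised from C):

```python
# Sketch of exception-based suspension: raising a dedicated exception unwinds
# Python cleanly; the host catches it, services the callback, then replays
# the user code, which now finds the result and proceeds past the call site.
class SuspensionRequired(Exception):
    def __init__(self, func, args):
        self.func, self.args = func, args

_results = {}  # completed callback results, keyed by (func, args)

def erlang_call(func, *args):
    key = (func, args)
    if key in _results:
        return _results[key]               # replay path: result available
    raise SuspensionRequired(func, args)   # first hit: suspend execution

def run(user_fn):
    while True:
        try:
            return user_fn()
        except SuspensionRequired as s:
            # The "Erlang side" services the callback, then replays user_fn
            _results[(s.func, s.args)] = sum(s.args)

def user_code():
    return erlang_call("add", 1, 2) + 10

result = run(user_code)
print(result)  # 13
```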
- Callback handlers now spawn separate processes for execution, allowing workers to remain available for nested `py:eval`/`py:call` operations
- Modular C code structure - Split the monolithic `py_nif.c` (4,335 lines) into logical modules for better maintainability:
  - `py_nif.h` - Shared header with types, macros, and declarations
  - `py_convert.c` - Bidirectional type conversion (Python ↔ Erlang)
  - `py_exec.c` - Python execution engine and GIL management
  - `py_callback.c` - Erlang callback support and asyncio integration
  - Uses an `#include` approach for a single compilation unit (no build changes needed)
- Multiple sequential erlang.call() - Fixed an infinite loop when Python code makes multiple sequential `erlang.call()` invocations in the same function. The replay mechanism now falls back to blocking pipe behavior for subsequent calls after the first suspension, preventing the infinite replay loop.
- Memory safety in the C NIF - Fixed memory leaks and added NULL checks
  - `nif_async_worker_new`: msg_env now freed on pipe/thread creation failure
  - `multi_executor_stop`: shutdown requests now properly freed after join
  - `create_suspended_state`: binary allocations cleaned up on failure paths
  - Added NULL checks on all `enif_alloc_resource` and `enif_alloc_env` calls
- Dialyzer warnings - Added the `{suspended, ...}` return type to NIF specs for the `worker_call`, `worker_eval`, and `resume_callback` functions
- Dead code removal - Cleaned up unused code discovered during code review:
  - Removed the `execute_direct()` function in `py_exec.c` (duplicated inline logic)
  - Removed the unused `ref` field from the `async_pending_t` struct in `py_nif.h`
  - Removed `worker_recv/2` from `py_nif.erl` (declared but never implemented in C)
- Doxygen-style C documentation - Added documentation to all C source files:
  - Architecture overview with execution mode diagrams
  - Type mapping tables for conversions
  - GIL management patterns and best practices
  - Suspension/resume flow diagrams for callbacks
  - Function-level `@param`, `@return`, `@pre`, `@warning`, and `@see` annotations
- Shared State API - ETS-backed storage for sharing data between Python workers
  - `state_set/get/delete/keys/clear` accessible from Python via `from erlang import ...`
  - `py:state_store/fetch/remove/keys/clear` from Erlang
  - Atomic counters with `state_incr/decr` (Python) and `py:state_incr/decr` (Erlang)
  - New example: `examples/shared_state_example.erl`
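The semantics of the shared store and its atomic counters can be sketched in pure Python (the real store is ETS-backed on the Erlang side; the function names mirror the Python-side API listed above, and the lock stands in for ETS's atomicity):

```python
import threading

# Minimal sketch of a shared store with atomic counters. A single lock
# models the atomicity that ETS update_counter provides in the real thing.
_lock = threading.Lock()
_store = {}

def state_set(key, value):
    with _lock:
        _store[key] = value

def state_get(key, default=None):
    with _lock:
        return _store.get(key, default)

def state_incr(key, amount=1):
    """Atomic read-modify-write, as a counter increment must be."""
    with _lock:
        _store[key] = _store.get(key, 0) + amount
        return _store[key]

state_set("hits", 0)
threads = [threading.Thread(target=state_incr, args=("hits",)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(state_get("hits"))  # 50 - no lost updates
```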
- Native Python Import Syntax for Erlang callbacks
  - `from erlang import my_func; my_func(args)` - most Pythonic
  - `erlang.my_func(args)` - attribute-style access
  - `erlang.call('my_func', args)` - legacy syntax still works
- Module Reload - Reload Python modules across all workers during development
  - `py:reload(module)` uses `importlib.reload()` to refresh modules from disk
  - `py_pool:broadcast` for sending requests to all workers
- Documentation improvements
  - Added a shared state section to the getting-started, scalability, and ai-integration guides
  - Added an embedding caching example using shared state
  - Added hex.pm badges to the README
- Memory safety - Added NULL checks to all `enif_alloc()` calls in NIF code
- Worker resilience - Fixed a crash in `py_subinterp_pool:terminate` when workers were undefined
- Streaming example - Fixed to work with the worker pool design (workers don't share a namespace)
- ETS table ownership - Moved `py_callbacks` table creation to the supervisor for resilience
- Created a `py_util` module to consolidate duplicate code (`to_binary/1`, `send_response/3`, `normalize_timeout/1-2`)
- Consolidated `async_await/2` to call `await/2`, reducing duplication
Initial release of erlang_python - Execute Python from Erlang/Elixir using dirty NIFs.
- Python Integration
  - Call Python functions with `py:call/3-5`
  - Evaluate expressions with `py:eval/1-3`
  - Execute statements with `py:exec/1-2`
  - Stream from Python generators with `py:stream/3-4`
- Multiple Execution Modes (auto-detected)
  - Free-threaded Python 3.13+ (no GIL, true parallelism)
  - Sub-interpreters Python 3.12+ (per-interpreter GIL)
  - Multi-executor for older Python versions
- Worker Pools
  - Main worker pool for synchronous calls
  - Async worker pool for asyncio coroutines
  - Sub-interpreter pool for parallel execution
- Erlang/Elixir Callbacks
  - Register functions callable from Python via `py:register_function/2-3`
  - Python code calls back with `erlang.call('name', args...)`
- Virtual Environment Support
  - Activate venvs with `py:activate_venv/1`
  - Use isolated package dependencies
- Rate Limiting
  - ETS-based semaphore prevents overload
  - Configurable max concurrent operations
- Type Conversion
  - Automatic conversion between Erlang and Python types
  - Integers, floats, strings, lists, tuples, maps/dicts, booleans
- Memory Management
  - Access Python GC stats with `py:memory_stats/0`
  - Force garbage collection with `py:gc/0-1`
  - Memory tracing with `py:tracemalloc_start/stop`
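On the Python side these wrap the stdlib `gc` and `tracemalloc` machinery; a minimal sketch of that underlying flow (the `stats` dict shape is illustrative, not the actual `py:memory_stats/0` return format):

```python
import gc
import tracemalloc

# Sketch of the stdlib machinery behind py:memory_stats/0, py:gc/0-1 and
# py:tracemalloc_start/stop (the Erlang API names come from the changelog).
tracemalloc.start()
data = [bytes(10_000) for _ in range(100)]  # allocate ~1 MB to trace
current, peak = tracemalloc.get_traced_memory()
del data
collected = gc.collect()                    # what a forced GC pass triggers
tracemalloc.stop()

stats = {"peak_bytes": peak, "collected": collected}
print(stats["peak_bytes"] > 0)  # True
```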
- `semantic_search.erl` - Text embeddings and similarity search
- `rag_example.erl` - Retrieval-Augmented Generation with Ollama
- `ai_chat.erl` - Interactive LLM chat
- `erlang_concurrency.erl` - 10x speedup with BEAM processes
- `elixir_example.exs` - Full Elixir integration demo
- Getting Started guide
- AI Integration guide
- Type Conversion reference
- Scalability and performance tuning
- Streaming with generators