Description
Required prerequisites
- Make sure you've read the documentation. Your issue may be addressed there.
- Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
- Consider asking first in the Gitter chat room or in a Discussion.
What version (or hash if on master) of pybind11 are you using?
3.0.2
Problem description
Summary
On Python 3.14, instances of any py::dynamic_attr() class do not release the objects stored in their __dict__ when the instance is deallocated. Concretely, a capsule-backed (zero-copy) py::array stored as an instance attribute via obj.attr("x") = arr is never freed: the capsule destructor is never called, and memory grows with every such object created.
The same code works correctly on Python ≤ 3.13.
Environment
- pybind11: 3.0.2
- Python: 3.14.3 (conda-forge)
- numpy: 2.4.2
- OS: Linux x86_64, GCC 14.3.0
- Python 3.13.x: not affected
Minimal reproducer
pybind11_leak.cpp
/*
* Minimal pybind11 example reproducing memory leak with Python 3.14.
*
* A numpy array is created zero-copy with a capsule as base object.
* The capsule destructor should free the underlying C++ data when
* the numpy array is garbage-collected. In Python <= 3.13 this works
* correctly; in Python 3.14 the destructor is never called.
*/
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <iostream>
#include <vector>
namespace py = pybind11;
struct Data {
    std::vector<double> values;
    explicit Data(std::size_t n) : values(n) {}
    ~Data() { std::cout << "Data deleted\n" << std::flush; }
};

// Plain capsule-backed array: works correctly on Python 3.14
py::array make_array(std::size_t n = 5) {
    auto *data = new Data(n);
    py::capsule base(data, [](void *ptr) { delete static_cast<Data *>(ptr); });
    return py::array(py::dtype::of<double>(), {n}, {sizeof(double)},
                     data->values.data(), base);
}

struct Container {};

// Array stored in a py::dynamic_attr() object: leaks on Python 3.14
py::object make_container(std::size_t n = 5) {
    auto *data = new Data(n);
    py::capsule base(data, [](void *ptr) { delete static_cast<Data *>(ptr); });
    auto arr = py::array(py::dtype::of<double>(), {n}, {sizeof(double)},
                         data->values.data(), base);
    py::object obj = py::cast(Container{});
    obj.attr("value") = arr; // store in __dict__ of dynamic_attr object
    return obj;
}

PYBIND11_MODULE(pybind11_leak, m) {
    m.def("make_array", &make_array, py::arg("n") = 5);
    py::class_<Container>(m, "Container", py::dynamic_attr())
        .def(py::init<>());
    m.def("make_container", &make_container, py::arg("n") = 5);
}
test_leak.py
"""
Test: numpy array created zero-copy via pybind11 capsule should free
underlying C++ data when the array goes out of scope.
Expected (Python <= 3.13):
- "Data deleted" printed after each call to use_array()
- Memory usage stays flat
Observed (Python 3.14):
- "Data deleted" never printed
- Memory grows with each iteration
"""
import gc, resource, pybind11_leak, sys, pybind11


def mem_kb():
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def use_array():
    arr = pybind11_leak.make_array(n=100_000)  # ~800 KB


def use_container_no_load():
    # No Python-side LOAD_ATTR: just store and del
    container = pybind11_leak.make_container(n=100_000)
    del container


def use_container():
    container = pybind11_leak.make_container(n=100_000)
    value = container.value  # LOAD_ATTR
    del container            # explicit del of container


def use_dict():
    arr = pybind11_leak.make_array(n=100_000)
    d = {"value": arr}
    del arr
    value = d["value"]  # LOAD from plain Python dict
    del d


if __name__ == "__main__":
    print(f"Python {sys.version}")
    print(f"Pybind11 {pybind11.__version__}\n")
    print("=== plain py::array + capsule ===")
    for i in range(5):
        use_array(); gc.collect()
        print(f" iter {i}: mem_kb = {mem_kb()}")
    print("\n=== dynamic_attr(), NO LOAD_ATTR (just del container) ===")
    for i in range(5):
        use_container_no_load(); gc.collect()
        print(f" iter {i}: mem_kb = {mem_kb()}")
    print("\n=== dynamic_attr(), LOAD_ATTR then let value go out of scope ===")
    for i in range(5):
        use_container(); gc.collect()
        print(f" iter {i}: mem_kb = {mem_kb()}")
    print("\n=== plain Python dict container, LOAD_ATTR ===")
    for i in range(5):
        use_dict(); gc.collect()
        print(f" iter {i}: mem_kb = {mem_kb()}")
Build:
c++ -O2 -std=c++17 -shared -fPIC \
$(python3-config --includes) \
-I$(python -c "import pybind11; print(pybind11.get_include())") \
pybind11_leak.cpp -o pybind11_leak$(python3-config --extension-suffix)
Output on Python 3.14 (unbuffered, python -u):
Python 3.14.3 ...
=== plain py::array + capsule ===
Data deleted
iter 0: mem_kb = 28880
Data deleted
iter 1: mem_kb = 29404
... (stable)
=== dynamic_attr(), NO LOAD_ATTR (just del container) ===
iter 0: mem_kb = 29404 ← grows ~792 KB each iter
iter 1: mem_kb = 30196
iter 2: mem_kb = 30988
iter 3: mem_kb = 31780
iter 4: mem_kb = 32572
← "Data deleted" NEVER printed, not even at script exit
=== dynamic_attr(), LOAD_ATTR then let value go out of scope ===
iter 0: mem_kb = 33364 ← same leak pattern
...
=== plain Python dict container, LOAD_ATTR ===
Data deleted
iter 0: mem_kb = 41020 ← stable, destructor called immediately
...
Key observations:
- "Data deleted" is never printed for dynamic_attr() containers, not even at program exit. The 10 capsules (2 dynamic_attr() test cases × 5 iterations) are permanently leaked.
- The leak triggers even without LOAD_ATTR — just storing arr in the object's dict then deleting the object is sufficient to reproduce it.
- Plain Python dict works correctly — so this is specific to pybind11's managed-dict implementation for py::dynamic_attr() types.
- gc.collect() does not help (no reference cycles are involved).
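Because ru_maxrss only reports peak memory, a more direct check is to watch whether the stored attribute itself is released. The following is a minimal sketch of such a check using weakref. Sentinel, PlainContainer and attr_released are hypothetical names introduced here for illustration; PlainContainer is a pure-Python stand-in so the snippet runs without the extension module.

```python
import gc
import weakref


class Sentinel:
    """Weak-referenceable stand-in for the capsule-backed array (hypothetical)."""


class PlainContainer:
    """Pure-Python stand-in for a dynamic_attr() class (hypothetical)."""


def attr_released(container_factory):
    """Return True if an attribute value is freed once its container is dropped."""
    container = container_factory()
    sentinel = Sentinel()
    container.value = sentinel      # stored in the instance __dict__
    probe = weakref.ref(sentinel)
    del sentinel                    # the container now holds the only reference
    del container                   # should release the attribute as well
    gc.collect()                    # rule out reference cycles
    return probe() is None


if __name__ == "__main__":
    print("attribute released:", attr_released(PlainContainer))
```

Passing pybind11_leak.Container instead of PlainContainer should exercise the binding from the reproducer; if the diagnosis in this report holds, the function would be expected to return False on an affected Python 3.14 build.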
Root cause
pybind11_object_dealloc in pybind11/detail/class.h (lines 496–513) calls type->tp_free(self) without first calling PyObject_ClearManagedDict(self):
extern "C" inline void pybind11_object_dealloc(PyObject *self) {
    auto *type = Py_TYPE(self);
    if (PyType_HasFeature(type, Py_TPFLAGS_HAVE_GC) != 0) {
        PyObject_GC_UnTrack(self);
    }
    clear_instance(self);
    type->tp_free(self); // ← managed dict is never cleared here on Python 3.14
    Py_DECREF(type);
}
When py::dynamic_attr() is enabled, pybind11 sets Py_TPFLAGS_MANAGED_DICT on the type (since Python 3.13, via enable_dynamic_attributes). The managed dict stores instance attributes; in this case, the numpy array holding a reference to the capsule.
On Python ≤ 3.13, PyObject_GC_Del (what tp_free resolves to for GC-tracked objects) apparently cleared the managed dict as a side effect, so refcounts were decremented correctly. On Python 3.14, this implicit clearing was
removed, requiring an explicit PyObject_ClearManagedDict(self) call in tp_dealloc before tp_free. Without it, the refcounts of all objects in __dict__ are permanently abandoned.
This is consistent with the fact that pybind11_clear (which correctly calls PyObject_ClearManagedDict) exists but is only invoked by the cyclic GC (via tp_clear), not during normal tp_dealloc.
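The effect of skipping the dict-clearing step can be illustrated with a toy reference-counting model. This is only a sketch of the reasoning above, not CPython's or pybind11's actual implementation; Obj, decref, dealloc and run are hypothetical names.

```python
class Obj:
    """Toy refcounted object; not CPython's real object layout."""

    def __init__(self, name, freed):
        self.name = name
        self.refcount = 1
        self.attrs = {}      # plays the role of the managed dict
        self.freed = freed   # shared log of freed object names


def decref(obj, clear_dict_in_dealloc):
    obj.refcount -= 1
    if obj.refcount == 0:
        dealloc(obj, clear_dict_in_dealloc)


def dealloc(obj, clear_dict_in_dealloc):
    if clear_dict_in_dealloc:
        # Analogue of PyObject_ClearManagedDict: drop each stored reference.
        for value in list(obj.attrs.values()):
            decref(value, clear_dict_in_dealloc)
        obj.attrs.clear()
    # Analogue of tp_free: only the object's own storage is released.
    obj.freed.append(obj.name)


def run(clear_dict_in_dealloc):
    freed = []
    container = Obj("container", freed)
    array = Obj("array", freed)
    container.attrs["value"] = array  # the dict owns the array's only reference
    decref(container, clear_dict_in_dealloc)
    return freed


if __name__ == "__main__":
    print("with clearing:   ", run(True))
    print("without clearing:", run(False))
```

With clearing enabled, both the array and the container are freed. Without it, only the container is freed and the array's count never reaches zero, which mirrors the missing "Data deleted" output.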
Proposed fix
In pybind11/detail/class.h, add a PyObject_ClearManagedDict call to pybind11_object_dealloc:
extern "C" inline void pybind11_object_dealloc(PyObject *self) {
    auto *type = Py_TYPE(self);
    if (PyType_HasFeature(type, Py_TPFLAGS_HAVE_GC) != 0) {
        PyObject_GC_UnTrack(self);
    }
#if PY_VERSION_HEX >= 0x030D0000
    // On Python 3.14+, PyObject_GC_Del no longer implicitly clears the managed
    // dict. Without this call, objects stored in __dict__ of py::dynamic_attr()
    // types have their refcounts abandoned, causing permanent memory leaks.
    // The guard also covers 3.13, where the extra call is harmless.
    if (PyType_HasFeature(type, Py_TPFLAGS_MANAGED_DICT) != 0) {
        PyObject_ClearManagedDict(self);
    }
#endif
    clear_instance(self);
    type->tp_free(self);
    Py_DECREF(type);
}
PyObject_ClearManagedDict is idempotent, so calling it before tp_free is safe on Python 3.13 as well (there, the implicit clearing at tp_free time simply finds the dict already empty).
Confirmed workaround
Explicitly deleting dict entries before the container is freed avoids the leak:
container = pybind11_leak.make_container(n=100_000)
del container.value # remove from __dict__ while object is still alive
del container # now tp_dealloc has nothing to abandon
This confirms that the fix must happen inside tp_dealloc, before tp_free is called.
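Until a fix lands, the workaround can be generalized to objects with several attributes. The helper below is a minimal sketch; drop_dynamic_attrs, Sentinel, PlainContainer and demo are hypothetical names, and the demo uses a pure-Python class so it runs without the extension. It assumes the bound object exposes __dict__ through vars(), which has not been verified against the reproducer.

```python
import weakref


def drop_dynamic_attrs(obj):
    """Delete every instance attribute of obj while it is still alive."""
    for name in list(vars(obj)):   # copy the keys; the dict shrinks as we go
        delattr(obj, name)


class Sentinel:
    """Weak-referenceable stand-in for the stored array (hypothetical)."""


class PlainContainer:
    """Pure-Python stand-in for a dynamic_attr() class (hypothetical)."""


def demo():
    container = PlainContainer()
    container.value = Sentinel()
    probe = weakref.ref(container.value)
    drop_dynamic_attrs(container)          # same effect as `del container.value`
    released_before_del = probe() is None
    del container                          # nothing left in __dict__ to abandon
    return released_before_del


if __name__ == "__main__":
    print("released before container deletion:", demo())
```

Calling drop_dynamic_attrs(container) just before the last reference to a container goes away mirrors the confirmed `del container.value` step, one attribute at a time.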
Reproducible example code
See above.
Is this a regression? Put the last known working version here if it is.
Not a regression