OpenKL

OpenKL is a high-performance memory pooling library for accelerator-style workloads, with first-class C++ and Python APIs.

It supports CUDA when available and falls back to host-backed allocation stubs when CUDA is not present (useful for macOS/Windows development machines).

Highlights

Fixed-size MemoryPool with O(1) allocation/deallocation.
Variable-size SlabPool using configurable size classes.
Batch allocation APIs and detailed telemetry/stats.
Runtime exhaustion policies: Throw, Wait, Grow (scaffold only), FallbackRaw.
RAII and typed helpers (PooledPtr, allocate_t<T>()).
No-exception pathways (Status, ErrorCode, C API _ex calls).
Multi-GPU routing with affinity policies and peer-access helpers.
C API for embedding and Python bindings via pybind11.
Optional backend abstraction scaffold (host, CUDA, HIP stub).

Requirements

CMake 3.18+
C++17 compiler
Python 3.8+ (for Python package/tests)
Optional CUDA: CUDA 11+ and nvcc in PATH

Quick Start

Build (C++)

cd /path/to/OpenKL
./build.sh

This configures and builds the project, then runs the C++ tests.

Windows helpers (PowerShell first, then CMD):

.\build_windows.ps1

build_windows.bat

Python

cd /path/to/OpenKL/python
python -m pip install -e .

Or, from the repository root:

pip install -e ./python

Usage

C++: MemoryPool

#include "openkl/memory_pool.hpp"

openkl::MemoryPoolOptions opts;
opts.thread_safe = true;
opts.debug = true;
opts.alignment = 64;
opts.exhaustion_policy = openkl::ExhaustionPolicy::FallbackRaw;

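// Same values as block_size=4096, num_blocks=1024 in the Python example.
openkl::MemoryPool pool(4096, 1024, opts);
void* ptr = pool.allocate();   // take one fixed-size block
pool.deallocate(ptr);          // return it to the pool

auto st = pool.stats();        // snapshot of the pool's telemetry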
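
To show why a fixed-size pool can allocate and free in O(1), here is a minimal Python sketch of a free-list pool. It is an illustration only, not OpenKL's implementation: the class name `FixedBlockPool` and its use of byte offsets in place of pointers are assumptions made for the example.

```python
class FixedBlockPool:
    """Toy fixed-size pool that hands out byte offsets into one buffer."""

    def __init__(self, block_size: int, num_blocks: int) -> None:
        self.block_size = block_size
        self.num_blocks = num_blocks
        # One contiguous backing buffer, carved into equal blocks.
        self.buffer = bytearray(block_size * num_blocks)
        # Stack of free block indices; pop/append at the end are O(1).
        self._free = list(range(num_blocks - 1, -1, -1))
        self._in_use = set()

    def allocate(self) -> int:
        if not self._free:
            raise MemoryError("pool exhausted")
        index = self._free.pop()
        self._in_use.add(index)
        return index * self.block_size  # byte offset of the block

    def deallocate(self, offset: int) -> None:
        index, remainder = divmod(offset, self.block_size)
        if remainder != 0 or index not in self._in_use:
            raise ValueError("offset was not allocated from this pool")
        self._in_use.remove(index)
        self._free.append(index)


pool = FixedBlockPool(block_size=4096, num_blocks=4)
a = pool.allocate()
b = pool.allocate()
pool.deallocate(a)
c = pool.allocate()  # reuses the block that was just freed
```

Because every block has the same size, no search is needed: allocation pops an index and deallocation pushes it back.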

C++: SlabPool

#include "openkl/slab_pool.hpp"

// default_classes(min, max, blocks_per_class): 64 B to 1 MiB, 512 blocks per class.
auto classes = openkl::SlabPool::default_classes(64, 1024 * 1024, 512);
openkl::SlabPool slab(classes);
void* p = slab.allocate(200);   // picks the best-fitting size class
slab.deallocate(p);
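
The "best-fitting class" lookup can be sketched in a few lines of Python. This assumes size classes that double from the minimum to the maximum; the README does not say how `default_classes` actually spaces them, so the helper `power_of_two_classes` is hypothetical.

```python
from bisect import bisect_left


def power_of_two_classes(min_size, max_size):
    """Build size classes by doubling (an assumed spacing, for illustration)."""
    classes = []
    size = min_size
    while size <= max_size:
        classes.append(size)
        size *= 2
    return classes


def class_for_size(classes, size):
    """Return the smallest class that fits `size`, or None if none does."""
    i = bisect_left(classes, size)
    return classes[i] if i < len(classes) else None


classes = power_of_two_classes(64, 1024 * 1024)
chosen = class_for_size(classes, 200)
wasted = chosen - 200  # internal fragmentation for this request
```

With doubling classes, a 200-byte request lands in the 256-byte class and wastes 56 bytes; finer-grained classes would trade less waste for more pools to manage.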

Python

import openkl

pool = openkl.MemoryPool(
    block_size=4096,
    num_blocks=1024,
    thread_safe=True,
    debug=True,
    alignment=64,
    exhaustion_policy=openkl.ExhaustionPolicy.Throw,
)

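ptr = pool.allocate()
pool.deallocate(ptr)

# Take one stats snapshot and read both fields from it.
stats = pool.stats()
print(stats.in_use, stats.free_blocks)

# Variable-size allocations go through a slab pool.
slab = openkl.make_default_slab(64, 1024 * 1024, 256, debug=True)
p = slab.allocate(128)
slab.deallocate(p)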
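
In Python code it is easy to forget `deallocate` when an exception interrupts the work in between. One possible pattern is a small context manager that relies only on the `allocate()` and `deallocate(ptr)` calls shown above. The helper `pooled_block` is not part of OpenKL, and `FakePool` is a stand-in so the sketch runs without the package installed.

```python
from contextlib import contextmanager


@contextmanager
def pooled_block(pool):
    """Allocate one block and always return it to the pool on exit."""
    ptr = pool.allocate()
    try:
        yield ptr
    finally:
        pool.deallocate(ptr)


class FakePool:
    """Stand-in with the same allocate()/deallocate() shape (demo only)."""

    def __init__(self):
        self.live = set()
        self._next = 1

    def allocate(self):
        ptr = self._next
        self._next += 1
        self.live.add(ptr)
        return ptr

    def deallocate(self, ptr):
        self.live.remove(ptr)


pool = FakePool()
try:
    with pooled_block(pool) as ptr:
        held_inside = len(pool.live)
        raise RuntimeError("simulated failure while holding a block")
except RuntimeError:
    pass
held_after = len(pool.live)  # the block was returned despite the error
```

The same wrapper should work with any object exposing those two methods, which is roughly the role `PooledPtr` plays on the C++ side.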

C API (no-exception style)

#include "openkl/c_api.h"

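/* Argument order presumably mirrors the options shown earlier (block size,
   block count, thread_safe, debug, alignment, exhaustion policy);
   check c_api.h to confirm. */
openkl_pool* pool = openkl_pool_create_ex(4096, 1024, 1, 1, 64, 0);
void* ptr = NULL;
openkl_error_code ec = openkl_pool_alloc_ex(pool, &ptr);
if (ec == OPENKL_OK) {
  openkl_pool_free_ex(pool, ptr);
}
/* On failure, error details are available through openkl_last_error(). */
openkl_pool_destroy(pool);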

API Overview

`MemoryPool`

Core methods:

allocate(), deallocate()
allocate_batch(n), deallocate_batch(ptrs)
try_allocate(out_ptr), try_deallocate(ptr)
allocate_owned()
fragmentation(), stats()
set_exhaustion_policy(), set_wait_timeout(), set_device_id()
reserve(), reset(), validate(), for_each_allocation()
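
To make the exhaustion policies concrete, the following Python sketch models three of them (Throw, Wait, FallbackRaw) on a toy pool. The semantics shown are assumptions inferred from the policy names, not taken from OpenKL's source, and `TinyPool` and `Policy` are hypothetical names. Grow is left out because the README describes it as a scaffold.

```python
import threading
from enum import Enum


class Policy(Enum):
    THROW = "throw"
    WAIT = "wait"
    FALLBACK_RAW = "fallback_raw"


class TinyPool:
    def __init__(self, num_blocks, policy, wait_timeout=0.05):
        self._free = list(range(num_blocks))
        self._cond = threading.Condition()
        self.policy = policy
        self.wait_timeout = wait_timeout

    def allocate(self):
        with self._cond:
            if not self._free and self.policy is Policy.WAIT:
                # Block until another thread frees a block or the timeout expires.
                self._cond.wait_for(lambda: bool(self._free),
                                    timeout=self.wait_timeout)
            if self._free:
                return ("pool", self._free.pop())
            if self.policy is Policy.FALLBACK_RAW:
                # Bypass the pool and hand back a one-off buffer.
                return ("raw", bytearray(16))
            raise MemoryError("pool exhausted")

    def deallocate(self, handle):
        kind, value = handle
        if kind == "pool":
            with self._cond:
                self._free.append(value)
                self._cond.notify()
        # Raw fallback buffers are simply dropped.


throwing = TinyPool(1, Policy.THROW)
throwing.allocate()
try:
    throwing.allocate()
    threw = False
except MemoryError:
    threw = True

fallback = TinyPool(1, Policy.FALLBACK_RAW)
fallback.allocate()
fallback_kind, _ = fallback.allocate()

waiting = TinyPool(1, Policy.WAIT, wait_timeout=1.0)
held = waiting.allocate()
# Another thread frees the block shortly after the second allocate starts waiting.
threading.Timer(0.05, waiting.deallocate, args=(held,)).start()
waited_kind, _ = waiting.allocate()
```

In this model a Wait that times out falls through to the same error as Throw, which is one reasonable reading of `set_wait_timeout()`.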

Metrics helpers:

capacity_bytes(), in_use_bytes(), free_bytes()
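
As a rough guide to how these three numbers relate, the sketch below assumes each is a plain multiple of the block size. Whether OpenKL also counts alignment padding or bookkeeping overhead is not stated here, and `pool_byte_metrics` is a hypothetical helper.

```python
def pool_byte_metrics(block_size, num_blocks, blocks_in_use):
    """Derive byte-level metrics for a fixed-size pool (assumed definitions)."""
    capacity = block_size * num_blocks
    in_use = block_size * blocks_in_use
    return {
        "capacity_bytes": capacity,
        "in_use_bytes": in_use,
        "free_bytes": capacity - in_use,
    }


metrics = pool_byte_metrics(block_size=4096, num_blocks=1024, blocks_in_use=256)
```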

CUDA-only methods:

allocate_async(stream)
deallocate_async(ptr, stream) (requires thread_safe = true)

`SlabPool`

allocate(size), allocate_exact(size), deallocate(ptr)
default_classes(min, max, blocks_per_class)
set_compaction_policy(...)
stats(), slab_stats()
class_for_size(size), rebalance(...) (rebalance is currently only a scaffold)

`MultiGPUPool`

allocate(device_id), allocate_auto(), allocate_on_best_fit(size_bytes)
deallocate(ptr, device_id), batch variants
set_affinity_policy(...), set_device_weight(...)
stats(device_id), stats_all(), stats_aggregate()
peer_access_supported(src, dst), copy_peer(...)
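
How an affinity policy might pick a device for `allocate_auto()` can be sketched as follows. The two policies shown (round robin and weighted least-loaded) are illustrative assumptions; the README does not list which policies OpenKL ships, and `Router` is a hypothetical name.

```python
from itertools import count


class Router:
    """Toy device router tracking in-use blocks per device."""

    def __init__(self, num_devices):
        self.in_use = [0] * num_devices
        self.weights = [1.0] * num_devices
        self._rr = count()

    def round_robin(self):
        return next(self._rr) % len(self.in_use)

    def least_loaded(self):
        # Load is in-use blocks divided by weight, so a higher weight
        # attracts more allocations. Ties go to the lowest device id.
        return min(range(len(self.in_use)),
                   key=lambda d: self.in_use[d] / self.weights[d])

    def allocate_auto(self, policy):
        device = policy()
        self.in_use[device] += 1
        return device


router = Router(num_devices=3)
rr_order = [router.allocate_auto(router.round_robin) for _ in range(4)]
router.weights[2] = 4.0  # e.g. a device with more memory
picked = router.allocate_auto(router.least_loaded)
```

After the four round-robin allocations device 0 holds two blocks and the others one each, so the weighted policy sends the next allocation to device 2.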

Error model

Exception classes for C++ high-level APIs.
Status + ErrorCode for no-exception workflows.
C API error retrieval via openkl_last_error().
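
The no-exception style can be illustrated with a small Python model of a status-returning pool. The names `Status` and `ErrorCode` come from the list above, but their fields and the specific codes used here are assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import IntEnum


class ErrorCode(IntEnum):
    OK = 0
    OUT_OF_MEMORY = 1      # assumed code name
    INVALID_POINTER = 2    # assumed code name


@dataclass(frozen=True)
class Status:
    code: ErrorCode
    message: str = ""

    def ok(self):
        return self.code == ErrorCode.OK


class StatusPool:
    """Toy pool whose try_* methods report failure instead of raising."""

    def __init__(self, num_blocks):
        self._free = list(range(num_blocks))
        self._live = set()

    def try_allocate(self):
        if not self._free:
            return Status(ErrorCode.OUT_OF_MEMORY, "pool exhausted"), None
        handle = self._free.pop()
        self._live.add(handle)
        return Status(ErrorCode.OK), handle

    def try_deallocate(self, handle):
        if handle not in self._live:
            return Status(ErrorCode.INVALID_POINTER,
                          "handle not owned by this pool")
        self._live.remove(handle)
        self._free.append(handle)
        return Status(ErrorCode.OK)


pool = StatusPool(num_blocks=1)
st1, h1 = pool.try_allocate()
st2, h2 = pool.try_allocate()   # pool is empty: reports an error, no raise
st3 = pool.try_deallocate(h1)
st4 = pool.try_deallocate(h1)   # double free is reported, not raised
```

Callers check the returned status at each step, which matches the shape of the `try_allocate(out_ptr)` and C API `_ex` calls.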

Platform / Backend behavior

CUDA enabled: device allocations via CUDA runtime.
CUDA disabled: host-backed aligned allocation stubs.
Backend abstraction exists in source (backend.hpp + implementations):
- host backend
- CUDA backend
- HIP backend scaffold (stub)

Platform helpers:

get_os(), get_os_name()
cuda_available(), get_device_count(), get_device_name(device_id)

Build and Test

C++

./build.sh

Manual:

mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build .
ctest --output-on-failure

Useful CMake options:

-DOPENKL_USE_CUDA=OFF
-DOPENKL_BUILD_PYTHON=OFF
-DOPENKL_BUILD_TESTS=OFF
-DOPENKL_BUILD_BENCHMARKS=OFF
-DOPENKL_BUILD_SHARED=ON

Python

PYTHONPATH=python python tests/test_all.py

or

PYTHONPATH=python python -m pytest tests/test_all.py -v

Note: build and test with the same Python interpreter. A native extension compiled for one interpreter may fail to import under another.
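
A quick way to compare the build and test environments is to print the interpreter path, version, and extension suffix in each one. This uses only the standard library and makes no assumptions about OpenKL itself.

```python
import sys
import sysconfig

# Run this in both the build and the test environment; the values should match.
interpreter = sys.executable
version = "{}.{}".format(*sys.version_info[:2])
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")  # tag baked into extension filenames
print(interpreter, version, ext_suffix)
```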

Benchmark

From build/:

./bench_alloc

When CUDA is not enabled, the CUDA baseline is skipped and the pool path still runs.

Project Structure

OpenKL/
├── include/openkl/      # Public C++ headers
├── src/                 # C++ implementations
├── python/              # Python package + pybind bindings
├── tests/               # C++ and Python tests
├── benchmarks/          # Benchmarks
├── build.sh             # Unix/macOS build+test helper
├── build_windows.ps1    # PowerShell build+test helper
├── build_windows.bat    # CMD build+test helper
└── CMakeLists.txt

License

Apache-2.0. See LICENSE.

Credits

Developed by Aksel Aghajanyan — Aqwel AI.
GitHub: Aksel588
