Skip to content

habedi/bptree

B+ tree

Bptree

Tests Lints Benchmarks Code Coverage Zig License Release

A B+ tree implementation in C


Bptree is a lightweight single-header B+ tree implementation written in C. It can be thought of as an in-memory ordered map where keys are stored based on the ordering defined by some comparator function, and values can be any type of data. This library started as a personal project to learn about B+ trees and their internal details. It now provides a fully functional implementation for in-memory storage with a simple API covering the core operations of a typical B+ tree.

The B+ tree is a variant of the B-tree data structure. It is a self-balancing n-ary tree (each node in the tree can have at most n children) like B-tree, but unlike B-tree, all values are stored in the leaf nodes. The internal nodes only store keys to guide the search process. B+ trees are widely used in databases and file systems due to their efficiency in providing IO-optimized data access patterns. Note that B+ trees are not suitable for random access like for building hash tables. They are mainly used for ordered data access and range queries as part of a larger system.

Features

  • Lightweight single-header C library (see bptree.h)
  • Supports numeric and string keys as well as custom value types
  • Supports insertion, deletion, as well as point and range queries
  • Forward iterator with find, lower_bound, and upper_bound
  • Allows the user to manage memory for values
  • Compatible with C11 or newer
  • Builds with Make (with a C compiler like GCC or Clang) or Zig build system

Getting Started

Download the bptree.h file and include it in your projects.

Define BPTREE_IMPLEMENTATION in one of your C source files before including the header:

// Add these lines to one of your C source files (e.g., main.c)
#define BPTREE_IMPLEMENTATION
#include "bptree.h"

See the Key and Value Configuration section below for how to customize key and value types.

Examples

File Description
example.c Example usages of the B+ tree API.

To compile and run the example(s), use the make example command.

Documentation

API documentation can be generated using Doxygen. To generate the documentation, use the make docs command and then open the docs/html/index.html file in a web browser.

API Summary

Function Return Type Description
bptree_create bptree * Creates a new B+ tree with specified max_keys, comparator function (or NULL for default), and debug flag. Returns pointer to the tree or NULL on failure.
bptree_free void Frees the tree structure and all its internal nodes. (This function does not free memory for values stored in the tree.)
bptree_put bptree_status Inserts a key-value pair. The key must not already exist in the tree.
bptree_get bptree_status Retrieves the value associated with a key via an out-parameter.
bptree_contains bool Checks if a key exists in the tree.
bptree_remove bptree_status Deletes a key-value pair. Performs node rebalancing (key borrowing and node merging) if necessary.
bptree_get_range bptree_status Gets values for keys within [start, end] (inclusive) via out-parameters for the results array and count. The caller must free the results array using bptree_free_range_results.
bptree_free_range_results void Frees the array allocated by bptree_get_range.
bptree_get_stats bptree_stats Returns tree statistics, including key count, height, and node count of the tree.
bptree_count int Returns the number of key-value pairs in the tree. This is O(1).
bptree_height int Returns the current height of the tree. This is O(1).
bptree_max_keys int Returns the maximum number of keys per node (as configured at creation). This is O(1).
bptree_clear void Removes all elements and resets the tree to an empty state. The tree can be reused. Values are not freed.
bptree_check_invariants bool Checks structural correctness of the B+ tree (e.g., key ordering, node fill levels, and leaf depth).
Iterator Functions
Function Return Type Description
bptree_iter_begin bptree_iter Returns an iterator to the first (smallest-key) element. Invalid if the tree is empty.
bptree_iter_valid bool Checks whether an iterator points to a valid element (not past-end).
bptree_iter_next void Advances the iterator to the next element in key order.
bptree_iter_key bptree_key_t Returns the key at the current iterator position.
bptree_iter_value bptree_value_t Returns the value at the current iterator position.
bptree_iter_equal bool Checks whether two iterators point to the same position. Two invalid iterators are considered equal.
bptree_iter_find bptree_iter Returns an iterator to the element with the given key, or an invalid iterator if the key is not found.
bptree_iter_lower_bound bptree_iter Returns an iterator to the first element with a key >= the given key.
bptree_iter_upper_bound bptree_iter Returns an iterator to the first element with a key > the given key.
Type Description
bptree Opaque handle for a B+ tree. Access state through the accessor functions listed above.
bptree_stats The data type used for tree statistics (including key count, tree height, and node count).
bptree_iter Forward iterator for traversing the tree in key order.
bptree_key_t The data type used for keys (configurable; default: int64_t).
bptree_value_t The data type used for values (configurable; default: void *).
bptree_status Enum returned by most API functions showing success or failure (types) of operations.

Note

The comparator function should have a signature similar strcmp where it returns a negative value if the first argument should come before the second (e.g., a < b), zero if they are equal, and a positive value if the first argument should come after the second (e.g., a > b). The B+ tree implementation relies on this specific behavior for correctly ordering keys in nodes and traversing the tree during search, insertion, and deletion operations. The default comparators provided in your code (bptree_default_compare for numeric types using < and >, and the one using memcmp for strings) adhere to this convention.

Status Codes

// Status codes returned by bptree functions (defined in include/bptree.h)
typedef enum {
    BPTREE_OK = 0,             // Operation succeeded
    BPTREE_DUPLICATE_KEY,      // Key already exists during bptree_put
    BPTREE_KEY_NOT_FOUND,      // Key not found for get and remove operation
    BPTREE_ALLOCATION_FAILURE, // Memory allocation (e.g., malloc or aligned_alloc) failed
    BPTREE_INVALID_ARGUMENT,   // An invalid function argument provided
    BPTREE_INTERNAL_ERROR      // An unexpected internal state or error occurred
} bptree_status;

Key and Value Configuration

The key and value types, as well as linkage (static or dynamic), can be customized by defining these macros before including bptree.h where BPTREE_IMPLEMENTATION is defined:

Macro Description Default
BPTREE_NUMERIC_TYPE Use a specific integer or floating-point type for keys (if BPTREE_KEY_TYPE_STRING is not defined). int64_t
BPTREE_KEY_TYPE_STRING Define this macro (no value needed) to use fixed-size string keys instead of numeric keys. Not defined
BPTREE_KEY_SIZE Needed if BPTREE_KEY_TYPE_STRING is defined. Specifies the exact size (bytes) of the string key struct. Not defined
BPTREE_VALUE_TYPE Specifies the data type for values stored in the tree. void *
BPTREE_STATIC Define this macro (no value needed) along with BPTREE_IMPLEMENTATION to give the implementation static linkage. Not defined
Type Description Default
bptree_key_t The C type representing a key. Resolved based on the BPTREE_... macros above. BPTREE_NUMERIC_TYPE or char[BPTREE_KEY_SIZE]
bptree_value_t The C type representing a value. Resolved from BPTREE_VALUE_TYPE. void *

Important

The tree only manages memory for its internal nodes. If the bptree_value_t items stored are pointers (e.g., void * or struct my_data *), the caller is responsible for managing the memory pointed to by these values (allocation and deallocation). Consequently, bptree_free only frees the tree's internal structure, not any external data referenced by the stored values.

Examples

To use uint32_t keys and struct record * values:

#define BPTREE_NUMERIC_TYPE uint32_t
#define BPTREE_VALUE_TYPE struct record *
#define BPTREE_IMPLEMENTATION
#include "bptree.h"

To use fixed-size string keys (like 32-character strings) and store integer IDs as values:

#define BPTREE_KEY_TYPE_STRING
#define BPTREE_KEY_SIZE 32
#define BPTREE_VALUE_TYPE intptr_t // Store integers directly if they fit
#define BPTREE_IMPLEMENTATION
#include "bptree.h"

Tests and Benchmarks

To run the tests and benchmarks, use make test and make bench.

File Description
test_bptree.c Unit tests for the B+ tree API.
bench_bptree.c Benchmarks for some of the operations supported by the B+ tree.

Using Zig Build System

Alternatively, you can use the Zig build system to build and run the tests, benchmarks, and the example program.

zig build test      # Build and run unit tests
zig build bench     # Build and run benchmarks
zig build example   # Build and run the example

Contributing

See CONTRIBUTING.md for details on how to make a contribution.

License

Bptree is licensed under the MIT License (see LICENSE).