**Warning: Proof of Concept & Research Only**
This project is strictly for educational and research purposes. It is designed to demonstrate the implementation of distributed consensus, vector indexing, and bi-temporal data management in Rust.
It is not intended for production use, and active development is not guaranteed. There are no warranties regarding data integrity or security.
ChronosDB is an experimental high-performance distributed database built in Rust. It uniquely combines Vector Similarity Search (for AI applications) with Bi-Temporal Data Management (Time Travel) and Distributed Consensus (Raft).
It solves the problem of "lossy" AI memory by ensuring that every vector embedding ever written is preserved in a strictly append-only history, reachable via time-travel queries.
- Distributed Consensus (Raft): Built on `openraft`, ensuring linearizable writes and automatic leader election. Supports dynamic membership changes (adding/removing nodes) without downtime.
- Vector Search Engine: Custom implementation of an HNSW (Hierarchical Navigable Small World) graph for approximate nearest-neighbor search. Supports SIMD-optimized Euclidean and Cosine distance metrics.
- Time Travel (History): Every record maintains a `valid_time` and a `tx_time`. Data is never overwritten; it is appended. You can query the full history of any object using the `HISTORY` command.
- Disk-Based Architecture: Uses memory-mapped files (`mmap`) for managing storage segments. Supports both "Strict Durability" (`fsync`) and "High Throughput" (async) modes based on hardware detection.
- Zero-Copy Serialization: Utilizes `rkyv` to guarantee data layout alignment and zero-copy deserialization from disk.
- Binary Snapshots: Uses `rkyv` for compact, high-speed binary snapshots, significantly reducing storage size and recovery time compared to JSON.
- Probabilistic Filtering: Implements Bloom filters and SeaHash to minimize disk reads for non-existent keys.
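The probabilistic-filtering idea above can be sketched as a minimal Bloom filter. This sketch uses the standard library's `DefaultHasher` in place of SeaHash, and the sizes and names are illustrative assumptions, not ChronosDB's actual parameters:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Bloom filter: k hash probes over an m-bit array.
struct BloomFilter {
    bits: Vec<bool>,
    k: u64,
}

impl BloomFilter {
    fn new(m: usize, k: u64) -> Self {
        BloomFilter { bits: vec![false; m], k }
    }

    // Derive k bit indices by hashing (key, probe-number) pairs.
    fn indices<'a>(&'a self, key: &'a str) -> impl Iterator<Item = usize> + 'a {
        (0..self.k).map(move |i| {
            let mut h = DefaultHasher::new();
            (key, i).hash(&mut h);
            (h.finish() as usize) % self.bits.len()
        })
    }

    fn insert(&mut self, key: &str) {
        for idx in self.indices(key).collect::<Vec<_>>() {
            self.bits[idx] = true;
        }
    }

    /// `false` means "definitely absent": the disk read can be skipped.
    /// `true` means "possibly present" and requires a real lookup.
    fn may_contain(&self, key: &str) -> bool {
        self.indices(key).all(|idx| self.bits[idx])
    }
}
```

Because a Bloom filter has no false negatives, a negative answer lets the engine skip the segment scan entirely; only the (tunable) false-positive rate costs an occasional wasted read.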
ChronosDB uses an immutable, append-only log structure to ensure crash safety and historical retention.
- Segments: Data is written to 64MB memory-mapped file segments.
- Records: Each entry contains a 128-dimensional vector, a binary payload, and temporal metadata.
- Safety: The system detects concurrency levels and adjusts `fsync` behavior automatically via system profiling.
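The append-only, bi-temporal record model above can be sketched in a few lines. Field names and the in-memory `Log` type are assumptions for illustration, not ChronosDB's on-disk format:

```rust
/// Illustrative shape of an append-only record: a fixed-dimension
/// embedding plus bi-temporal metadata.
#[derive(Clone, Debug, PartialEq)]
struct Record {
    key: u64,
    vector: [f32; 128],   // embedding, as described above
    payload: Vec<u8>,     // opaque binary value
    valid_time: u64,      // when the fact holds in the modeled world
    tx_time: u64,         // when the fact was recorded by the database
}

/// An update never mutates in place: it appends a new version.
struct Log {
    records: Vec<Record>,
}

impl Log {
    fn append(&mut self, rec: Record) {
        self.records.push(rec);
    }

    /// Latest version of `key` visible as of a transaction time
    /// (the essence of a time-travel query).
    fn get_as_of(&self, key: u64, tx_time: u64) -> Option<&Record> {
        self.records
            .iter()
            .filter(|r| r.key == key && r.tx_time <= tx_time)
            .max_by_key(|r| r.tx_time)
    }
}
```

Because old versions are never overwritten, `get_as_of` can reconstruct any historical state by simply filtering on `tx_time`.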
Vector indexing is handled by a persistent HNSW graph that updates in real-time.
- Nodes: Graph nodes are stored in a separate optimized index file.
- Search: Uses a priority queue-based search (beam search) to traverse graph layers.
- Recall: Tuned with `M=16` and `ef_construction=100` for high recall on random data distributions.
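The SIMD-optimized distance metrics mentioned above can be sketched in scalar form; the real engine would vectorize these loops, and the function names here are illustrative:

```rust
/// Scalar Euclidean (L2) distance between two equal-length vectors.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine distance = 1 - cosine similarity. Identical directions
/// give 0.0; orthogonal vectors give 1.0.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (na * nb)
}
```

HNSW only needs a consistent ordering from the metric, so either function can back the priority-queue (beam) search over graph layers.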
- Replication: Logs are replicated to a quorum of nodes via the Raft protocol.
- Snapshotting: Supports auto-snapshotting based on log length (default: every 20 logs) to compact the Write Ahead Log (WAL). Snapshots are serialized in an optimized binary format.
- Network protocol: The Raft control plane uses direct Tokio TCP with a binary frame: `[4-byte BE len] [1-byte route] [rkyv payload bytes]`. Routes: `1` for `vote`, `2` for `append_entries`, `3` for `install_snapshot`. Payloads are `rkyv`-aligned.
- Client data API (main query transport): Writes include an operation flag byte for `RETURNING` in the payload (e.g. `0x01` for `RETURNING id`). Responses are always `[1-byte status][4-byte LE payload_len][payload_bytes]`.
- Control API: Exposes an HTTP admin API for bootstrapping and membership operations (default port 20002).
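A minimal sketch of the two wire formats described above. Function names are illustrative, and whether the Raft length prefix covers the route byte is an assumption (here it counts the payload only):

```rust
/// Build a Raft control-plane frame: [4-byte BE len][1-byte route][payload].
/// Assumption for this sketch: the length prefix counts only the payload.
fn encode_raft_frame(route: u8, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(5 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.push(route); // 1 = vote, 2 = append_entries, 3 = install_snapshot
    frame.extend_from_slice(payload);
    frame
}

/// Parse a client data-API response:
/// [1-byte status][4-byte LE payload_len][payload_bytes].
/// Returns None if the buffer is truncated.
fn parse_response(buf: &[u8]) -> Option<(u8, &[u8])> {
    if buf.len() < 5 {
        return None;
    }
    let status = buf[0];
    let len = u32::from_le_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    buf.get(5..5 + len).map(|payload| (status, payload))
}
```

Note the asymmetry in the source description: the Raft frame uses a big-endian length, while the client response length is little-endian.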
Prerequisites:
- Rust
- Cargo
Build:
```bash
# Clone the repository
git clone https://github.com/SSL-ACTX/chronos-db.git
cd chronos-db
cargo build --release
```

Run a Single Node:

```bash
# Runs on TCP 9000, Raft API 20001
./target/release/chronos --node-id 1
```
ChronosDB includes a custom SQL-like parser and a dedicated CLI client. The full usage guide is in `docs/usage.md`.

```bash
cargo run --bin chronos-cli
```

Supported commands:
- `INSERT` / `UPDATE` / `DELETE` (with optional `RETURNING id`), all routed through the Raft write path.
- `SELECT` / vector search.
- `GET` by ID.
- `HISTORY` and `AS OF` time-travel.

For detailed examples and commands, see `docs/usage.md`.
ChronosDB uses a custom binary protocol for data transmission and a dedicated HTTP control path for cluster bootstrap/membership.
A helper script test_cluster.py is provided to bootstrap a 3-node cluster locally.
- Start Nodes: Nodes must be started with unique IDs and ports.
- Node 1: Client TCP 9000, Raft TCP 20001, Control HTTP 20002
- Node 2: Client TCP 9001, Raft TCP 20002, Control HTTP 20003
- Node 3: Client TCP 9002, Raft TCP 20003, Control HTTP 20004
- Initialize Leader: Bootstrapping can be automatic on node startup:

```bash
./target/release/chronos --node-id 1 --addr 127.0.0.1:9000 --raft-port 20001 --control-port 20002 --bootstrap
```

Or with an explicit control API call:

```bash
curl -X POST http://127.0.0.1:20002/init
```

- Add Learners & Promote: Add Node 2 and Node 3 to the cluster via the control API:

```bash
curl -X POST -H "Content-Type: application/json" -d '{"id":2,"addr":"127.0.0.1:20002","auto_promote":true}' http://127.0.0.1:20002/add-learner
curl -X POST -H "Content-Type: application/json" -d '{"id":3,"addr":"127.0.0.1:20003","auto_promote":true}' http://127.0.0.1:20002/add-learner
curl -X POST -H "Content-Type: application/json" -d '[1,2,3]' http://127.0.0.1:20002/change-membership
```

The cluster automatically creates snapshots when the log grows too large. This allows new nodes to catch up by downloading a compressed state rather than replaying the entire history.
- Trigger: Default policy creates a snapshot every 20 logs.
- Recovery: Nodes automatically restore HNSW graphs and Bloom filters from snapshots upon restart.
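The "every 20 logs" trigger can be sketched as a counter that fires and resets at the threshold; the type and method names here are illustrative, not ChronosDB's API:

```rust
/// Snapshot policy sketch: trigger a snapshot once the number of log
/// entries since the last snapshot reaches a threshold (default 20).
struct SnapshotPolicy {
    logs_since_snapshot: u64,
    threshold: u64,
}

impl SnapshotPolicy {
    /// Called after each log append; returns true when a snapshot
    /// (and WAL compaction) should run now.
    fn on_log_appended(&mut self) -> bool {
        self.logs_since_snapshot += 1;
        if self.logs_since_snapshot >= self.threshold {
            self.logs_since_snapshot = 0; // log is compacted here
            true
        } else {
            false
        }
    }
}
```

Counting entries (rather than bytes) keeps the policy deterministic across nodes, which matters when a lagging follower decides whether to replay the log or fetch a snapshot.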
The project includes Python integration tests for verifying cluster consistency and snapshot logic.
- Cluster Test: `python3 test_cluster.py` verifies Write -> Kill Leader -> Election -> Write -> Read Consistency.
- Snapshot Test: `python3 test_snapshot.py` inserts data past the log limit, triggers a snapshot, adds a new node, and verifies that the new node hydrated correctly from the binary snapshot.
- CLI Integration (buggy): `python3 test_cli.py` verifies the full SQL grammar, vector padding, and CRUD lifecycle.
ChronosDB automatically detects environment resources via a System Profile:
| Hardware | Mode | Behavior |
|---|---|---|
| 1 Core | Potato Mode | No `fsync` (async durability), high Raft timeout (1 s). |
| < 4 Cores | Standard | Strict durability, 500 ms heartbeat. |
| Server | Server Mode | Strict durability, 250 ms heartbeat, max worker threads. |
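The table above maps directly to a core-count check; a minimal sketch, assuming the "Server" tier starts at 4 cores (the source only gives `1` and `< 4` explicitly):

```rust
#[derive(Debug, PartialEq)]
enum Mode {
    Potato,   // 1 core: async durability, relaxed timeouts
    Standard, // 2-3 cores: strict durability, 500 ms heartbeat
    Server,   // 4+ cores (assumption): strict durability, 250 ms heartbeat
}

/// Pick a profile from the detected core count, following the table above.
fn select_mode(cores: usize) -> Mode {
    match cores {
        0 | 1 => Mode::Potato,
        2..=3 => Mode::Standard,
        _ => Mode::Server,
    }
}
```

Profiling once at startup keeps the durability/latency trade-off out of user configuration: the same binary behaves sensibly on a single-core VPS and a many-core server.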
Author: Seuriin (SSL-ACTX)