Skip to content

joaoh82/rust_sqlite

Repository files navigation

Rust-SQLite (SQLRite)

Build Status dependency status Coverage Status Maintenance MIT licensed

Rust-SQLite, aka SQLRite , is a simple embedded database modeled off SQLite, but developed with Rust. The goal is get a better understanding of database internals by building one.

What I cannot create, I do not understand. — Richard Feynman

key value
Design and discussions about direction
of the project going on over here.
Show us your support by buying us a coffee,
so we can keep building cool stuff. (coming soon)
Documentation (coming soon)
Come and Chat about databases with us sqlritedb discord server

Read the series of posts about it:

What would SQLite look like if written in Rust?

The SQLite Architecture

CREATE TABLE and INSERT Statements

asciicast

Desktop app

A cross-platform Tauri 2.0 + Svelte 5 desktop GUI ships alongside the REPL (see desktop/ and docs/desktop.md for details).

SQLRite Desktop

Launch it with cd desktop && npm install && npm run tauri dev. The header's New… / Open… / Save As… buttons cover the file lifecycle; the query editor has a live line-number gutter, ⌘/ (Ctrl+/) SQL comment toggle, and selection-aware Run (highlight a statement to run just that one).

Developer guide

In-depth documentation lives under docs/. Start at docs/_index.md — it navigates to:

Requirements

Before you begin, ensure you have met the following requirements:

Usage

Build and launch the REPL:

cargo run

You'll drop into a REPL connected to a transient in-memory database. On-disk persistence (.open, .save) is coming in Phase 2.

SQLRite - 0.1.0
Enter .exit to quit.
Enter .help for usage hints.
Connected to a transient in-memory database.
Use '.open FILENAME' to reopen on a persistent database.
sqlrite> CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER);
sqlrite> INSERT INTO users (name, age) VALUES ('alice', 30);
sqlrite> INSERT INTO users (name, age) VALUES ('bob', 25);
sqlrite> SELECT name FROM users WHERE age > 25 ORDER BY age DESC LIMIT 5;
+-------+
| name  |
+-------+
| alice |
+-------+
SELECT Statement executed. 1 row returned.
sqlrite> UPDATE users SET age = age + 1 WHERE name = 'bob';
sqlrite> DELETE FROM users WHERE age < 30;

Supported SQL

Statement Features
CREATE TABLE PRIMARY KEY, UNIQUE, NOT NULL; duplicate-column detection; types INTEGER/INT/BIGINT/SMALLINT, TEXT/VARCHAR, REAL/FLOAT/DOUBLE/DECIMAL, BOOLEAN
CREATE [UNIQUE] INDEX single-column, named indexes; IF NOT EXISTS supported; persists as a dedicated cell-based B-Tree
INSERT INTO auto-ROWID for INTEGER PRIMARY KEY; UNIQUE enforcement via indexes; clean type errors (no panics)
SELECT * or column list, WHERE, ORDER BY col [ASC|DESC], LIMIT n. WHERE col = literal probes an index when one exists
UPDATE multi-column SET, WHERE; UNIQUE + type enforcement; arithmetic in assignments (SET age = age + 1)
DELETE WHERE predicate or full-table delete

Expressions in WHERE and SET:

  • Comparisons — =, <>, <, <=, >, >=
  • Logical — AND, OR, NOT
  • Arithmetic — +, -, *, /, % (integer ops stay integer; any REAL promotes to f64; divide/modulo by zero is a clean error)
  • String concat — ||
  • Literals — numbers, single-quoted strings, booleans, NULL; parentheses

Not yet implemented: joins, subqueries, GROUP BY / aggregates, DISTINCT, LIKE / IN / IS NULL, expressions in the projection list, OFFSET. See the Roadmap.

Meta commands

Command Status
.help working
.exit working
.open FILENAME working — opens an existing .sqlrite file or creates a fresh one; auto-save is enabled from this point on
.save FILENAME working — explicit flush (rarely needed once .open is in play)
.tables working
.read FILENAME later
.ast QUERY later

Roadmap

The project is staged in phases, each independently shippable. A finished phase is committed to main before the next one starts.

Phase 0 — Modernization (done)

  • Rust edition 2024, resolver 3, stable toolchain pinned via rust-toolchain.toml
  • Upgrade every dependency to current majors: rustyline 18, clap 4, sqlparser 0.61, thiserror 2, env_logger 0.11, prettytable-rs 0.10, serde / log latest

Phase 1 — SQL execution surface (done)

  • CLI + rustyline REPL with history, syntax highlighting, bracket matching, line validation
  • Parsing via sqlparser (SQLite dialect); typed SQLRiteError via thiserror
  • CREATE TABLE with PRIMARY KEY, UNIQUE, NOT NULL; duplicate-column detection; in-memory BTreeMap indexes on PK/UNIQUE columns
  • INSERT with auto-ROWID for INTEGER PRIMARY KEY, UNIQUE enforcement, NULL padding for missing columns
  • SELECT — projection, WHERE, ORDER BY, LIMIT (single-table, no joins yet)
  • UPDATE ... SET ... WHERE ... with type + UNIQUE enforcement at write time
  • DELETE ... WHERE ...
  • Expression evaluator: =/<>/</<=/>/>=, AND/OR/NOT, arithmetic +/-/*///%, string concat ||, NULL-as-false in WHERE
  • Replaced every .unwrap() panic on malformed input with typed errors

Phase 2 — On-disk persistence (done)

  • Single-file database format — one .sqlrite file per database
  • Fixed 4 KiB pages; page 0 carries a header (magic SQLRiteFormat\0\0\0, format version, page size, page count, schema-root page)
  • Typed payload pages (schema-root / table-data / overflow) chained via next-page pointers; payloads up to 4089 bytes before spilling into overflow
  • Schema catalog + per-table state serialized via bincode 2.0
  • .open FILENAME — create-or-load a database file
  • .save FILENAME — explicit flush of the in-memory DB to disk (auto-save arrives with Phase 3's pager)
  • .tables — list tables in the current database
  • Header written last during save, so a mid-save crash leaves the file recognizably unopenable

Phase 3 — On-disk B-Tree + auto-save pager (done)

  • 3a — Auto-save: every committing SQL statement (CREATE / INSERT / UPDATE / DELETE) against a file-backed DB auto-flushes; .save is now a rare manual flush
  • 3b — Pager abstraction: long-lived Pager holding a byte snapshot of every page on disk plus a staging area for the next commit; commit diffs staged vs. snapshot and writes only pages whose bytes actually changed; file truncates when the page count shrinks
  • 3c — Cell-based pages (format v2): rows stored as length-prefixed cells (tag-then-value encoding with null bitmap) in TableLeaf pages carrying a SQLite-style slot directory; oversized cells spill into an overflow page chain; the schema catalog itself is now a real table named sqlrite_master stored in the same cell format
  • 3d — B-Tree: InteriorNode pages above the existing leaves; save rebuilds the tree bottom-up from the in-memory sorted rows; open descends to the leftmost leaf and scans forward via the sibling next_page chain. Interior cells share the cell_length | kind_tag | body prefix with local/overflow cells so binary search over slot directories works uniformly. Cursor / lazy-load reads deferred to Phase 5.
  • 3e — Secondary indexes (format v3): UNIQUE/PRIMARY KEY columns get an auto-index named sqlrite_autoindex_<table>_<col> at CREATE TABLE time; CREATE [UNIQUE] INDEX name ON table (col) adds explicit single-column indexes. sqlrite_master gains a type column distinguishing 'table' rows from 'index' rows. Each index persists as its own cell-based B-Tree using KIND_INDEX cells (rowid, value). Executor optimizer probes indexes for WHERE col = literal (and literal = col) instead of full-scanning.

Phase 2.5 — Tauri 2.0 desktop app (done)

  • Engine split into lib + bin (pulled forward from Phase 5): sqlrite is now both a library and a binary. The Tauri app and the eventual WASM / FFI targets all import the engine as a regular Rust dependency.
  • Thread-safe engine: Table's row storage switched from Rc<RefCell<_>> to Arc<Mutex<_>> so Database is Send + Sync and can live inside Tauri's shared state. The serde derives on storage types (dead since 3c.5) dropped at the same time.
  • Workspace: root Cargo.toml is now a Cargo workspace; desktop/src-tauri/ is the second member.
  • Tauri 2.0 backend: four commands (open_database, list_tables, table_rows, execute_sql) wrap the engine; results are tagged enums shipped to the UI via the JSON IPC bridge.
  • Svelte 5 frontend: dark-themed three-pane layout — header with "Open…" file picker, sidebar with table list + schema, query editor with Cmd/Ctrl+Enter to run, result grid with sticky header.

Phase 4 — Durability and concurrency (in progress)

  • 4a — Exclusive file lock: Pager::open / ::create takes an OS advisory lock (fs2::try_lock_exclusive); a second process on the same file gets a clean "already in use" error. Lock releases automatically when the Pager drops.
  • 4b — Write-Ahead Log (<db>.sqlrite-wal) file format + frame codec: 32-byte WAL header (magic / version / page size / salt / checkpoint seq), 4112-byte frames carrying (page_num, commit_page_count, salt, checksum, body). Rolling-sum checksum. Torn-write recovery: corrupt or partial trailing frames are silently truncated at the boundary. Standalone module; not wired yet.
  • 4c — WAL-aware Pager: Pager::open / ::create now own both the main file and its -wal sidecar. Reads resolve staged → wal_cache → on_disk with a page-count bounds check; commits append a WAL frame per dirty page plus a final commit frame carrying the new page 0 (encoded header). The main file stays frozen between checkpoints — reopening replays the WAL and the decoded page-0 frame overrides the (stale) main-file header.
  • 4d — Checkpointer: Pager::checkpoint() folds WAL-resident pages into the main file, rewrites the header, truncates the tail, fsyncs, then Wal::truncates the sidecar (rolling the salt). Auto-fires from commit past a 100-frame threshold; also callable explicitly. Crash-safe and idempotent — a crash mid-checkpoint leaves the WAL as the source of truth, so reads stay correct and a retry rewrites the same bytes.
  • 4e — Multi-reader / single-writer: new AccessMode { ReadWrite, ReadOnly } drives lock mode. Pager::open_read_only takes a shared lock (flock(LOCK_SH)) on both the main file and the WAL; open / create stay exclusive. Multiple RO openers coexist; any writer excludes all readers (POSIX flock semantics — "multiple readers OR one writer", not both). Read-only Pagers reject writes with a typed error. REPL gained a --readonly flag; library exposes sqlrite::open_database_read_only. Read marks aren't needed under flock — a writer can't coexist with readers, so the checkpointer never pulls frames out from under them.
  • 4f — Transactions (BEGIN / COMMIT / ROLLBACK): BEGIN snapshots the in-memory tables (Table::deep_clone) and suppresses auto-save; every subsequent mutation stays in memory. COMMIT flushes accumulated changes in one save_database call (one WAL commit frame for the whole transaction). ROLLBACK restores the pre-BEGIN snapshot. Nested begins, orphan commits/rollbacks, and BEGIN on read-only DBs all return typed errors. Errors mid-transaction keep the transaction open so the caller can explicitly recover.

Phase 5 — Embedding surface: public API + language SDKs

  • 5a — Public Rust API (partial): Connection / Statement / Rows / Row / OwnedRow / FromValue / Value at the crate root; structured row return from the executor; examples/rust/quickstart.rs runnable via cargo run --example quickstart. Parameter binding + cursor abstraction deferred to 5a.2.
  • 5b — C FFI shim: new sqlrite-ffi/ workspace crate ships libsqlrite_c.{so,dylib,dll} + a cbindgen-generated sqlrite.h. Opaque-pointer types, thread-local last-error, split sqlrite_execute (DDL/DML/transactions) vs sqlrite_query/sqlrite_step (SELECT iteration). Runnable examples/c/hello.c + Makefile (cd examples/c && make run).
  • 5c — Python SDK: new sdk/python/ workspace crate via PyO3 (abi3-py38) + maturin. DB-API 2.0-inspired — sqlrite.connect(path)Cursor.execute / fetchall / iteration, context-manager support (commit-on-clean-exit / rollback-on-exception), read-only connections, 16-test pytest suite. examples/python/hello.py runs after maturin develop. PyPI publish lands in Phase 6e.
  • 5d — Node.js SDK: new sdk/nodejs/ workspace crate via napi-rs (N-API v9, Node 18+). Prebuilt .node binaries — no node-gyp install step. better-sqlite3-style sync API (new Database(path), stmt.all() / get() / iterate() returning row objects), auto-generated TypeScript defs, 11 node:test integration tests. examples/nodejs/hello.mjs runs after npm install && npm run build. npm publish lands in Phase 6e.
  • 5e — Go SDK: new sdk/go/ module at github.com/joaoh82/rust_sqlite/sdk/go; cgo-wired against libsqlrite_c from Phase 5b. Implements the full database/sql/driver surface so users get the standard-library experience (sql.Open("sqlrite", path), db.Query/Exec/Begin, rows.Scan(&id, &name)). 9-test go test integration suite. examples/go/hello.go runs after cargo build --release -p sqlrite-ffi. Go modules publish via sdk/go/v* git tags in Phase 6e.
  • 5f — Rust crate polish (deferred — Phase 6c companion): crate metadata, docs.rs config, prep for cargo publish. Deferred to land alongside the actual publish workflow.
  • 5g — WASM build: new sdk/wasm/ crate via wasm-bindgen; engine runs entirely in a browser tab. Feature-gated root crate (cli + file-locks optional, both default-on) so WASM disables fs2 / rustyline / clap / env_logger cleanly. Database class with exec/query/columns/inTransaction; rows as plain JS objects in projection order. ~1.8 MB wasm / ~500 KB gzipped. Three wasm-pack targets (web/bundler/nodejs). examples/wasm/ ships a self-contained HTML SQL console.
  • Code examples for every language under examples/{rust,python,nodejs,go,wasm}/

Phase 6 — Release engineering + CI/CD Lockstep versioning — one dispatch bumps every product to the same vX.Y.Z. Two-workflow design: release-pr.yml opens a Release PR with the version bumps (human reviews + merges), then release.yml fires on merge to tag + publish everything. Trusted-publishing via OIDC for PyPI + npm (no long-lived tokens). Full plan: docs/release-plan.md.

  • 6a — Bump script: scripts/bump-version.sh rewrites the version string in ten manifests (7 TOML, 3 JSON) in a single pass; semver-validated, idempotent, cross-platform (BSD + GNU sed). Runnable locally for rehearsing a release: ./scripts/bump-version.sh 0.2.0 && cargo build && git diff.
  • 6b — CI: .github/workflows/ci.yml runs on every PR + push to main. Seven parallel jobs: rust-build-and-test (Linux/macOS/Windows × cargo build + test), rust-lint (fmt + clippy + doc), python-sdk (Linux/macOS/Windows × maturin develop + pytest in a venv), nodejs-sdk (Linux/macOS/Windows × napi build + node --test), go-sdk (Linux/macOS × cargo build sqlrite-ffi + go test), wasm-build (wasm-pack + size report), desktop-build (npm ci + Tauri Rust compile). Cargo / npm / pip caching for fast PR turnaround.
  • 6c — Trusted publisher setup + branch protection runbook: docs/release-secrets.md captures the one-time web-UI setup — crates.io token in the release environment, OIDC trusted publishers on PyPI (sqlrite) and npm (sqlrite + sqlrite-wasm), GitHub release environment with required reviewer, branch protection on main requiring 14 CI jobs + 1 review. No code changes — executable as-is, ready to run through in the GitHub + registry UIs.
  • 6d — Release PR + skeleton publish: two workflows under .github/workflows/. release-pr.yml (manual dispatch with version input → bump-version.sh → PR), release.yml (fires on release: v<semver> merge commit → tag-all + publish-crate + publish-ffi matrix [linux x86_64/aarch64, macOS aarch64, windows x86_64] + umbrella release). Idempotent tag creation so "Re-run failed jobs" works after partial failures. cargo publish gated by the release environment's required-reviewer rule. First canary: v0.1.1.
  • 6e — Desktop publish: add publish-desktop to release.yml — Tauri build matrix → unsigned .AppImage / .deb / .dmg / .msi → GitHub Release
  • 6f — Python SDK publish: maturin-action → abi3 wheels for manylinux x86_64/aarch64 + macOS universal + Windows x86_64 → PyPI via OIDC
  • 6g — Node.js SDK publish: @napi-rs/cli.node binaries per platform → npm via OIDC
  • 6h — WASM publish: wasm-pack publishsqlrite-wasm on npm
  • 6i — Go SDK publish: sdk/go/vX.Y.Z git tag + attach FFI tarballs to the Go GitHub Release for go get users who want prebuilt libsqlrite_c

Phase 6.1 — Code signing (follow-up)

  • macOS Apple Developer ID cert → codesign + notarytool in tauri-action
  • Windows code-signing cert → signtool in tauri-action

Phase 7 — AI-era extensions (research)

  • Vector / embedding column type with an ANN index
  • Natural-language → SQL front-end that emits queries against this engine
  • Other agent-era ideas as they emerge

Possible extras (no committed phase)

  • Joins (INNER, LEFT OUTER, CROSS — SQLite does not support RIGHT/FULL OUTER)
  • GROUP BY, aggregates (COUNT, SUM, AVG, ...), DISTINCT, LIKE, IN, IS NULL
  • Composite and expression indexes (with cost analysis)
  • Alternate storage engines — LSM/SSTable for write-heavy workloads alongside the B-Tree
  • Benchmarks against SQLite

Contributing

Pull requests are warmly welcome!!!

For major changes, please open an issue first and let's talk about it. We are all ears!

If you'd like to contribute, please fork the repository and make changes as you'd like and shoot a Pull Request our way!

Please make sure to update tests as appropriate.

If you feel like you need it go check the GitHub documentation on creating a pull request.

Code of Conduct

Contribution to the project is organized under the terms of the Contributor Covenant, the maintainer of SQLRite, @joaoh82, promises to intervene to uphold that code of conduct.

Contact

If you want to contact me you can reach me at joaoh82@gmail.com.

Inspiration

About

SQLRite - Simple embedded database modeled off SQLite in Rust

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

 
 
 

Contributors