ColQL is a zero-dependency, process-local, in-memory columnar query engine for TypeScript apps that need compact storage, typed schemas, explicit indexes, and safe mutations.
It is not a SQL database, persistence layer, distributed system, or durable storage format. ColQL is for data you already want to keep inside a Node.js process.
- Compact columnar storage backed by typed arrays, dictionaries, and bit-packed booleans
- Lazy queries with filtering, projection, aggregation, streaming, limit, and offset
- Object predicates plus tuple-style
where(column, operator, value) - Explicit equality indexes and sorted numeric indexes for hot predicates
- Internal background-indexing architecture for large dirty-index rebuilds
- Unique indexes for stable ID lookups and duplicate-key protection
- Public
query.explain()diagnostics for planner visibility without executing queries - JS Array migration helpers such as
fromRows,firstWhere,countWhere, andexists - Mutable tables with
updateManyanddeleteMany - Runtime validation with structured
ColQLErrorcodes - Binary serialization for table data
- TypeScript inference for rows, predicates, projections, and mutation payloads
- Zero runtime dependencies
npm install @colql/colqlimport { column, table } from "@colql/colql";
const users = table({
id: column.uint32(),
age: column.uint8(),
status: column.dictionary(["active", "passive", "archived"] as const),
score: column.float64(),
verified: column.boolean(),
});
users.insertMany([
{ id: 1, age: 29, status: "active", score: 91.5, verified: true },
{ id: 2, age: 17, status: "passive", score: 72.0, verified: false },
{ id: 3, age: 44, status: "active", score: 88.2, verified: true },
]);
users.createIndex("status");
users.createSortedIndex("age");
users.createUniqueIndex("id");
const activeAdults = users
.where({
status: "active",
age: { gte: 18 },
})
.select(["id", "age", "score"])
.toArray();
console.log(
users.where({ status: "active", age: { gte: 18 } }).select(["id"]).explain(),
);
const result = users.updateMany(
{ status: "passive", age: { lt: 18 } },
{ status: "archived" },
);
console.log(activeAdults);
console.log(result.affectedRows);
console.log(users.findBy("id", 1));Users define indexes; ColQL manages index lifecycle internally. Equality and sorted indexes are derived performance structures, and query results must match full-scan results whether an index is fresh, dirty, queued, rebuilding, failed, or absent.
v0.6.0 adds the internal storage descriptors, lifecycle states, generation checks, bounded worker infrastructure, real Node worker_threads executor, atomic apply guards, diagnostics, and benchmarks needed for background equality and sorted rebuilds. Normal public query APIs remain synchronous, and the public default execution path still preserves the v0.5-style synchronous dirty-index rebuild fallback until automatic background scheduling is exposed.
Background worker rebuilds are not a universal speedup. Small and medium datasets may be faster with synchronous rebuild because worker startup, message passing, and merge overhead can dominate. The worker path is intended to isolate user-facing latency for large dirty index rebuilds, especially sorted indexes.
Queued, rebuilding, and failed indexes are not used for query results. ColQL falls back to a scan or another fresh index to preserve correctness. query.explain() remains non-executing: it does not scan rows, materialize rows, call onQuery, rebuild dirty indexes, or schedule background jobs.
Serialization stores table data only. It does not serialize trusted indexes, index lifecycle state, or worker jobs; recreate indexes after restore when indexed performance or unique-index enforcement is needed.
ColQL includes a Fastify example that can boot with 1M deterministic rows and exercise indexed, range, scan, callback-filter, mutation, stress, and memory paths.
These are local reference numbers from examples/fastify-api after the example's mutation validation path, not guarantees. Actual numbers vary by Node.js version, CPU, memory pressure, data distribution, query selectivity, projection size, and mutation frequency.
| 1M-row workload | Local avg | Expected shape |
|---|---|---|
Selective equality query with createIndex() |
2.06ms | Fastest path when candidate sets are small |
Numeric range query with createSortedIndex() |
27.69ms | Helps selective ranges; broad ranges may resemble scans |
| Broad structured predicate | 25.31ms | May intentionally scan when an index is not selective |
filter(fn) callback predicate |
218.93ms | Full-scan escape hatch; not index-aware |
ColQL also includes a JS Array comparison benchmark. It is meant to show tradeoffs, not to prove that ColQL is always faster. In local 1M-row runs, JS arrays are often faster for simple broad scans and callback predicates, while ColQL is strongest when compact storage, projection with limits, structured predicates, or explicit indexed lookups matter.
Run the example locally:
cd examples/fastify-api
npm install
npm run test:largeFor benchmark scripts and interpretation notes, see Performance and Benchmarks.
For PR-level regression checks, run npm run bench:codspeed; CodSpeed uses smaller deterministic datasets as regression signals, not absolute throughput claims. Setup is excluded where possible, while destructive mutation benchmarks that need fresh tables use setup-inclusive names. Larger 1M-row comparisons remain covered by the local/manual benchmark scripts.
For JS Array comparisons, run npm run benchmark:array-comparison; results are local guidance, not universal promises or CI requirements.
For a scenario-style local workload, run npm run benchmark:session-analytics.
For background-indexing regression visibility, run npm run benchmark:background-indexing -- --json and npm run benchmark:worker-runtime -- --json.
Use ColQL when:
- you need to keep thousands to millions of records in memory
- JavaScript object arrays use too much memory
- filters and aggregations should avoid intermediate arrays
- a TypeScript schema can describe your columns
- explicit indexes are acceptable for hot equality or range predicates
- stable identity can be modeled with an explicit ID column and unique index
- runtime validation matters because data may come from untyped sources
Avoid ColQL when:
- you need durable storage, transactions, joins, or SQL
- data must be shared across pods, workers, processes, or machines
- writes dominate the workload and broad indexes are frequently dirtied
- row indexes must be stable external identifiers
- a small/simple JavaScript array is already clear and fast enough
- every query requires arbitrary sorting or grouping
- you need concurrent writers or multi-process coordination
- you want automatic indexes, compound indexes, or query planning across tables
- you want analytical SQL over files or large columnar datasets, where DuckDB may be a better fit
Row indexes are physical positions and can change after deletes. Use an explicit id column for stable identity.
| Tool | Good fit | Not a good fit |
|---|---|---|
| ColQL | Process-local TypeScript data that benefits from compact in-memory columns, explicit indexes, validation, and inspectable query plans | Persistence, SQL, joins, transactions, shared state, or distributed coordination |
| JavaScript arrays | Small or simple datasets, ad hoc transforms, or write-heavy logic where object arrays are already clear and fast enough | Memory-sensitive data, repeated projections, structured indexed lookups, or runtime schema validation |
| SQLite | Durable embedded relational storage with SQL, transactions, and indexes | Pure process-local ephemeral caches where a database file and SQL layer are unnecessary |
| DuckDB | Analytical SQL, file-based analytics, large columnar datasets, and ad hoc aggregations | TypeScript-first mutable process-local tables with explicit in-memory indexes |
The Fastify example demonstrates HTTP query params mapped to object predicates, range queries, filter(fn), updateMany, deleteMany, query diagnostics, index stats, and memory counters.
Detailed documentation is available under docs/doc.
Recommended reading:
- Overview
- Installation
- Schema and Columns
- Querying
- Equality Indexes
- Sorted Indexes
- Unique Indexes
- Mutations
- Serialization
- Memory Model
- Performance and Benchmarks
- Limitations and Design Decisions
- API Reference
- Release Checklist
users.insert(row);
users.insertMany(rows);
users.where({ status: "active", age: { gte: 18 } }).toArray();
users.where("age", ">=", 18).select(["id"]).toArray();
users.whereIn("status", ["active", "passive"]);
users.filter((row) => row.score > 90);
users.where({ status: "active" }).explain();
users.count();
users.avg("age");
users.top(10, "score");
users.update(0, { status: "active" });
users.updateMany({ status: "passive" }, { status: "active" });
users.deleteMany({ status: "archived" });
users.createIndex("id");
users.createSortedIndex("age");
users.createUniqueIndex("id");
users.findBy("id", 123);
const buffer = users.serialize();
const restored = table.deserialize(buffer);
restored.createIndex("status");filter(fn) is intentionally a full-scan escape hatch. Prefer structured predicates when you want index planning.
query.explain() returns structured diagnostics without executing the query, scanning rows, materializing rows, calling onQuery, or rebuilding dirty indexes.
ColQL validates schemas, inserted rows, query predicates, mutation payloads, indexes, and serialized input at runtime. Failures throw ColQLError with a stable code, a message, and optional details.
import { ColQLError } from "@colql/colql";
try {
users.insert({ id: 4, age: 300, status: "active", score: 1, verified: true });
} catch (error) {
if (error instanceof ColQLError) {
console.log(error.code); // COLQL_OUT_OF_RANGE
}
}npm install
npm run check
npm test
npm run test:types
npm run build
npm run test:worker-runtime
npm run bench:codspeed
npm run benchmark:memory
npm run benchmark:query
npm run benchmark:indexed
npm run benchmark:range
npm run benchmark:optimizer
npm run benchmark:serialization
npm run benchmark:delete
npm run benchmark:array-comparison
npm run benchmark:session-analytics
npm run benchmark:background-indexing -- --json
npm run benchmark:worker-runtime -- --jsonColQL v0.6.0 focuses on zero-copy background-indexing architecture while preserving synchronous public query APIs and v0.5-style synchronous dirty-index fallback behavior. Breaking changes may still happen before 1.0.0, but the project is moving toward API stabilization.
ColQL intentionally does not include SQL parsing, joins, transactions, concurrency control, automatic index creation, compound indexes, or durable storage. Equality and sorted indexes are derived performance structures; query results must be the same whether ColQL uses an index or a full scan. Unique indexes are derived too, but they also enforce uniqueness while present and are not serialized.