emremy/ColQL


ColQL is a zero-dependency, process-local, in-memory columnar query engine for TypeScript apps that need compact storage, typed schemas, explicit indexes, and safe mutations.

It is not a SQL database, persistence layer, distributed system, or durable storage format. ColQL is for data you already want to keep inside a Node.js process.

Why ColQL?

  • Compact columnar storage backed by typed arrays, dictionaries, and bit-packed booleans
  • Lazy queries with filtering, projection, aggregation, streaming, limit, and offset
  • Object predicates plus tuple-style where(column, operator, value)
  • Explicit equality indexes and sorted numeric indexes for hot predicates
  • Internal background-indexing architecture for large dirty-index rebuilds
  • Unique indexes for stable ID lookups and duplicate-key protection
  • Public query.explain() diagnostics for planner visibility without executing queries
  • JS Array migration helpers such as fromRows, firstWhere, countWhere, and exists
  • Mutable tables with updateMany and deleteMany
  • Runtime validation with structured ColQLError codes
  • Binary serialization for table data
  • TypeScript inference for rows, predicates, projections, and mutation payloads
  • Zero runtime dependencies
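The first bullet's storage claims can be sketched without ColQL itself. The following is an illustration of the two underlying ideas — dictionary encoding and bit-packed booleans — not ColQL's actual internals: a dictionary column stores each distinct string once and keeps only small integer codes per row, and a boolean column packs eight flags per byte instead of holding heap booleans.

```typescript
// Sketch only: the ideas behind compact columnar storage, not ColQL internals.

// Dictionary column: distinct strings stored once, rows hold Uint8 codes.
function encodeDictionary(values: string[]): { dict: string[]; codes: Uint8Array } {
  const dict: string[] = [];
  const lookup = new Map<string, number>();
  const codes = new Uint8Array(values.length);
  values.forEach((v, i) => {
    let code = lookup.get(v);
    if (code === undefined) {
      code = dict.length;
      dict.push(v);
      lookup.set(v, code);
    }
    codes[i] = code;
  });
  return { dict, codes };
}

// Bit-packed booleans: 8 flags per byte.
function packBooleans(flags: boolean[]): Uint8Array {
  const packed = new Uint8Array(Math.ceil(flags.length / 8));
  flags.forEach((flag, i) => {
    if (flag) packed[i >> 3] |= 1 << (i & 7);
  });
  return packed;
}

function getBit(packed: Uint8Array, i: number): boolean {
  return (packed[i >> 3] & (1 << (i & 7))) !== 0;
}
```

A million-row status column with three distinct values costs roughly one byte per row plus three strings under this layout, which is where the memory savings over arrays of objects come from.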

Install

npm install @colql/colql

Quick Example

import { column, table } from "@colql/colql";

const users = table({
  id: column.uint32(),
  age: column.uint8(),
  status: column.dictionary(["active", "passive", "archived"] as const),
  score: column.float64(),
  verified: column.boolean(),
});

users.insertMany([
  { id: 1, age: 29, status: "active", score: 91.5, verified: true },
  { id: 2, age: 17, status: "passive", score: 72.0, verified: false },
  { id: 3, age: 44, status: "active", score: 88.2, verified: true },
]);

users.createIndex("status");
users.createSortedIndex("age");
users.createUniqueIndex("id");

const activeAdults = users
  .where({
    status: "active",
    age: { gte: 18 },
  })
  .select(["id", "age", "score"])
  .toArray();

console.log(
  users.where({ status: "active", age: { gte: 18 } }).select(["id"]).explain(),
);

const result = users.updateMany(
  { status: "passive", age: { lt: 18 } },
  { status: "archived" },
);

console.log(activeAdults);
console.log(result.affectedRows);

console.log(users.findBy("id", 1));

Index Lifecycle

Users define indexes; ColQL manages index lifecycle internally. Equality and sorted indexes are derived performance structures, and query results must match full-scan results whether an index is fresh, dirty, queued, rebuilding, failed, or absent.

v0.6.0 adds the internal building blocks needed for background equality and sorted rebuilds: storage descriptors, lifecycle states, generation checks, bounded worker infrastructure, a real Node worker_threads executor, atomic apply guards, diagnostics, and benchmarks. The normal public query APIs remain synchronous, and the default execution path still preserves the v0.5-style synchronous dirty-index rebuild fallback until automatic background scheduling is exposed.

Background worker rebuilds are not a universal speedup. Small and medium datasets may be faster with synchronous rebuild because worker startup, message passing, and merge overhead can dominate. The worker path is intended to isolate user-facing latency for large dirty index rebuilds, especially sorted indexes.

Queued, rebuilding, and failed indexes are not used for query results. ColQL falls back to a scan or another fresh index to preserve correctness. query.explain() remains non-executing: it does not scan rows, materialize rows, call onQuery, rebuild dirty indexes, or schedule background jobs.
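The correctness contract above can be illustrated independently of ColQL's internals. The sketch below (an assumption for illustration, not ColQL's actual index structures) builds a sorted numeric index as (value, rowIndex) pairs, answers a gte range query via binary search, and checks that the result matches a plain full scan — the same invariant ColQL promises whether an index is fresh or absent.

```typescript
// Sketch only: a minimal sorted numeric index and its full-scan equivalent.
type Entry = { value: number; row: number };

// Build the index: pair every value with its physical row, sorted by value.
function buildSortedIndex(column: number[]): Entry[] {
  return column
    .map((value, row) => ({ value, row }))
    .sort((a, b) => a.value - b.value || a.row - b.row);
}

// Binary search: first position whose value is >= target.
function lowerBound(index: Entry[], target: number): number {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].value < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Indexed range query: everything from lowerBound onward matches gte.
function rowsGte(index: Entry[], target: number): number[] {
  return index
    .slice(lowerBound(index, target))
    .map((e) => e.row)
    .sort((a, b) => a - b);
}

// Full-scan fallback: must return the same row set as the indexed path.
function rowsGteScan(column: number[], target: number): number[] {
  return column.flatMap((v, row) => (v >= target ? [row] : []));
}
```

For a column `[29, 17, 44]`, both `rowsGte(buildSortedIndex(column), 18)` and `rowsGteScan(column, 18)` yield `[0, 2]`; a dirty or missing index only changes which path runs, never the answer.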

Serialization stores table data only. It does not serialize trusted indexes, index lifecycle state, or worker jobs; recreate indexes after restore when indexed performance or unique-index enforcement is needed.
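The data-only shape of serialization can be sketched with plain typed arrays. This is not ColQL's wire format — it is a hypothetical length-prefixed layout for a single float64 column, showing why restoring data is cheap while indexes must be rebuilt afterward (they are derived structures, not part of the bytes).

```typescript
// Sketch only: length-prefixed binary round trip for one float64 column.
function serializeColumn(column: Float64Array): ArrayBuffer {
  const buffer = new ArrayBuffer(4 + column.length * 8);
  const view = new DataView(buffer);
  view.setUint32(0, column.length, true); // row count, little-endian
  column.forEach((v, i) => view.setFloat64(4 + i * 8, v, true));
  return buffer;
}

function deserializeColumn(buffer: ArrayBuffer): Float64Array {
  const view = new DataView(buffer);
  const length = view.getUint32(0, true);
  const column = new Float64Array(length);
  for (let i = 0; i < length; i++) {
    column[i] = view.getFloat64(4 + i * 8, true);
  }
  return column;
}
```

Nothing about indexes survives a round trip like this, which is why the README tells you to call createIndex / createUniqueIndex again after table.deserialize.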

Performance Snapshot

ColQL includes a Fastify example that can boot with 1M deterministic rows and exercise indexed, range, scan, callback-filter, mutation, stress, and memory paths.

These are local reference numbers from examples/fastify-api after the example's mutation validation path, not guarantees. Actual numbers vary by Node.js version, CPU, memory pressure, data distribution, query selectivity, projection size, and mutation frequency.

| 1M-row workload | Local avg | Expected shape |
| --- | --- | --- |
| Selective equality query with createIndex() | 2.06 ms | Fastest path when candidate sets are small |
| Numeric range query with createSortedIndex() | 27.69 ms | Helps selective ranges; broad ranges may resemble scans |
| Broad structured predicate | 25.31 ms | May intentionally scan when an index is not selective |
| filter(fn) callback predicate | 218.93 ms | Full-scan escape hatch; not index-aware |

ColQL also includes a JS Array comparison benchmark. It is meant to show tradeoffs, not to prove that ColQL is always faster. In local 1M-row runs, JS arrays are often faster for simple broad scans and callback predicates, while ColQL is strongest when compact storage, projection with limits, structured predicates, or explicit indexed lookups matter.

Run the example locally:

cd examples/fastify-api
npm install
npm run test:large

For benchmark scripts and interpretation notes, see Performance and Benchmarks.

  • For PR-level regression checks, run npm run bench:codspeed. CodSpeed uses smaller deterministic datasets as regression signals, not absolute throughput claims. Setup is excluded where possible, while destructive mutation benchmarks that need fresh tables use setup-inclusive names. Larger 1M-row comparisons remain covered by the local/manual benchmark scripts.
  • For JS Array comparisons, run npm run benchmark:array-comparison. Results are local guidance, not universal promises or CI requirements.
  • For a scenario-style local workload, run npm run benchmark:session-analytics.
  • For background-indexing regression visibility, run npm run benchmark:background-indexing -- --json and npm run benchmark:worker-runtime -- --json.

When To Use ColQL

Use ColQL when:

  • you need to keep thousands to millions of records in memory
  • JavaScript object arrays use too much memory
  • filters and aggregations should avoid intermediate arrays
  • a TypeScript schema can describe your columns
  • explicit indexes are acceptable for hot equality or range predicates
  • stable identity can be modeled with an explicit ID column and unique index
  • runtime validation matters because data may come from untyped sources

Avoid ColQL when:

  • you need durable storage, transactions, joins, or SQL
  • data must be shared across pods, workers, processes, or machines
  • writes dominate the workload and broad indexes are frequently dirtied
  • row indexes must be stable external identifiers
  • a small/simple JavaScript array is already clear and fast enough
  • every query requires arbitrary sorting or grouping
  • you need concurrent writers or multi-process coordination
  • you want automatic indexes, compound indexes, or query planning across tables
  • you want analytical SQL over files or large columnar datasets, where DuckDB may be a better fit

Row indexes are physical positions and can change after deletes. Use an explicit id column for stable identity.
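The instability of physical positions is easy to demonstrate with a plain array (a sketch of the failure mode, not ColQL code): a position captured before a delete goes stale, while a lookup by an explicit id — what a unique index provides — keeps working.

```typescript
// Sketch only: why stored row positions break after deletes.
type Row = { id: number; score: number };

const rows: Row[] = [
  { id: 1, score: 91.5 },
  { id: 2, score: 72.0 },
  { id: 3, score: 88.2 },
];

const positionOfId3 = 2; // physical position captured before a delete
rows.splice(0, 1);       // delete the first row; every later row shifts up

// The stale position now points past the end of the data:
const byPosition = rows[positionOfId3]; // undefined

// An id lookup (what findBy with a unique index gives you) still works:
const byId = rows.find((r) => r.id === 3); // { id: 3, score: 88.2 }
```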

Decision Guide

| Tool | Good fit | Not a good fit |
| --- | --- | --- |
| ColQL | Process-local TypeScript data that benefits from compact in-memory columns, explicit indexes, validation, and inspectable query plans | Persistence, SQL, joins, transactions, shared state, or distributed coordination |
| JavaScript arrays | Small or simple datasets, ad hoc transforms, or write-heavy logic where object arrays are already clear and fast enough | Memory-sensitive data, repeated projections, structured indexed lookups, or runtime schema validation |
| SQLite | Durable embedded relational storage with SQL, transactions, and indexes | Pure process-local ephemeral caches where a database file and SQL layer are unnecessary |
| DuckDB | Analytical SQL, file-based analytics, large columnar datasets, and ad hoc aggregations | TypeScript-first mutable process-local tables with explicit in-memory indexes |

Examples

The Fastify example demonstrates HTTP query params mapped to object predicates, range queries, filter(fn), updateMany, deleteMany, query diagnostics, index stats, and memory counters.
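The query-param mapping the example demonstrates can be sketched as a small helper. This is a hypothetical function, not part of ColQL or the Fastify example: it assumes a flat convention where minX/maxX params become gte/lte range predicates and everything else is an equality match, producing the object-predicate shape that where() accepts.

```typescript
// Sketch only: map flat HTTP query params (?status=active&minAge=18)
// into an object predicate like { status: "active", age: { gte: 18 } }.
type Predicate = Record<string, string | { gte?: number; lte?: number }>;

function toPredicate(query: Record<string, string>): Predicate {
  const predicate: Predicate = {};
  for (const [key, raw] of Object.entries(query)) {
    if (key.startsWith("min")) {
      const column = key.slice(3).toLowerCase(); // minAge -> age
      predicate[column] = { ...(predicate[column] as object), gte: Number(raw) };
    } else if (key.startsWith("max")) {
      const column = key.slice(3).toLowerCase(); // maxAge -> age
      predicate[column] = { ...(predicate[column] as object), lte: Number(raw) };
    } else {
      predicate[key] = raw; // plain equality match
    }
  }
  return predicate;
}
```

A route handler would then pass the result straight through, e.g. users.where(toPredicate(request.query)).toArray(), keeping the HTTP layer free of query-building logic.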

Documentation

Detailed documentation is available under docs/doc.

Recommended reading:

Common APIs

users.insert(row);
users.insertMany(rows);

users.where({ status: "active", age: { gte: 18 } }).toArray();
users.where("age", ">=", 18).select(["id"]).toArray();
users.whereIn("status", ["active", "passive"]);
users.filter((row) => row.score > 90);
users.where({ status: "active" }).explain();

users.count();
users.avg("age");
users.top(10, "score");

users.update(0, { status: "active" });
users.updateMany({ status: "passive" }, { status: "active" });
users.deleteMany({ status: "archived" });

users.createIndex("id");
users.createSortedIndex("age");
users.createUniqueIndex("id");
users.findBy("id", 123);

const buffer = users.serialize();
const restored = table.deserialize(buffer);
restored.createIndex("status");

filter(fn) is intentionally a full-scan escape hatch. Prefer structured predicates when you want index planning. query.explain() returns structured diagnostics without executing the query, scanning rows, materializing rows, calling onQuery, or rebuilding dirty indexes.

Error Handling

ColQL validates schemas, inserted rows, query predicates, mutation payloads, indexes, and serialized input at runtime. Failures throw ColQLError with a stable code, a message, and optional details.

import { ColQLError } from "@colql/colql";

try {
  users.insert({ id: 4, age: 300, status: "active", score: 1, verified: true });
} catch (error) {
  if (error instanceof ColQLError) {
    console.log(error.code); // COLQL_OUT_OF_RANGE
  }
}

Development

npm install
npm run check
npm test
npm run test:types
npm run build
npm run test:worker-runtime
npm run bench:codspeed
npm run benchmark:memory
npm run benchmark:query
npm run benchmark:indexed
npm run benchmark:range
npm run benchmark:optimizer
npm run benchmark:serialization
npm run benchmark:delete
npm run benchmark:array-comparison
npm run benchmark:session-analytics
npm run benchmark:background-indexing -- --json
npm run benchmark:worker-runtime -- --json

Status

ColQL v0.6.0 focuses on zero-copy background-indexing architecture while preserving synchronous public query APIs and v0.5-style synchronous dirty-index fallback behavior. Breaking changes may still happen before 1.0.0, but the project is moving toward API stabilization.

Limitations

ColQL intentionally does not include SQL parsing, joins, transactions, concurrency control, automatic index creation, compound indexes, or durable storage. Equality and sorted indexes are derived performance structures; query results must be the same whether ColQL uses an index or a full scan. Unique indexes are derived too, but they also enforce uniqueness while present and are not serialized.