diff --git a/README.md b/README.md index 3d1b471..d888f88 100644 --- a/README.md +++ b/README.md @@ -5,71 +5,120 @@ [![npm downloads](https://img.shields.io/npm/dm/@colql/colql.svg)](https://www.npmjs.com/package/@colql/colql) [![license](https://img.shields.io/npm/l/@colql/colql.svg)](LICENSE) -ColQL is a memory-efficient, indexed, mutable in-memory columnar query engine for TypeScript. It stores data in compact columns, runs lazy queries, validates inputs at runtime, and exposes explicit indexes and mutation APIs without adding runtime dependencies. +ColQL is a zero-dependency, in-memory columnar query engine for TypeScript apps that need compact process-local storage, typed schemas, explicit indexes, and safe mutations. + +It is not a SQL database or persistence layer. ColQL is for data you already want to keep inside a Node.js process. + +## Why ColQL? + +- Compact columnar storage backed by typed arrays, dictionaries, and bit-packed booleans +- Lazy queries with filtering, projection, aggregation, streaming, limit, and offset +- Object predicates plus tuple-style `where(column, operator, value)` +- Explicit equality indexes and sorted numeric indexes for hot predicates +- Mutable tables with `updateMany` and `deleteMany` +- Runtime validation with structured `ColQLError` codes +- Binary serialization for table data +- TypeScript inference for rows, predicates, projections, and mutation payloads +- Zero runtime dependencies + +## Install + +```sh +npm install @colql/colql +``` ## Quick Example ```ts -import { table, column } from "@colql/colql"; +import { column, table } from "@colql/colql"; const users = table({ id: column.uint32(), age: column.uint8(), - status: column.dictionary(["active", "passive"] as const), - is_active: column.boolean(), + status: column.dictionary(["active", "passive", "archived"] as const), + score: column.float64(), + verified: column.boolean(), }); -users.insert({ - id: 1, - age: 25, - status: "active", - is_active: true, -}); +users.insertMany([ + { id: 1, age: 29, status: "active", score: 91.5, verified: true }, + { id: 2, age: 17, status: "passive", score: 72.0, verified: false }, + { id: 3, age: 44, status: "active", score: 88.2, verified: true }, +]); -users.updateWhere("id", "=", 1, { age: 26 }); +users.createIndex("status"); +users.createSortedIndex("age"); const activeAdults = users - .where("age", ">=", 18) - .where("status", "=", "active") - .select(["id", "age"]) - .limit(5) + .where({ + status: "active", + age: { gte: 18 }, + }) + .select(["id", "age", "score"]) .toArray(); -``` -## Mutations and Indexes +const result = users.updateMany( + { status: "passive", age: { lt: 18 } }, + { status: "archived" }, +); -```ts -users.update(0, { status: "active" }); -users.updateWhere("status", "=", "passive", { status: "active" }); -users.deleteWhere("age", "<", 18); +console.log(activeAdults); +console.log(result.affectedRows); +``` -users.createIndex("status"); -users.createSortedIndex("age"); +## Performance Snapshot -const activeUsers = users.where("status", "=", "active").toArray(); -const adults = users.where("age", ">=", 18).toArray(); -``` +ColQL includes a Fastify example that can boot with 1M deterministic rows and exercise indexed, range, scan, callback-filter, mutation, stress, and memory paths. -Predicate mutations return `{ affectedRows: number }`. The older `delete(rowIndex)` API physically removes one row and returns the table instance. +These are local reference numbers from `examples/fastify-api` after the example's mutation validation path, not guarantees. Actual numbers vary by Node.js version, CPU, memory pressure, data distribution, query selectivity, projection size, and mutation frequency. -## Install +| 1M-row workload | Local avg | Expected shape | +|---|---:|---| +| Selective equality query with `createIndex()` | 2.06ms | Fastest path when candidate sets are small | +| Numeric range query with `createSortedIndex()` | 27.69ms | Helps selective ranges; broad ranges may resemble scans | +| Broad structured predicate | 25.31ms | May intentionally scan when an index is not selective | +| `filter(fn)` callback predicate | 218.93ms | Full-scan escape hatch; not index-aware | + +Run the example locally: ```sh -npm install @colql/colql +cd examples/fastify-api +npm install +npm run test:large ``` -## Highlights +For benchmark scripts and interpretation notes, see [Performance and Benchmarks](./docs/doc/13-performance-and-benchmarks.md). -- Chunked columnar storage backed by `TypedArray`, dictionary, and bit-packed boolean columns -- Lazy filtering, projection, aggregation, and streaming iteration -- Physical deletes with no tombstones or compact step -- Row updates plus predicate-based update/delete -- Optional equality indexes and sorted indexes -- Cost-aware query planning with lazy index rebuilds after mutations -- Runtime validation with structured `ColQLError` errors -- Binary serialization of schema and column data -- TypeScript inference for rows, predicates, projections, and mutation payloads -- Zero runtime dependencies +## When To Use ColQL + +Use ColQL when: + +- you need to keep thousands to millions of records in memory +- JavaScript object arrays use too much memory +- filters and aggregations should avoid intermediate arrays +- a TypeScript schema can describe your columns +- explicit indexes are acceptable for hot equality or range predicates +- runtime validation matters because data may come from untyped sources + +Avoid ColQL when: + +- you need durable storage, transactions, joins, or SQL +- row indexes must be stable external identifiers +- every query requires arbitrary sorting or grouping +- you need concurrent writers or multi-process coordination +- you want automatic indexes, compound indexes, or query planning across tables + +Row indexes are physical positions and can change after deletes. Use an explicit `id` column for stable identity. + +## Examples + +- [Basic usage](./examples/basic.ts) +- [Aggregation](./examples/aggregation.ts) +- [Streaming](./examples/streaming.ts) +- [Serialization](./examples/serialization.ts) +- [Fastify API with 1M-row validation](./examples/fastify-api) + +The Fastify example demonstrates HTTP query params mapped to object predicates, range queries, `filter(fn)`, `updateMany`, `deleteMany`, query diagnostics, index stats, and memory counters. ## Documentation @@ -78,30 +127,16 @@ Detailed documentation is available under [`docs/doc`](./docs/doc). Recommended reading: - [Overview](./docs/doc/00-overview.md) +- [Installation](./docs/doc/01-installation.md) - [Schema and Columns](./docs/doc/02-schema-and-columns.md) - [Querying](./docs/doc/04-querying.md) -- [Indexing](./docs/doc/06-indexing.md) +- [Equality Indexes](./docs/doc/06-indexing.md) +- [Sorted Indexes](./docs/doc/07-sorted-indexes.md) - [Mutations](./docs/doc/08-mutations.md) -- [Error Handling](./docs/doc/10-error-handling.md) +- [Serialization](./docs/doc/11-serialization.md) - [Memory Model](./docs/doc/12-memory-model.md) - -The full documentation set also covers installation, inserts, aggregations, sorted indexes, physical deletes, serialization, benchmarks, TypeScript type safety, limitations, and a compact API reference. - -## Error Handling - -ColQL validates schemas, inserted rows, query predicates, mutation payloads, indexes, and serialized input at runtime. Failures throw `ColQLError` with a stable `code`, a message, and optional details. - -```ts -import { ColQLError } from "@colql/colql"; - -try { - users.insert({ id: 2, age: 999, status: "active", is_active: true }); -} catch (error) { - if (error instanceof ColQLError) { - console.log(error.code); - } -} -``` +- [Limitations and Design Decisions](./docs/doc/15-limitations-and-design-decisions.md) +- [API Reference](./docs/doc/16-api-reference.md) ## Common APIs @@ -109,18 +144,18 @@ try { users.insert(row); users.insertMany(rows); +users.where({ status: "active", age: { gte: 18 } }).toArray(); users.where("age", ">=", 18).select(["id"]).toArray(); -users.whereIn("status", ["active"]); -users.whereNotIn("status", ["archived"]); +users.whereIn("status", ["active", "passive"]); +users.filter((row) => row.score > 90); users.count(); users.avg("age"); users.top(10, "score"); users.update(0, { status: "active" }); -users.updateWhere("status", "=", "passive", { status: "active" }); -users.delete(0); -users.deleteWhere("age", "<", 18); +users.updateMany({ status: "passive" }, { status: "active" }); +users.deleteMany({ status: "archived" }); users.createIndex("id"); users.createSortedIndex("age"); @@ -129,6 +164,24 @@ const buffer = users.serialize(); const restored = table.deserialize(buffer); ``` +`filter(fn)` is intentionally a full-scan escape hatch. Prefer structured predicates when you want index planning. + +## Error Handling + +ColQL validates schemas, inserted rows, query predicates, mutation payloads, indexes, and serialized input at runtime. Failures throw `ColQLError` with a stable `code`, a message, and optional details. + +```ts +import { ColQLError } from "@colql/colql"; + +try { + users.insert({ id: 4, age: 300, status: "active", score: 1, verified: true }); +} catch (error) { + if (error instanceof ColQLError) { + console.log(error.code); // COLQL_OUT_OF_RANGE + } +} +``` + ## Development ```sh @@ -146,8 +199,8 @@ npm run benchmark:delete ## Status -ColQL v0.1.x aims to keep the public API reasonably stable, but breaking changes may still happen before 1.0.0. +ColQL v0.2.x aims to keep the public API reasonably stable, but breaking changes may still happen before 1.0.0. ## Limitations -ColQL intentionally does not include SQL parsing, joins, transactions, concurrency control, automatic indexes, compound indexes, or durable storage. Row indexes are not stable after physical deletes; use an explicit `id` column for stable identity. +ColQL intentionally does not include SQL parsing, joins, transactions, concurrency control, automatic indexes, compound indexes, or durable storage. Indexes are derived performance structures; query results must be the same whether ColQL uses an index or a full scan. diff --git a/examples/fastify-api/README.md b/examples/fastify-api/README.md index d259926..8b84b76 100644 --- a/examples/fastify-api/README.md +++ b/examples/fastify-api/README.md @@ -1,33 +1,104 @@ # ColQL Fastify API Example -Small process-local Fastify backend showing ColQL v0.2.0 in an HTTP API. +A minimal Fastify backend demonstrating how to use ColQL as a process-local data layer in a real HTTP API. -The app stores users in memory. It is useful as an integration example, not as a persistent database-backed service. Restarting the process resets the data. +The app stores users in memory. It is an integration example, not a persistent database-backed service. Restarting the process resets the data. + +## What This Demonstrates + +- HTTP query params mapped to object-based `where({ ... })` +- Range queries with `minAge` and `maxAge` backed by a sorted age index +- `filter(fn)` for callback search after structured predicates +- `insert(row)` and `insertMany(rows)` through HTTP endpoints +- `updateMany(predicate, partialRow)` through `PATCH /users/by-country/:country` +- `deleteMany(predicate)` through `DELETE /users/inactive` +- Equality indexes with `createIndex("country")` and `createIndex("name")` +- Sorted indexes with `createSortedIndex("age")` +- Query diagnostics through `onQuery` +- Public counters such as `rowCount`, `capacity`, `materializedRowCount`, and `scannedRowCount` + +`filter(fn)` is a full-scan escape hatch and is not index-aware. In this example, structured filters run first, then `search` applies a callback filter to remaining rows. ## Run +Install dependencies: + ```sh npm install +``` + +Tiny deterministic seed mode: + +```sh npm run dev ``` The server listens on `http://localhost:3000` by default. Set `PORT` to use another port. -By default the app starts with a tiny deterministic seed. To test a larger process-local dataset, set `COLQL_EXAMPLE_SEED_SIZE`: +1M generated dataset mode: ```sh -COLQL_EXAMPLE_SEED_SIZE=1000000 npm run dev +npm run dev:1m ``` -There is also a shortcut: +Custom generated seed size: ```sh -npm run dev:1m +COLQL_EXAMPLE_SEED_SIZE=100000 npm run dev +``` + +The generated seed uses deterministic dictionary values for `country` and `name`, so it exercises ColQL dictionary columns, equality indexes, and the sorted age index. + +## Try It + +Query with indexed structured filters: + +```sh +curl "http://localhost:3000/users?country=TR&active=true" +``` + +Query a numeric range: + +```sh +curl "http://localhost:3000/users?minAge=30&maxAge=40" ``` -The generated seed uses deterministic dictionary values for `country` and `name`, so it still exercises ColQL dictionary columns, equality indexes, and the sorted age index. +Run a structured filter followed by callback search: + +```sh +curl "http://localhost:3000/users?country=TR&search=mi" +``` -## Test +Update all users in one country: + +```sh +curl -X PATCH "http://localhost:3000/users/by-country/TR" \ + -H "content-type: application/json" \ + -d '{"active":true,"score":99.9}' +``` + +Delete inactive users: + +```sh +curl -X DELETE "http://localhost:3000/users/inactive" +``` + +Inspect query diagnostics: + +```sh +curl "http://localhost:3000/debug/query-log" +``` + +Other useful debug endpoints: + +```sh +curl "http://localhost:3000/debug/indexes" +curl "http://localhost:3000/debug/memory" +``` + +## Test And Validate + +Run normal tests: ```sh npm test @@ -43,15 +114,15 @@ npm run test:large `test:large` starts the app with 1M generated users, performs `updateMany`, `deleteMany`, and `insertMany` through HTTP requests, verifies filtered query correctness after lazy index rebuilds, and prints latency summaries for indexed, range, broad scan, and callback-filter requests. -Run the basic concurrent stress check with: +Run the basic concurrent stress check: ```sh npm run stress ``` -The stress script sends 50 concurrent requests to an index-friendly structured query and checks that all responses are successful and consistent. +The stress script sends concurrent requests to an index-friendly structured query and checks that all responses are successful and consistent. -Run the memory sanity check with: +Run the memory sanity check: ```sh npm run memory:example @@ -59,8 +130,6 @@ npm run memory:example The memory script reports `heapUsed`, `rss`, and `arrayBuffers` after 1M seed, after mutations, and after repeated queries. These scripts do not enforce strict latency or memory thresholds; they are smoke validations for local machines. -`filter(fn)` is a full-scan escape hatch and is not index-aware. On 1M rows, callback-filter requests are expected to be slower than structured indexed queries. - ## Endpoints - `GET /health` @@ -73,25 +142,3 @@ The memory script reports `heapUsed`, `rss`, and `arrayBuffers` after 1M seed, a - `GET /debug/query-log` - `GET /debug/indexes` - `GET /debug/memory` - -## ColQL Features Demonstrated - -- `insert(row)` -- `insertMany(rows)` -- object-based `where({ ... })` -- tuple `where(column, operator, value)` -- `filter(fn)` for callback search -- `select()` projection -- `count()` -- `updateMany(predicate, partialRow)` -- `deleteMany(predicate)` -- equality indexes with `createIndex` -- sorted/range index usage with `createSortedIndex` -- query diagnostics with `onQuery` -- public memory-related counters such as `rowCount`, `capacity`, `materializedRowCount`, and `scannedRowCount` - -Example query: - -```sh -curl "http://localhost:3000/users?country=TR&minAge=25&search=mi" -``` diff --git a/tests/insert-validation.test.ts b/tests/insert-validation.test.ts index cfcf78e..d4ec448 100644 --- a/tests/insert-validation.test.ts +++ b/tests/insert-validation.test.ts @@ -67,6 +67,7 @@ describe("insert validation", () => { it("insertMany leaves existing rows unchanged when any row is invalid", () => { const users = usersTable(); users.insert({ id: 1, age: 20, score: 1, status: "active", is_active: true }); + users.createIndex("id").createIndex("status").createSortedIndex("age"); const before = users.toArray(); expectCode( @@ -79,5 +80,8 @@ describe("insert validation", () => { ); expect(users.toArray()).toEqual(before); + expect(users.where("id", "=", 2).toArray()).toEqual([]); + expect(users.where("status", "=", "passive").toArray()).toEqual([]); + expect(users.where("age", ">=", 20).toArray()).toEqual(before); }); }); diff --git a/tests/mutation.test.ts b/tests/mutation.test.ts index e0f53d5..37ae984 100644 --- a/tests/mutation.test.ts +++ b/tests/mutation.test.ts @@ -76,6 +76,7 @@ describe("mutations", () => { it("updates predicate matches all-or-nothing and snapshots before mutating predicate columns", () => { const users = createUsers(); + users.createIndex("status").createSortedIndex("age"); expect(users.updateWhere("status", "=", "active", { status: "passive", age: 99 })).toEqual({ affectedRows: 4 }); expect(users.where("status", "=", "active").count()).toBe(0); @@ -88,6 +89,8 @@ describe("mutations", () => { /uint8 integer/, ); expect(users.toArray()).toEqual(before); + expect(users.where("status", "=", "passive").toArray()).toEqual(before.filter((row) => row.status === "passive")); + expect(users.where("age", ">=", 99).toArray()).toEqual(before.filter((row) => row.age >= 99)); }); it("updateMany delegates to where(predicate).update()", () => { diff --git a/tests/query-correctness-parity.test.ts b/tests/query-correctness-parity.test.ts new file mode 100644 index 0000000..8ab6ca5 --- /dev/null +++ b/tests/query-correctness-parity.test.ts @@ -0,0 +1,160 @@ +import { describe, expect, it } from "vitest"; +import { column, table, type RowForSchema } from "../src"; + +const schema = { + id: column.uint32(), + age: column.uint8(), + score: column.uint32(), + status: column.dictionary(["active", "passive", "archived"] as const), + active: column.boolean(), +}; + +type User = RowForSchema; + +function seedRows(count = 80): User[] { + return Array.from({ length: count }, (_unused, id) => ({ + id, + age: (id * 7) % 90, + score: (id * 13) % 1_000, + status: id % 3 === 0 ? "active" : id % 3 === 1 ? "passive" : "archived", + active: id % 4 !== 0, + })); +} + +function createUsers(rows = seedRows()) { + const users = table(schema); + users.insertMany(rows); + return users; +} + +function ids(rows: readonly User[]): number[] { + return rows.map((row) => row.id); +} + +describe("query correctness parity", () => { + it("returns identical equality-index and scan results in logical row order", () => { + const scan = createUsers(); + const indexed = createUsers(); + indexed.createIndex("id").createIndex("status"); + + expect(indexed.where("status", "=", "active").toArray()).toEqual(scan.where("status", "=", "active").toArray()); + expect(indexed.where("id", "in", [3, 25, 72]).toArray()).toEqual(scan.where("id", "in", [3, 25, 72]).toArray()); + expect(indexed.where("status", "=", "archived").where("age", ">=", 40).toArray()).toEqual( + scan.where("status", "=", "archived").where("age", ">=", 40).toArray(), + ); + expect(indexed.where("id", "=", 999).toArray()).toEqual([]); + expect(ids(indexed.where("status", "=", "active").toArray())).toEqual(ids(seedRows().filter((row) => row.status === "active"))); + }); + + it("returns identical sorted-index and scan results for ranges", () => { + const scan = createUsers(); + const indexed = createUsers(); + indexed.createSortedIndex("age").createSortedIndex("score"); + + expect(indexed.where("age", ">", 70).toArray()).toEqual(scan.where("age", ">", 70).toArray()); + expect(indexed.where("age", ">=", 35).toArray()).toEqual(scan.where("age", ">=", 35).toArray()); + expect(indexed.where("score", "<", 100).toArray()).toEqual(scan.where("score", "<", 100).toArray()); + expect(indexed.where("score", "<=", 250).where("status", "!=", "archived").toArray()).toEqual( + scan.where("score", "<=", 250).where("status", "!=", "archived").toArray(), + ); + expect(indexed.where("age", ">", 250).toArray()).toEqual([]); + }); + + it("returns identical results with equality and sorted indexes together", () => { + const scan = createUsers(); + const indexed = createUsers(); + indexed.createIndex("id").createIndex("status").createSortedIndex("age").createSortedIndex("score"); + + const expected = scan.where("status", "=", "passive").where("age", ">=", 30).where("score", "<", 700).toArray(); + const actual = indexed.where("score", "<", 700).where("age", ">=", 30).where("status", "=", "passive").toArray(); + + expect(actual).toEqual(expected); + expect(ids(actual)).toEqual([...ids(actual)].sort((left, right) => left - right)); + }); + + it("keeps dirty indexes correct after updateMany, deleteMany, and insertMany", () => { + const rows = seedRows(); + const users = createUsers(rows); + users.createIndex("id").createIndex("status").createSortedIndex("age").createSortedIndex("score"); + + const oracle = rows.map((row) => ({ ...row })); + + const updateStatus = users.updateMany({ status: "passive" }, { status: "active" }); + for (const row of oracle) { + if (row.status === "passive") { + row.status = "active"; + } + } + expect(updateStatus.affectedRows).toBe(27); + expect(users.where("status", "=", "active").toArray()).toEqual(oracle.filter((row) => row.status === "active")); + + const updateAge = users.updateMany({ id: { in: [5, 12, 47] } }, { age: 88 }); + for (const row of oracle) { + if ([5, 12, 47].includes(row.id)) { + row.age = 88; + } + } + expect(updateAge.affectedRows).toBe(3); + expect(users.where("age", ">=", 80).toArray()).toEqual(oracle.filter((row) => row.age >= 80)); + + const deleted = users.deleteMany({ status: "archived", age: { lt: 50 } }); + for (let index = oracle.length - 1; index >= 0; index -= 1) { + if (oracle[index].status === "archived" && oracle[index].age < 50) { + oracle.splice(index, 1); + } + } + expect(deleted.affectedRows).toBeGreaterThan(0); + expect(users.where("status", "=", "archived").where("age", "<", 50).toArray()).toEqual([]); + expect(users.where("status", "=", "active").toArray()).toEqual(oracle.filter((row) => row.status === "active")); + + const inserted: User[] = [ + { id: 1_001, age: 21, score: 901, status: "active", active: true }, + { id: 1_002, age: 77, score: 902, status: "archived", active: false }, + ]; + users.insertMany(inserted); + oracle.push(...inserted); + + expect(users.where("id", "in", [1_001, 1_002]).toArray()).toEqual(inserted); + expect(users.where("age", ">=", 75).toArray()).toEqual(oracle.filter((row) => row.age >= 75)); + }); + + it("returns deterministic results across repeated equivalent queries", () => { + const users = createUsers(); + users.createIndex("status").createSortedIndex("age"); + + const query = () => users.where("status", "=", "active").where("age", ">=", 30).where("age", "<", 80).toArray(); + const first = query(); + + expect(query()).toEqual(first); + expect(query()).toEqual(first); + expect(users.where("age", "<", 80).where("status", "=", "active").where("age", ">=", 30).toArray()).toEqual(first); + }); + + it("keeps mutation and query sequences equal to a plain array oracle", () => { + const users = createUsers(seedRows(120)); + users.createIndex("status").createIndex("id").createSortedIndex("age"); + const oracle = seedRows(120).map((row) => ({ ...row })); + + users.updateMany({ status: "active", age: { gte: 40 } }, { score: 999 }); + for (const row of oracle) { + if (row.status === "active" && row.age >= 40) { + row.score = 999; + } + } + + users.deleteMany({ active: false, age: { lt: 30 } }); + for (let index = oracle.length - 1; index >= 0; index -= 1) { + if (!oracle[index].active && oracle[index].age < 30) { + oracle.splice(index, 1); + } + } + + users.insertMany([{ id: 500, age: 50, score: 500, status: "passive", active: true }]); + oracle.push({ id: 500, age: 50, score: 500, status: "passive", active: true }); + + const predicate = (row: User) => row.status !== "archived" && row.age >= 35 && [500, 999].includes(row.score); + expect(users.where("status", "not in", ["archived"]).where("age", ">=", 35).where("score", "in", [500, 999]).toArray()).toEqual( + oracle.filter(predicate), + ); + }); +}); diff --git a/tests/query-filter.test.ts b/tests/query-filter.test.ts index 7555eae..b760e19 100644 --- a/tests/query-filter.test.ts +++ b/tests/query-filter.test.ts @@ -36,6 +36,62 @@ describe("query filter", () => { expect(rows.map((row) => row.id)).toEqual([6, 8]); }); + it("runs callback filters after structured predicates when indexes exist", () => { + const users = usersFixture(20); + users.createIndex("status").createSortedIndex("age"); + const seenIds: number[] = []; + + const rows = users + .where({ status: "active", age: { gte: 10 } }) + .filter((row) => { + seenIds.push(row.id); + return row.id < 16; + }) + .toArray(); + + expect(seenIds).toEqual([10, 12, 14, 16, 18]); + expect(rows.map((row) => row.id)).toEqual([10, 12, 14]); + }); + + it("matches plain array filtering for structured predicates and callbacks", () => { + const users = usersFixture(30); + users.createIndex("status").createSortedIndex("age"); + const expected = users.toArray().filter((row) => row.status === "active" && row.age >= 8 && row.id % 4 === 0); + + expect(users.where({ status: "active", age: { gte: 8 } }).filter((row) => row.id % 4 === 0).toArray()).toEqual(expected); + }); + + it("applies filter callbacks before limit and offset windows", () => { + const users = usersFixture(20); + const expected = users + .toArray() + .filter((row) => row.age >= 5 && row.id % 2 === 1) + .slice(2, 5); + + expect(users.where({ age: { gte: 5 } }).filter((row) => row.id % 2 === 1).offset(2).limit(3).toArray()).toEqual(expected); + }); + + it("runs multiple callback filters in order", () => { + const users = usersFixture(10); + const firstSeen: number[] = []; + const secondSeen: number[] = []; + + const rows = users + .filter((row) => { + firstSeen.push(row.id); + return row.id >= 3; + }) + .filter((row) => { + secondSeen.push(row.id); + return row.id < 6; + }) + .toArray(); + + expect(firstSeen).toEqual([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]); + expect(secondSeen).toEqual([3, 4, 5, 6, 7, 8, 9]); + expect(rows.map((row) => row.id)).toEqual([3, 4, 5]); + }); + it("forces a full scan and skips index planning when callback filters are present", () => { const indexed = usersFixture(); indexed.createIndex("id"); diff --git a/tests/query-validation.test.ts b/tests/query-validation.test.ts index 026433b..a84fcca 100644 --- a/tests/query-validation.test.ts +++ b/tests/query-validation.test.ts @@ -35,6 +35,7 @@ describe("query validation", () => { it("validates whereIn and whereNotIn arrays", () => { const users = usersTable(); + expectCode(() => users.where("id", "in", []), "COLQL_TYPE_MISMATCH", /non-empty array/); expectCode(() => users.whereIn("status", []), "COLQL_TYPE_MISMATCH", /non-empty array/); expectCode(() => users.whereIn("status", ["active", "deleted" as "active"]), "COLQL_UNKNOWN_VALUE", /Invalid dictionary value/); expectCode(() => users.whereNotIn("age", [1, 999]), "COLQL_OUT_OF_RANGE", /uint8 integer/); @@ -49,6 +50,12 @@ describe("query validation", () => { expectCode(() => users.where({ status: { gt: "active" } } as never), "COLQL_INVALID_PREDICATE", /dictionary column "status"/); expectCode(() => users.where({ is_active: { lt: true } } as never), "COLQL_INVALID_PREDICATE", /boolean column "is_active"/); expectCode(() => users.where({ status: { in: [] } }), "COLQL_TYPE_MISMATCH", /non-empty array/); + expectCode(() => users.where({ age: undefined }), "COLQL_INVALID_PREDICATE", /at least one column condition/); + expectCode( + () => users.where({ age: { gt: 18 }, status: { between: ["active", "passive"] } } as never), + "COLQL_INVALID_PREDICATE", + /Invalid where predicate operator "between"/, + ); }); it("validates select, limit, offset, and get", () => { diff --git a/tests/query-where.test.ts b/tests/query-where.test.ts index 79f81ac..1808512 100644 --- a/tests/query-where.test.ts +++ b/tests/query-where.test.ts @@ -78,4 +78,19 @@ describe("query where", () => { expect(users.where("age", "in", [17, 30]).count()).toBe(2); expect(users.where("status", "not in", ["blocked"]).count()).toBe(3); }); + + it("supports mixed equality, range, and membership predicates", () => { + const users = usersFixture(); + + expect(users.where({ status: { in: ["active", "passive"] }, age: { gte: 18, lt: 40 }, is_active: true }).toArray()).toEqual([ + { id: 3, age: 30, status: "active", is_active: true }, + ]); + }); + + it("returns no rows for conflicting predicates", () => { + const users = usersFixture(); + + expect(users.where({ age: { gt: 40, lt: 18 } }).toArray()).toEqual([]); + expect(users.where("status", "=", "active").where("status", "=", "blocked").toArray()).toEqual([]); + }); }); diff --git a/tests/randomized-correctness.test.ts b/tests/randomized-correctness.test.ts new file mode 100644 index 0000000..346ecf3 --- /dev/null +++ b/tests/randomized-correctness.test.ts @@ -0,0 +1,141 @@ +import { describe, expect, it } from "vitest"; +import { column, table, type RowForSchema } from "../src"; +import type { Query } from "../src/query"; + +const schema = { + id: column.uint32(), + age: column.uint8(), + score: column.uint32(), + status: column.dictionary(["active", "passive", "archived"] as const), + active: column.boolean(), +}; + +type User = RowForSchema; +type UserQuery = Query; +type PredicateCase = { + readonly apply: (query: UserQuery) => UserQuery; + readonly test: (row: User) => boolean; +}; + +function rng(seed: number): () => number { + let state = seed >>> 0; + return () => { + state = (state * 1_664_525 + 1_013_904_223) >>> 0; + return state / 0x1_0000_0000; + }; +} + +function integer(random: () => number, maxExclusive: number): number { + return Math.floor(random() * maxExclusive); +} + +function sampleRows(random: () => number, count: number): User[] { + const statuses: User["status"][] = ["active", "passive", "archived"]; + return Array.from({ length: count }, (_unused, index) => ({ + id: index, + age: integer(random, 100), + score: integer(random, 1_000), + status: statuses[integer(random, statuses.length)], + active: integer(random, 2) === 0, + })); +} + +function createUsers(rows: readonly User[], mode: "scan" | "equality" | "sorted" | "both") { + const users = table(schema); + users.insertMany(rows); + + if (mode === "equality" || mode === "both") { + users.createIndex("id").createIndex("age").createIndex("score").createIndex("status"); + } + + if (mode === "sorted" || mode === "both") { + users.createSortedIndex("age").createSortedIndex("score"); + } + + return users; +} + +function randomPredicate(random: () => number): PredicateCase { + const statuses: User["status"][] = ["active", "passive", "archived"]; + const choice = integer(random, 8); + + switch (choice) { + case 0: { + const id = integer(random, 220); + return { + apply: (query) => query.where("id", "=", id), + test: (row) => row.id === id, + }; + } + case 1: { + const status = statuses[integer(random, statuses.length)]; + return { + apply: (query) => query.where("status", "=", status), + test: (row) => row.status === status, + }; + } + case 2: { + const age = integer(random, 110); + return { + apply: (query) => query.where("age", ">=", age), + test: (row) => row.age >= age, + }; + } + case 3: { + const age = integer(random, 110); + return { + apply: (query) => query.where("age", "<", age), + test: (row) => row.age < age, + }; + } + case 4: { + const score = integer(random, 1_100); + return { + apply: (query) => query.where("score", ">", score), + test: (row) => row.score > score, + }; + } + case 5: { + const values = [integer(random, 100), integer(random, 100), integer(random, 100)]; + return { + apply: (query) => query.where("age", "in", values), + test: (row) => values.includes(row.age), + }; + } + case 6: { + const first = statuses[integer(random, statuses.length)]; + const second = statuses[integer(random, statuses.length)]; + const values = [...new Set([first, second])]; + return { + apply: (query) => query.where("status", "in", values), + test: (row) => values.includes(row.status), + }; + } + default: { + const active = integer(random, 2) === 0; + return { + apply: (query) => query.where("active", "=", active), + test: (row) => row.active === active, + }; + } + } +} + +describe("randomized query correctness", () => { + it("matches plain array filtering across scan and index modes", () => { + const random = rng(0xC01C); + const rows = sampleRows(random, 160); + + for (let iteration = 0; iteration < 100; iteration += 1) { + const predicateCount = 1 + integer(random, 3); + const predicates = Array.from({ length: predicateCount }, () => randomPredicate(random)); + const expected = rows.filter((row) => predicates.every((predicate) => predicate.test(row))); + + for (const mode of ["scan", "equality", "sorted", "both"] as const) { + const users = createUsers(rows, mode); + const query = predicates.reduce((next, predicate) => predicate.apply(next), users.query()); + expect(query.toArray()).toEqual(expected); + } + } + }); +}); diff --git a/tests/serialization.test.ts b/tests/serialization.test.ts index be0c67b..f731d86 100644 --- a/tests/serialization.test.ts +++ b/tests/serialization.test.ts @@ -95,4 +95,51 @@ describe("serialization", () => { expect(() => table.deserialize(patched)).toThrow(/Unsupported ColQL/); }); + + it("keeps query parity before and after deserialization with recreated indexes", () => { + const users = table({ + id: column.uint32(), + age: column.uint8(), + score: column.uint32(), + status: column.dictionary(["active", "passive", "archived"] as const), + active: column.boolean(), + }); + + for (let id = 0; id < 60; id += 1) { + users.insert({ + id, + age: (id * 5) % 80, + score: id * 10, + status: id % 3 === 0 ? "active" : id % 3 === 1 ? "passive" : "archived", + active: id % 2 === 0, + }); + } + + users.createIndex("status").createSortedIndex("age"); + users.updateMany({ status: "passive" }, { score: 777 }); + users.deleteMany({ active: false, age: { lt: 20 } }); + users.insertMany([ + { id: 101, age: 33, score: 500, status: "active", active: true }, + { id: 102, age: 72, score: 800, status: "archived", active: false }, + ]); + + const expectedRows = users.toArray(); + const expectedQuery = users.where({ status: { in: ["active", "archived"] }, age: { gte: 30, lt: 75 } }).toArray(); + const expectedScoreQuery = users.where("score", "=", 777).toArray(); + const restored = table.deserialize(users.serialize()); + + expect(restored.toArray()).toEqual(expectedRows); + expect(restored.indexes()).toEqual([]); + expect(restored.sortedIndexes()).toEqual([]); + expect(restored.where({ status: { in: ["active", "archived"] }, age: { gte: 30, lt: 75 } }).toArray()).toEqual(expectedQuery); + + restored.createIndex("status"); + expect(restored.where({ status: { in: ["active", "archived"] }, age: { gte: 30, lt: 75 } }).toArray()).toEqual(expectedQuery); + + restored.createSortedIndex("age"); + expect(restored.where({ status: { in: ["active", "archived"] }, age: { gte: 30, lt: 75 } }).toArray()).toEqual(expectedQuery); + + restored.createIndex("score"); + expect(restored.where("score", "=", 777).toArray()).toEqual(expectedScoreQuery); + }); });