12 changes: 9 additions & 3 deletions .github/workflows/test-eql.yml
@@ -120,9 +120,15 @@ jobs:
- name: Regenerate and verify the matrix test-name inventory
run: |
mise run test:matrix:inventory
git diff --exit-code -- tests/sqlx/snapshots/int4_matrix_tests.txt \
tests/sqlx/snapshots/int2_matrix_tests.txt \
|| { echo "Coverage inventory stale — run 'mise run test:matrix:inventory' and commit."; exit 1; }
# `git add -N` registers any brand-new untracked snapshot (e.g. a new
# scalar type whose baseline was never committed) so a forgotten
# commit also trips the diff — `git diff --exit-code` ignores wholly
# untracked files otherwise. Diff the whole snapshots/ directory so no
# per-type file is hardcoded here; the mise task already enumerates the
# type set from the manifests and fails on a missing/stale snapshot.
git add -N tests/sqlx/snapshots
git diff --exit-code -- tests/sqlx/snapshots \
|| { echo "Coverage inventory stale or uncommitted — run 'mise run test:matrix:inventory' and commit tests/sqlx/snapshots."; exit 1; }

test:
name: "Test & Validate EQL (Postgres ${{ matrix.postgres-version }})"
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -24,6 +24,7 @@ Each entry that ships in a published release links to the PR that introduced it.

- **`eql_v3` encrypted-domain schema, with the `int4` family as its first member.** Encrypted-domain type families now live in a new, additional `eql_v3` schema (the existing `eql_v2` schema is unchanged — it keeps the core types/operators and stays the documented public API). Four jsonb-backed domains for encrypted `int4` columns: `eql_v3.int4` (storage-only), `eql_v3.int4_eq` (`=` / `<>` via HMAC), and `eql_v3.int4_ord` / `eql_v3.int4_ord_ore` (also `<` `<=` `>` `>=` via ORE block terms). Supported comparisons resolve to inlinable wrappers; the native `jsonb` operator surface reachable through domain fallback is blocked (raises rather than silently mis-resolving). Each domain's `CHECK` requires the EQL envelope (`v`, `i`), the ciphertext (`c`), and the variant's index term(s), and pins the payload version (`VALUE->>'v' = '2'`, matching `eql_v2._encrypted_check_v`) — so a missing key or wrong-version payload is rejected on insert or cast rather than surfacing later at query time. Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors, not an operator class on the domain. The extractors still return the core `eql_v2.hmac_256` / `eql_v2.ore_block_u64_8_256` index-term types, which remain in `eql_v2` and are referenced cross-schema. Why: a type-safe, per-capability encrypted integer column instead of the untyped `eql_v2_encrypted`, namespaced under its own schema. This is the reference scalar implementation for the generated domain family. ([#239](https://github.com/cipherstash/encrypt-query-language/pull/239), supersedes [#225](https://github.com/cipherstash/encrypt-query-language/pull/225))
- **`eql_v3.int2` encrypted-domain type family.** Four jsonb-backed domains for encrypted `int2` columns — `eql_v3.int2` (storage-only), `eql_v3.int2_eq` (`=` / `<>` via HMAC), and `eql_v3.int2_ord` / `eql_v3.int2_ord_ore` (also `<` `<=` `>` `>=` via ORE block terms, with `MIN` / `MAX` aggregates) — generated from `tasks/codegen/types/int2.toml` by the same materializer as the `eql_v3.int4` reference. Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors, not an operator class on the domain. Why: a type-safe, per-capability encrypted `smallint` column, proving the scalar generator generalizes beyond the `int4` reference. ([#243](https://github.com/cipherstash/encrypt-query-language/pull/243))
- **`eql_v3.int8` encrypted-domain type family.** Four jsonb-backed domains for encrypted `int8` (Postgres `bigint` / Rust `i64`) columns — `eql_v3.int8` (storage-only), `eql_v3.int8_eq` (`=` / `<>` via HMAC), and `eql_v3.int8_ord` / `eql_v3.int8_ord_ore` (also `<` `<=` `>` `>=` via ORE block terms, with `MIN` / `MAX` aggregates) — generated from `tasks/codegen/types/int8.toml` by the same materializer as the `eql_v3.int4` reference. Index via a functional index on the `eql_v3.eq_term` / `eql_v3.ord_term` extractors, not an operator class on the domain. Why: a type-safe, per-capability encrypted 64-bit integer column for callers whose values exceed the `int4` range. ([#244](https://github.com/cipherstash/encrypt-query-language/pull/244))
- **Per-domain `MIN` / `MAX` aggregates for the encrypted-domain family.** `eql_v3.min(eql_v3.<T>_ord)` / `eql_v3.max(eql_v3.<T>_ord)` (and the `_ord_ore` twin) are generated for every ord-capable scalar variant, giving type-safe extrema on domain-typed columns — comparison routes through the variant's `<` / `>` operator (ORE block term, no decryption). The aggregates are declared `PARALLEL = SAFE` with a combine function (the state function itself — min/max are associative), so PostgreSQL can use partial/parallel aggregation on large `GROUP BY` workloads. Why: the new domain types previously had no equivalent of the composite-type aggregates. The existing `eql_v2.min(eql_v2_encrypted)` / `eql_v2.max(eql_v2_encrypted)` aggregates are **retained** and continue to work on `eql_v2_encrypted` columns; the per-domain aggregates are additive and coexist with them. ([#239](https://github.com/cipherstash/encrypt-query-language/pull/239))

## [2.3.1] — 2026-05-21
63 changes: 44 additions & 19 deletions docs/reference/encrypted-domain-implementation-spec.md
@@ -92,12 +92,38 @@ future migration.
by the CI staleness check (`mise run codegen:domain <T>` + `git diff
--exit-code`) and the `<T>` cases in `tasks/codegen/test_scalars.py`, and
the `ordered_numeric_matrix!` SQLx suite (behaviour, not bytes).
- [ ] Wire the SQLx matrix oracle. The generated SQL is enough to install the
domains, but the `ordered_numeric_matrix!` suite only runs once the Rust
harness knows about the scalar. Copy each piece from the `int4`/`int8`
reference — six files, each a small registration:

| File | Add |
|------|-----|
| `tests/sqlx/src/fixtures/eql_plaintext.rs` | A sealed `EqlPlaintext` impl for the scalar's Rust type: `impl Sealed for <R> {}`, a `PlaintextSqlType` const for its base column type, `impl EqlPlaintext for <R>` (`CAST`, `PLAINTEXT_SQL_TYPE`, `to_plaintext` → the right `Plaintext` variant), plus the two `#[test]` casts. |
| `tests/sqlx/src/fixtures/eql_v2_<T>.rs` | `crate::scalar_fixture!("eql_v2_<T>", <R>, VALUES);` (pulls `super::<T>_values::VALUES`). |
| `tests/sqlx/src/fixtures/mod.rs` | `pub mod <T>_values;` and `pub mod eql_v2_<T>;`. |
| `tests/sqlx/src/scalar_domains.rs` | `impl ScalarType for <R>` — `PG_TYPE` (the base PG type, e.g. `"int8"`) and `FIXTURE_VALUES = crate::fixtures::<T>_values::VALUES`. |
| `tests/sqlx/tests/encrypted_domain/scalars/<T>.rs` | `ordered_numeric_matrix! { suite = <T>, scalar = <R>, eql_type = "eql_v2_<T>" }`. |
| `tests/sqlx/tests/encrypted_domain/scalars/mod.rs` | `pub mod <T>;`. |

`<R>` is the scalar's Rust type (`i32` for `int4`, `i64` for `int8`). The
two `mod.rs` declarations and the `ScalarType` / `EqlPlaintext` impls are
hand-maintained registration lists: forget one and the matrix simply does
not run for the type (the inventory snapshot in the next step is the guard
that surfaces it).
- [ ] Run `mise run test:matrix:inventory` and commit the regenerated
`tests/sqlx/snapshots/<T>_matrix_tests.txt` — the sorted inventory of every
`scalars::<T>::*` test name in the `encrypted_domain` binary. CI diffs it
(same as `<T>_values.rs`); a stale snapshot fails the `matrix-coverage`
job with "Coverage inventory stale". This baseline is what catches a
silently dropped, renamed, or `#[cfg]`-gated matrix test. See §8.
`scalars::<T>::*` test name in the `encrypted_domain` binary. **You do not
edit `mise.toml` or `.github/workflows/test-eql.yml` for this** (#249): the
task enumerates every manifest with a `[fixture]` table and the CI job
diffs the whole `snapshots/` directory, so authoring the `<T>.toml`
manifest is enough for the new snapshot to be generated and gated. The
task fails if a `[fixture]` manifest produces no `scalars::<T>::*` tests
(oracle not wired — see the previous step) or if a snapshot has no manifest
(stale, from a removed type). A stale or uncommitted snapshot fails the CI
`matrix-coverage` job with "Coverage inventory stale or uncommitted". This
baseline is what catches a silently dropped, renamed, or `#[cfg]`-gated
matrix test. See §8 and `tests/sqlx/snapshots/README.md`.
- [ ] Run `mise run test:codegen`, the relevant SQLx suites, and the
PostgreSQL matrix before merging.
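
The manifest-to-snapshot reconciliation described in the inventory step can be sketched in Python. This is an illustrative sketch only, not the task's implementation: the real task is the shell script in `mise.toml`, which also regenerates each snapshot and rejects empty ones. The function names `fixture_tokens`, `snapshot_tokens`, and `reconcile` are hypothetical.

```python
import re
from pathlib import Path

# Mirrors the task's `grep -qE '^\[fixture\]'` test on each manifest.
FIXTURE_HEADER = re.compile(r"^\[fixture\]", re.MULTILINE)
SNAPSHOT_SUFFIX = "_matrix_tests.txt"


def fixture_tokens(types_dir: Path) -> set[str]:
    """Type tokens whose manifest declares a [fixture] table."""
    return {
        path.stem
        for path in types_dir.glob("*.toml")
        if FIXTURE_HEADER.search(path.read_text())
    }


def snapshot_tokens(snapshots_dir: Path) -> set[str]:
    """Type tokens that have a committed inventory snapshot."""
    return {
        path.name[: -len(SNAPSHOT_SUFFIX)]
        for path in snapshots_dir.glob("*" + SNAPSHOT_SUFFIX)
    }


def reconcile(types_dir: Path, snapshots_dir: Path) -> tuple[set[str], set[str]]:
    """Return (tokens missing a snapshot, tokens with a stale snapshot)."""
    manifests = fixture_tokens(types_dir)
    snapshots = snapshot_tokens(snapshots_dir)
    return manifests - snapshots, snapshots - manifests
```

Either returned set being non-empty corresponds to a failure: a manifest without a snapshot means the inventory was never generated or committed, and a snapshot without a manifest is a leftover from a removed type.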

Expand Down Expand Up @@ -283,21 +309,20 @@ the catalog does not promise.

### Matrix coverage inventory snapshot

The *set of test names* the matrix emits is itself guarded. `mise run
test:matrix:inventory` lists every test in the `encrypted_domain` binary
under a pinned feature set (`--no-default-features`, which deliberately
excludes the `scale` arm — see the task comment in `mise.toml`), greps it to
each `scalars::<T>::*` matrix, `LC_ALL=C sort`s for byte-stable ordering, and
writes one committed snapshot per scalar at
`tests/sqlx/snapshots/<T>_matrix_tests.txt`. The CI `matrix-coverage` job
regenerates with the same feature set and `git diff --exit-code`s every
snapshot; a divergence fails with "Coverage inventory stale". This is the
guard that catches a silently dropped, renamed, or `#[cfg]`-gated matrix
test — a behaviour the SQLx assertions above cannot see, because a deleted
test simply stops running. When you add a scalar you add a new snapshot;
when you add or remove matrix tests you regenerate and commit the affected
snapshot in the same change. The files are a committed test baseline, **not**
gitignored generated SQL. See `tests/sqlx/snapshots/README.md`.
The *set of test names* the matrix emits is itself guarded by one committed
snapshot per scalar at `tests/sqlx/snapshots/<T>_matrix_tests.txt` — the sorted
inventory of every `scalars::<T>::*` test name. This is the guard that catches a
silently dropped, renamed, or `#[cfg]`-gated matrix test, a behaviour the SQLx
assertions above cannot see (a deleted test simply stops running). The snapshots
are a committed test baseline, **not** gitignored generated SQL.

`mise run test:matrix:inventory` regenerates them and the CI `matrix-coverage`
job gates them; both enumerate the scalar set from the `[fixture]` manifests in
`tasks/codegen/types/` rather than a hand-maintained list, so adding `<T>.toml`
is enough — no task or workflow edit. **`tests/sqlx/snapshots/README.md` is the
source of truth** for the mechanics (pinned feature set, the manifest⇄snapshot
reconciliation, the CI diff, and when to regenerate); see it rather than
duplicating the detail here.
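
As a rough illustration of how one snapshot is derived from the test listing, the `sed` / `grep` / `LC_ALL=C sort` pipeline in the task can be mirrored in Python. This is a sketch under stated assumptions: it assumes the `name: test` line format of `cargo test -- --list` that the task's `sed` expression relies on, and the function name `matrix_inventory` is hypothetical.

```python
def matrix_inventory(listing: str, token: str) -> list[str]:
    """Return the byte-sorted `scalars::<token>::*` test names in a listing."""
    suffix = ": test"
    prefix = f"scalars::{token}::"

    # Equivalent of `sed -n 's/: test$//p'`: keep only lines that end in
    # ": test" and strip that suffix.
    names = [
        line[: -len(suffix)]
        for line in listing.splitlines()
        if line.endswith(suffix)
    ]

    # Equivalent of `grep "^scalars::<token>::"`: scope to one scalar's matrix.
    # The trailing "::" keeps `int4` from also matching a longer token.
    selected = [name for name in names if name.startswith(prefix)]

    # Equivalent of `LC_ALL=C sort`: compare raw bytes, independent of locale.
    return sorted(selected, key=lambda name: name.encode("utf-8"))
```

Because the filter keys on the `scalars::<T>::` prefix, tests added under `family::*` never change a scalar's snapshot, and the byte-wise sort keeps the file stable across developer machines and CI.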

## 9. Fixtures

70 changes: 57 additions & 13 deletions mise.toml
@@ -99,27 +99,71 @@ mise exec python -- python -m pytest tasks/codegen -q
"""

[tasks."test:matrix:inventory"]
description = "Regenerate the int4/int2 matrix test-name inventory snapshots (no database required)"
description = "Regenerate the per-scalar matrix test-name inventory snapshots from type manifests (no database required)"
dir = "{{config_root}}/tests/sqlx"
run = """
# Regenerate one committed snapshot per scalar encrypted-domain type, enumerated
# dynamically from the type manifests — the SAME source of truth that
# `fixture:generate:all` and `codegen:domain:all` use. Adding a new scalar type
# (a new <T>.toml with a [fixture] table) is picked up automatically; this task
# never hand-lists a type. The manifests live at the repo root, but this task
# runs in tests/sqlx (dir = .../tests/sqlx), so the glob is ../../tasks/...
#
# Pin an explicit feature set so the inventory is deterministic regardless of
# the caller's local flags. `--no-default-features` keeps the `scale` arm
# (`#[cfg(feature = "scale")]`) excluded — its add/delete is a known blind spot
# of this default-feature inventory, covered instead by the scale gate + the
# family::mutations negative controls. `--list` enumerates the whole
# encrypted_domain binary (family::support, family::inlinability,
# family::mutations, scalars::int4, scalars::int2); the per-scalar `grep`
# scopes each snapshot to that matrix only, so landing other family tests
# never dirties it. `LC_ALL=C sort` makes ordering byte-stable across locales
# (a bare `sort` is locale-dependent and yields spurious CI diffs).
# family::mutations, scalars::<T> for each scalar); the per-scalar `grep` scopes
# each snapshot to that matrix only, so landing other family tests never dirties
# it. `LC_ALL=C sort` makes ordering byte-stable across locales (a bare `sort`
# is locale-dependent and yields spurious CI diffs). `--list` is run ONCE and
# reused — its output is deterministic, so the single invocation is faithful and
# avoids three redundant cargo passes.
set -euo pipefail
mkdir -p snapshots
cargo test --no-default-features --test encrypted_domain -- --list |
sed -n 's/: test$//p' |
grep '^scalars::int4' |
LC_ALL=C sort > snapshots/int4_matrix_tests.txt
cargo test --no-default-features --test encrypted_domain -- --list |
sed -n 's/: test$//p' |
grep '^scalars::int2' |
LC_ALL=C sort > snapshots/int2_matrix_tests.txt

# One enumeration pass for the whole binary, reused per type below.
listing=$(cargo test --no-default-features --test encrypted_domain -- --list | sed -n 's/: test$//p')

generated=0
for manifest in ../../tasks/codegen/types/*.toml; do
# Guard the no-match case (glob stays literal under POSIX sh).
[ -e "$manifest" ] || continue
# Only types that declare a [fixture] table participate in the matrix suite.
grep -qE '^\\[fixture\\]' "$manifest" || continue
token=$(basename "$manifest" .toml)
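# `grep` exits 1 when nothing matches; under `set -e -o pipefail` that would
# abort the task here, before the empty-snapshot check below can report why.
printf '%s\\n' "$listing" |
{ grep "^scalars::${token}::" || true; } |
LC_ALL=C sort > "snapshots/${token}_matrix_tests.txt"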
# A manifest with a [fixture] table MUST have matching matrix tests. An empty
# snapshot means the scalars::<token> module is missing or mis-named — fail
# loudly rather than committing an empty baseline.
if [ ! -s "snapshots/${token}_matrix_tests.txt" ]; then
echo "No 'scalars::${token}::*' tests found for manifest ${manifest} — matrix suite missing or mis-named." >&2
exit 1
fi
generated=$((generated + 1))
done

if [ "$generated" -eq 0 ]; then
echo "No scalar manifests with a [fixture] table found in ../../tasks/codegen/types/" >&2
exit 1
fi

# Reconcile the other direction: every committed snapshot must map to a manifest
# with a [fixture] table. A stale snapshot left behind when a scalar type is
# removed would otherwise silently survive (git diff sees no change to it).
for snapshot in snapshots/*_matrix_tests.txt; do
[ -e "$snapshot" ] || continue
token=$(basename "$snapshot" _matrix_tests.txt)
manifest="../../tasks/codegen/types/${token}.toml"
if [ ! -f "$manifest" ] || ! grep -qE '^\\[fixture\\]' "$manifest"; then
echo "Stale snapshot ${snapshot}: no type manifest with a [fixture] table at ${manifest}. Remove the snapshot or restore the manifest." >&2
exit 1
fi
done

echo "Regenerated ${generated} matrix inventory snapshot(s)."
"""
9 changes: 9 additions & 0 deletions tasks/codegen/scalars.py
@@ -88,6 +88,15 @@ def render_literal(self, value: str) -> str:
min_value=-32768,
max_value=32767,
),
"int8": ScalarKind(
token="int8",
rust_type="i64",
min_symbol="i64::MIN",
max_symbol="i64::MAX",
zero_symbol="0",
min_value=-9223372036854775808,
max_value=9223372036854775807,
),
}


18 changes: 18 additions & 0 deletions tasks/codegen/test_scalars.py
@@ -80,3 +80,21 @@ def test_int2_kind_rejects_out_of_range():
kind = require_scalar("int2")
with pytest.raises(ScalarError, match="out of range"):
kind.numeric_value("40000")


def test_int8_kind_resolves_and_renders():
kind = require_scalar("int8")
assert kind.rust_type == "i64"
assert kind.numeric_value("MIN") == -9223372036854775808
assert kind.numeric_value("MAX") == 9223372036854775807
assert kind.numeric_value("ZERO") == 0
assert kind.render_literal("MIN") == "i64::MIN"
assert kind.render_literal("MAX") == "i64::MAX"
assert kind.render_literal("ZERO") == "0"
assert kind.render_literal("5000000000") == "5000000000"


def test_int8_kind_rejects_out_of_range():
kind = require_scalar("int8")
with pytest.raises(ScalarError, match="out of range"):
kind.numeric_value("9223372036854775808") # i64::MAX + 1
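
The behaviour these `int8` tests pin down can be summarised with a small stand-in. This is a minimal sketch, assuming sentinel names map to the type's bounds and any other string is parsed as a base-10 integer; `ScalarKindSketch` and the `ScalarError` defined here are hypothetical stand-ins, not the classes in `tasks/codegen/scalars.py`.

```python
from dataclasses import dataclass


class ScalarError(ValueError):
    """Stand-in for the error type the tests expect (hypothetical)."""


@dataclass(frozen=True)
class ScalarKindSketch:
    """Illustrative stand-in for a scalar kind; not the repo's ScalarKind."""

    rust_type: str
    min_symbol: str
    max_symbol: str
    zero_symbol: str
    min_value: int
    max_value: int

    def numeric_value(self, value: str) -> int:
        """Resolve a sentinel or decimal literal to an in-range integer."""
        sentinels = {"MIN": self.min_value, "MAX": self.max_value, "ZERO": 0}
        if value in sentinels:
            return sentinels[value]
        number = int(value)
        if not self.min_value <= number <= self.max_value:
            raise ScalarError(f"{value} out of range for {self.rust_type}")
        return number

    def render_literal(self, value: str) -> str:
        """Render a sentinel as its Rust symbol, or a literal as a decimal."""
        symbols = {
            "MIN": self.min_symbol,
            "MAX": self.max_symbol,
            "ZERO": self.zero_symbol,
        }
        if value in symbols:
            return symbols[value]
        return str(self.numeric_value(value))


# Bounds copied from the `int8` entry in the diff above.
INT8 = ScalarKindSketch(
    rust_type="i64",
    min_symbol="i64::MIN",
    max_symbol="i64::MAX",
    zero_symbol="0",
    min_value=-9223372036854775808,
    max_value=9223372036854775807,
)
```

Python integers are arbitrary precision, so the `i64::MAX + 1` case is rejected by the explicit range check rather than by overflow.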
6 changes: 3 additions & 3 deletions tasks/codegen/test_spec.py
@@ -199,10 +199,10 @@ def test_fixture_values_reject_sentinel_literal_alias(tmp_path):
def test_fixture_for_unknown_scalar_token_raises(tmp_path):
bad = textwrap.dedent("""
[domain]
int8 = []
bogus = []

[fixture]
values = ["1"]
""")
with pytest.raises(SpecError, match="unknown scalar token 'int8'"):
load_spec(write(tmp_path, "int8.toml", bad))
with pytest.raises(SpecError, match="unknown scalar token 'bogus'"):
load_spec(write(tmp_path, "bogus.toml", bad))
20 changes: 20 additions & 0 deletions tasks/codegen/types/int8.toml
@@ -0,0 +1,20 @@
# Encrypted-domain scalar manifest for int8.
# The filename supplies the type token. Each domain lists the index terms
# it carries; term capabilities are fixed in tasks/codegen/terms.py.

[domain]
int8 = []
int8_eq = ["hm"]
int8_ord_ore = ["ore"]
int8_ord = ["ore"]

# Single source of truth for the int8 fixture plaintext list. Drives the
# generated tests/sqlx/src/fixtures/int8_values.rs const, shared by the fixture
# generator and the matrix oracle. Sentinels MIN/MAX/ZERO map to i64 named
# consts; the set MUST include MIN, MAX, and zero (matrix comparison pivots).
# Includes values outside the int4 range (|x| > 2^31) to exercise 64-bit width.
[fixture]
values = [
"MIN", "-5000000000", "-100", "-1", "ZERO", "1", "2", "5", "10", "17", "25",
"42", "50", "100", "250", "1000", "9999", "5000000000", "MAX",
]