2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -32,6 +32,7 @@ Targeting `2.3.0` as a breaking release. Customers re-encrypt their data as part
### Changed

- **`=`, `<>`, `~~` (`LIKE`), `~~*` (`ILIKE`) on `eql_v2_encrypted` are now inlinable SQL functions.** The planner can structurally match these operators against the documented functional indexes (`eql_v2.hmac_256(col)` for equality, `eql_v2.bloom_filter(col)` for `LIKE`/`ILIKE`), so bare-form queries (`WHERE col = $1`) engage the index without per-query rewriting. Previously these operators wrapped multi-branch PL/pgSQL bodies that the planner could not inline, forcing seq scans on Supabase / managed Postgres installations that lack operator-class indexes. ([#193](https://github.com/cipherstash/encrypt-query-language/pull/193), [#196](https://github.com/cipherstash/encrypt-query-language/pull/196))
- **`<`, `<=`, `>`, `>=` on `eql_v2_encrypted` are now inlinable SQL functions.** Same precedent as the `=` inlining above: the operator bodies reduce to `eql_v2.ore_block_u64_8_256(a) <op> eql_v2.ore_block_u64_8_256(b)`, so bare-form range queries (`WHERE col < $1`, `WHERE col > $1`, …) structurally match a functional btree index on `eql_v2.ore_block_u64_8_256(col)` (using the existing `eql_v2.ore_block_u64_8_256_operator_class`). Top-N sorts under `ORDER BY col LIMIT n` still need a Sort node (the natural-form sort key doesn't syntactically match the index expression), but each comparison now uses the inlined ORE-term path rather than a plpgsql `eql_v2.compare()` dispatch. The inner `eql_v2.ore_block_u64_8_256_{eq,neq,lt,lte,gt,gte}` helpers backing the ORE-term type's own operators are now declared `IMMUTABLE STRICT PARALLEL SAFE` and allowlisted in the post-build search-path pin so that the chain inlines cleanly through to index matching. **Behaviour to be aware of:** range queries against columns that carry only `ore_cllw_u64_8` / `ore_cllw_var_8` (CLLW ORE) or OPE terms now raise from the `ore_block_u64_8_256` extractor instead of dispatching through the old `eql_v2.compare()` priority list. Callers in that situation must rewrite to the relevant extractor form (e.g. `WHERE eql_v2.ore_cllw_u64_8(col) < eql_v2.ore_cllw_u64_8($1::jsonb)`) — see [U-005](docs/upgrading/v2.3.md#u-005-range-operators-are-block-ore-only).
- **`eql_v2.hmac_256(val jsonb)` and `eql_v2.hmac_256(val eql_v2_encrypted)` are now inlinable SQL.** Both 1-arg overloads flipped from plpgsql-with-RAISE to single-statement SQL returning NULL when `hm` is absent. This restores per-row extractor inlining inside the `=` / `<>` operator bodies. **Behaviour to be aware of:** `WHERE col = $1` on a column lacking `hm` now silently returns zero rows where it previously raised — see the amended [U-002](docs/upgrading/v2.3.md#u-002-equality-and-hashing-require-hmac). The loud RAISE-on-missing-hm path is retained in `eql_v2.hash_encrypted`, so `GROUP BY` / `DISTINCT` / hash joins still surface misconfiguration. ([#205](https://github.com/cipherstash/encrypt-query-language/issues/205))
- **`eql_v2_encrypted = eql_v2_encrypted` is now strictly hmac-based at the root.** Equality requires both sides to carry `hm` (hmac); otherwise the operator returns NULL (and the query returns zero rows). Previously, equality could silently fall through to a `NULL` comparison or to Blake3 on synthetic fixtures. **Behaviour to be aware of:** see [U-002](docs/upgrading/v2.3.md#u-002-equality-and-hashing-require-hmac). ([#196](https://github.com/cipherstash/encrypt-query-language/pull/196), [#205](https://github.com/cipherstash/encrypt-query-language/issues/205))
- **`eql_v2.hash_encrypted(eql_v2_encrypted)` is now hmac-only.** Hash operations (`GROUP BY`, `DISTINCT`, hash joins) require the column to carry an `hm` index term; the previous Blake3 fallback has been removed. The function raises a clear error directing the caller to configure a `unique` index. ([#196](https://github.com/cipherstash/encrypt-query-language/pull/196))
@@ -46,6 +47,7 @@ Targeting `2.3.0` as a breaking release. Customers re-encrypt their data as part
### Deprecated

- **Operator-class indexes (`CREATE INDEX … (col eql_v2.encrypted_operator_class)`) are discouraged for the equality / `LIKE` query path.** They will continue to function for the lifetime of `2.x` and are not slated for removal in this minor release. Functional indexes (`eql_v2.hmac_256(col)`, `eql_v2.bloom_filter(col)`, `eql_v2.ste_vec(col)`) are now the canonical path because they (a) work on Supabase and managed Postgres without superuser, (b) avoid the btree row-size limit (`index row size N exceeds btree version 4 maximum 2704`) that opclass indexes hit on full-payload encryption, and (c) give the planner a structurally matchable extractor. The narrow exception is `ORDER BY` over Block ORE columns, where a custom comparator is strictly required; keep opclass indexes on those columns. See [U-001](docs/upgrading/v2.3.md#u-001-functional-indexes-as-the-canonical-recipe).
- **`eql_v2.lt`, `eql_v2.lte`, `eql_v2.gt`, `eql_v2.gte` are deprecated and slated for removal in EQL 3.0.** These plpgsql helpers used to back the `<` / `<=` / `>` / `>=` operators; after the range-operator inlining ([U-005](docs/upgrading/v2.3.md#u-005-range-operators-are-block-ore-only)) the operators bypass them entirely and inline an `ore_block_u64_8_256` comparison directly. The helpers still walk `eql_v2.compare`'s priority list (ore_block → ore_cllw_u64 → ore_cllw_var → ope), so on `ore_cllw_*` / OPE-only columns they will return a Boolean where the matching operator now raises — same name, divergent contract. Callers invoking them directly should switch to the operator form for `ore` columns, or to the relevant extractor form (e.g. `eql_v2.ore_cllw_u64_8(col) < eql_v2.ore_cllw_u64_8($1::jsonb)`) for `ore_cllw_*` / OPE columns. ([#211](https://github.com/cipherstash/encrypt-query-language/pull/211))

### Upgrade notes

30 changes: 26 additions & 4 deletions docs/reference/database-indexes.md
@@ -149,15 +149,37 @@ Bitmap Heap Scan on users

### Range Queries

When encrypted column has `ob` (ore_block_u64_8_256), `opf` (ope_cllw_u64_65), or `opv` (ope_cllw_var_8) index terms:
The canonical 2.3 recipe is a functional B-tree index over the `ob` (Block ORE) term:

```sql
CREATE INDEX events_encrypted_date_ore_idx
ON events (eql_v2.ore_block_u64_8_256(encrypted_date));
ANALYZE events;
```

The `eql_v2.ore_block_u64_8_256_operator_class` is `DEFAULT FOR TYPE`, so it's selected automatically — no explicit opclass annotation needed. The `<`, `<=`, `>`, `>=` operators on `eql_v2_encrypted` inline to `eql_v2.ore_block_u64_8_256(a) <op> eql_v2.ore_block_u64_8_256(b)`, which means natural-form range queries match the index without any rewriting:

```sql
SELECT * FROM events
-- Pre-change form of this query (removed lines in this diff; no LIMIT):
--   WHERE encrypted_date < $1::eql_v2_encrypted
--   ORDER BY encrypted_date DESC;
WHERE encrypted_date < $1::eql_v2_encrypted
ORDER BY encrypted_date DESC
LIMIT 10;
```

The encrypted operator class transparently dispatches to whichever ordered term is present on the column, so range queries against an `ore`-configured column and an `ope`-configured column have identical SQL.
**Index Scan vs. Top-N sort.** PostgreSQL uses the functional ORE index for the `WHERE` clause via structural match on the inlined predicate. The `ORDER BY` step, however, still needs a Sort node when the sort key is `encrypted_date` (the natural form) — Postgres only uses an index for `ORDER BY` when the sort key syntactically matches the index expression. With the operator inlining, each comparison in that Sort step now reduces to an inlined ORE-term comparison, so a `LIMIT n` Top-N sort is fast even without an index-ordered scan.

To skip the Sort step entirely, write the `ORDER BY` in extractor form:

```sql
SELECT * FROM events
WHERE encrypted_date < $1::eql_v2_encrypted
ORDER BY eql_v2.ore_block_u64_8_256(encrypted_date) DESC
LIMIT 10;
```

The sort key now matches the functional index expression, so the planner streams rows out of the index in order — a plain Index Scan, no separate Sort node.
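One way to compare the two `ORDER BY` forms is to inspect their plans side by side. The sketch below assumes PostgreSQL 16 or later, where `EXPLAIN (GENERIC_PLAN)` accepts `$1` placeholders, and reuses the `events` table and functional index from the examples above. The plan PostgreSQL actually picks depends on table size and statistics, so treat this as a way to check rather than a guaranteed result.

```sql
-- Natural-form sort key: per the text above, a Sort node is expected here.
EXPLAIN (GENERIC_PLAN, COSTS OFF)
SELECT * FROM events
WHERE encrypted_date < $1::eql_v2_encrypted
ORDER BY encrypted_date DESC
LIMIT 10;

-- Extractor-form sort key: matches the index expression, so the
-- planner is free to return rows in index order without a Sort node.
EXPLAIN (GENERIC_PLAN, COSTS OFF)
SELECT * FROM events
WHERE encrypted_date < $1::eql_v2_encrypted
ORDER BY eql_v2.ore_block_u64_8_256(encrypted_date) DESC
LIMIT 10;
```

On PostgreSQL versions before 16, the same comparison can be made with `PREPARE` followed by `EXPLAIN EXECUTE`, supplying a real encrypted payload as the argument.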

**Non-Block-ORE term types.** For columns carrying only `ore_cllw_u64_8`, `ore_cllw_var_8`, `opf` (OPE fixed-width), or `opv` (OPE variable-width) terms, the bare-form range operators (`<`, `<=`, `>`, `>=`) no longer dispatch through `eql_v2.compare()`; they go straight to the Block ORE extractor, which raises on a missing `ob`. Either migrate the column configuration to `ore` (Block ORE), or rewrite range queries to the matching extractor form, e.g. `WHERE eql_v2.ore_cllw_u64_8(col) < eql_v2.ore_cllw_u64_8($1::jsonb)`. See [U-005](../upgrading/v2.3.md#u-005-range-operators-are-block-ore-only) for the migration notes.
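As a rough sketch of the extractor-form path for a CLLW u64 column: the column name `encrypted_score` and the index name are hypothetical, and the example assumes the CLLW term type returned by `eql_v2.ore_cllw_u64_8` has a default B-tree operator class, which should be confirmed against your installed EQL version.

```sql
-- Hypothetical column and index names; substitute your own.
-- Assumes a default btree operator class exists for the CLLW term type.
CREATE INDEX events_encrypted_score_cllw_idx
  ON events (eql_v2.ore_cllw_u64_8(encrypted_score));
ANALYZE events;

-- The predicate repeats the indexed expression on the column side,
-- which is what lets the planner match it against the index.
SELECT * FROM events
WHERE eql_v2.ore_cllw_u64_8(encrypted_score) < eql_v2.ore_cllw_u64_8($1::jsonb);
```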

### GROUP BY

21 changes: 21 additions & 0 deletions docs/upgrading/v2.3.md
@@ -10,6 +10,7 @@ The `eql_v2` schema name, top-level type names, operator names, and the root-lev
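2. **Equality and hashing now require `hm` (hmac) on the column** ([U-002](#u-002-equality-and-hashing-require-hmac)). Previously, equality could silently fall through to a `NULL` comparison or to Blake3 on synthetic fixtures. Now equality on a column lacking `hm` returns NULL (so the query returns zero rows), while hash operations (`GROUP BY`, `DISTINCT`, hash joins) raise with a clear message.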
3. **Root-level `b3` (Blake3) is no longer consulted** ([U-003](#u-003-blake3-removed-at-root)). It was never emitted by `@cipherstash/protect` in production — only by test fixtures.
4. **ste_vec element equality term migrated from `b3` to `hm`; the entire `eql_v2.blake3` family is removed** ([U-004](#u-004-sv-element-equality-term-is-hm-not-b3)). A new `eql_v2.hmac_256(val, selector)` overload provides the canonical field-level equality extractor.
5. **Range operators (`<`, `<=`, `>`, `>=`) require `ob` (Block ORE) on the column** ([U-005](#u-005-range-operators-are-block-ore-only)). The operators are now inlinable SQL whose body extracts the `ob` term on both sides, so bare-form range queries engage a functional ORE index. Columns carrying only CLLW (`ore_cllw_u64_8` / `ore_cllw_var_8`) or OPE (`opf` / `opv`) terms must rewrite to the matching extractor form.
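For item 2, a quick audit can surface rows that would silently stop matching equality predicates. This is a minimal sketch that relies on the 2.3 behaviour of `eql_v2.hmac_256(col)` returning NULL when `hm` is absent, so it should be run after 2.3 is installed. The column name `encrypted_email` is hypothetical.

```sql
-- Hypothetical table and column names; substitute your own.
-- Counts non-NULL encrypted values that carry no hm term.
SELECT count(*) AS rows_missing_hm
FROM users
WHERE encrypted_email IS NOT NULL
  AND eql_v2.hmac_256(encrypted_email) IS NULL;
```

A non-zero count suggests the column needs a `unique` index configured and its data re-encrypted before equality queries can be relied on.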

## Compatibility

@@ -158,6 +159,26 @@ Test the migration on a staging copy before promoting. Cover both axes: run a ha

**Versioning note.** Per `CLAUDE.md`, payload-format changes typically warrant a major-version bump. The team has opted to land this as `2.3.0` because the public API surface (`eql_v2` schema name, top-level type names, operator names) is preserved — only the internal ste_vec payload shape and the Blake3 helpers change. Future readers comparing 2.3.0 against the project's versioning ladder should treat this as the documented exception.

### U-005: Range operators are Block ORE only

**What changed.** `<`, `<=`, `>`, `>=` on `eql_v2_encrypted` are now inlinable SQL functions whose bodies reduce to a direct comparison on the `ob` (Block ORE) term: `eql_v2.ore_block_u64_8_256(a) <op> eql_v2.ore_block_u64_8_256(b)`. The planner can structurally match `WHERE col < $1` against a functional btree index on `eql_v2.ore_block_u64_8_256(col)` (using the existing `eql_v2.ore_block_u64_8_256_operator_class`, which is `DEFAULT FOR TYPE`), so range queries engage that index without per-query rewriting.

The old plpgsql wrappers walked `eql_v2.compare()`'s priority list (Block ORE → CLLW u64 → CLLW var → OPE → hmac → literal fallback). After this change, `<` / `<=` / `>` / `>=` no longer consult that priority list — they go straight to the `ob` extractor, which raises with `Expected an ore index (ob) value in json: …` if the column doesn't carry one.

**Why.** Same precedent as U-002 for equality: make the canonical functional index match through bare-form predicates without the operator-class detour. Block ORE is the standard range encoding emitted by the crypto layer for numeric / timestamp / integer columns, so this matches the default configuration most callers run. The narrower contract is also what lets `ORDER BY col LIMIT n` get a fast Top-N sort: with the inlined operator, each comparison reduces to an inlined ORE-term comparison rather than a full plpgsql `compare()` dispatch.

**Action required.**

- **Columns configured with `ore` (default for numerics / timestamps / integers)** — no change. They carry `ob`, so the natural form is now both index-friendly and fast.
- **Columns configured with `ore_cllw_u64_8` / `ore_cllw_var_8` (CLLW ORE), or OPE-only (`opf` / `opv`)** — bare-form range queries on these columns will now raise. Two paths forward:
  - **Preferred:** migrate the column configuration to `ore` (Block ORE) so the natural form works everywhere.
  - **If you must keep the existing encoding:** rewrite range queries to the matching extractor form. For CLLW u64: `WHERE eql_v2.ore_cllw_u64_8(col) < eql_v2.ore_cllw_u64_8($1::jsonb)`. For CLLW var: substitute `ore_cllw_var_8`. For OPE: substitute the corresponding `ope_cllw_*` extractor. These extractor-form predicates engage their own functional indexes on the same expression.
- **Selector-extracted comparisons** (`WHERE e->'selector' < $1`) — same rule applies recursively. If the extracted sub-payload doesn't carry `ob`, rewrite to the extractor matching the sub-payload's term type.
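To find out which columns are affected before range queries start raising, one option is to count rows whose payload has no `ob` key. Because the `ob` extractor raises rather than returning NULL, this sketch inspects the payload directly. It assumes `eql_v2_encrypted` is a composite type with a single `jsonb` field named `data` and that the Block ORE term sits at the root key `ob`; both are assumptions to verify against your installed version.

```sql
-- Assumption: eql_v2_encrypted is a composite type with a jsonb field "data".
-- Counts non-NULL encrypted values with no root-level ob term.
SELECT count(*) AS rows_missing_ob
FROM events
WHERE encrypted_date IS NOT NULL
  AND NOT ((encrypted_date).data ? 'ob');
```

A non-zero count indicates the column falls under the second bullet above and needs either a configuration migration or extractor-form queries.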

**Notes for ORDER BY.** Bare-form `ORDER BY col` still needs a Sort node — the natural-form sort key doesn't syntactically match a functional ORE index expression — but the residual Sort step is now fast because each comparison uses the inlined ORE-term path. To skip the Sort node entirely, write the ORDER BY in extractor form: `ORDER BY eql_v2.ore_block_u64_8_256(col)`. That sort key matches the functional index and lets PostgreSQL stream rows out of the index in order.

**Wider coverage is deferred.** A CASE-style operator body that re-introduces CLLW / OPE support under the same inlined operators is being considered for a future release. Until then, those term types route through the extractor form.

## Verification checklist

- [ ] **`EXPLAIN ANALYZE` on representative queries** — equality, `LIKE`, jsonb path. Each plan should contain `Index Scan using <your_index>` rather than `Seq Scan`. Functional indexes engage automatically post-2.3 via the inlined operators.