1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -64,6 +64,7 @@ Targeting `2.3.0` as a breaking release. Customers re-encrypt their data as part

 ### Fixed
 
+- **`ANALYZE` (and autovacuum) no longer raises on equality-only `eql_v2_encrypted` columns; `GROUP BY` / `DISTINCT` on a bare encrypted column now deduplicate correctly.** `eql_v2_encrypted` is a composite type, and `eql_v2.encrypted_operator_class` is its `DEFAULT` btree opclass — so PostgreSQL invokes the opclass's FUNCTION 1 comparator to gather column statistics during `ANALYZE`, and may use it for sort-based `GROUP BY` / `DISTINCT`. FUNCTION 1 was `eql_v2.compare`, which #211 made strict (it raises without a Block-ORE `ob` term — see U-005): `ANALYZE` raised `feature_not_supported` on every `hm`-only column, and since autovacuum runs `ANALYZE` routinely this surfaced constantly. The btree opclass now has its own FUNCTION 1 — `eql_v2.encrypted_btree_compare`, a total, non-raising 3-way comparator (Block ORE when present, else a total order on the `hm` term, else a deterministic payload tie-break). `eql_v2.compare` stays strict for the `<` / `>` range-operator path. Without a non-raising opclass comparator the planner fell through to PostgreSQL's built-in record comparison, which compares raw ciphertext — so two encryptions of the same plaintext (same `hm`, different `c`) failed to group; the dedicated comparator fixes that too. ([#227](https://github.com/cipherstash/encrypt-query-language/pull/227))
 - **Range operators on `eql_v2_encrypted` now declare the correct planner selectivity functions.** `<=`, `>`, and `>=` (all three type overloads each) previously declared `RESTRICT = scalarltsel, JOIN = scalarltjoinsel` — the "less-than" estimators — which fed the planner inaccurate row-count estimates for the affected predicates. The inner `eql_v2.ore_block_u64_8_256` `>=` operator had a related miss (`scalarlesel` where `scalargesel` belongs). Now `<=` uses `scalarlesel`, `>` uses `scalargtsel`, and `>=` uses `scalargesel` (matching `*joinsel` variants for the JOIN selector). No query result changes — only plan choice for range queries against Block ORE columns, which becomes load-bearing now that bare-form range predicates structurally match a functional ORE index ([#211](https://github.com/cipherstash/encrypt-query-language/pull/211)). ([#216](https://github.com/cipherstash/encrypt-query-language/issues/216))
 
 ### Upgrade notes
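Note on the selectivity entry in the CHANGELOG hunk above: the operator-to-estimator pairing it describes can be tabulated in a few lines. This is only an illustrative sketch. `selectivity_for` is a hypothetical helper, not part of EQL; the estimator names are PostgreSQL's built-in `scalar*sel` / `scalar*joinsel` functions, and the `<` row is inferred from the entry's remark that the old declarations were the "less-than" estimators.

```rust
/// Hypothetical lookup: the (RESTRICT, JOIN) selectivity estimators that
/// the CHANGELOG entry says each range operator should declare.
fn selectivity_for(op: &str) -> Option<(&'static str, &'static str)> {
    match op {
        // Assumption: `<` already used the less-than estimators.
        "<" => Some(("scalarltsel", "scalarltjoinsel")),
        "<=" => Some(("scalarlesel", "scalarlejoinsel")),
        ">" => Some(("scalargtsel", "scalargtjoinsel")),
        ">=" => Some(("scalargesel", "scalargejoinsel")),
        // Not a range operator covered by this fix.
        _ => None,
    }
}

fn main() {
    for op in ["<", "<=", ">", ">="] {
        let (restrict, join) = selectivity_for(op).expect("known range operator");
        println!("{:<2}  RESTRICT = {}, JOIN = {}", op, restrict, join);
    }
}
```

The pre-fix state corresponds to every arm returning the `<` pair, which is why only row-count estimates (and hence plan choice) were affected, not query results.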
129 changes: 91 additions & 38 deletions src/operators/operator_class.sql
@@ -2,27 +2,106 @@
 -- REQUIRE: src/encrypted/types.sql
 -- REQUIRE: src/encrypted/functions.sql
 -- REQUIRE: src/encrypted/compare.sql
+-- REQUIRE: src/hmac_256/functions.sql
+-- REQUIRE: src/ore_block_u64_8_256/functions.sql
+-- REQUIRE: src/ore_block_u64_8_256/compare.sql
 -- REQUIRE: src/operators/<.sql
 -- REQUIRE: src/operators/<=.sql
 -- REQUIRE: src/operators/=.sql
 -- REQUIRE: src/operators/>=.sql
 -- REQUIRE: src/operators/>.sql
 
---! @brief PostgreSQL operator class definitions for encrypted value indexing
 --! @file src/operators/operator_class.sql
+--! @brief Btree operator class for the `eql_v2_encrypted` composite type
 --!
---! Defines the operator family and operator class required for btree indexing
---! of encrypted values. This enables PostgreSQL to use encrypted columns in:
---! - CREATE INDEX statements
---! - ORDER BY clauses
---! - Range queries
---! - Primary key constraints
+--! `eql_v2_encrypted` is a composite type. PostgreSQL gives every composite
+--! type an implicit row-wise btree comparison (`record_ops`) — but that
+--! compares the raw ciphertext byte-for-byte, so two encryptions of the same
+--! plaintext (same `hm`, different `c`) would sort and group as *distinct*.
+--! `eql_v2.encrypted_operator_class` is registered `DEFAULT ... USING btree`
+--! specifically to override `record_ops` with a comparison that is correct
+--! for encrypted data: `GROUP BY`, `DISTINCT`, `ORDER BY`, sort-merge joins
+--! and `ANALYZE` on a bare `eql_v2_encrypted` column all route through
+--! FUNCTION 1 below.
 --!
---! The operator class maps the five comparison operators (<, <=, =, >=, >)
---! to the eql_v2.compare() support function for btree index operations.
+--! @note FUNCTION 1 is `eql_v2.encrypted_btree_compare`, NOT the strict
+--! `eql_v2.compare`. A btree support function must be total and must
+--! never raise — `ANALYZE` calls it to build column statistics on
+--! every encrypted column. `eql_v2.compare` is deliberately strict
+--! (it raises without a Block-ORE `ob` term — see U-005); it backs
+--! the `<` / `>` range operators, not this opclass.
 --!
---! @note This is the default operator class for eql_v2_encrypted type
+--! @note Functional indexes are the canonical recipe for *building* indexes
+--! on encrypted columns (see U-001 and docs/reference/database-indexes.md).
+--! This opclass exists to keep the composite type's built-in
+--! comparison correct — not as an index-building recommendation.
 --!
+--! @see eql_v2.encrypted_hash_operator_class (hash — GROUP BY / hash joins)
 --! @see eql_v2.compare
 --! @see PostgreSQL documentation on operator classes
 
+--------------------
+
+--! @brief Total, non-raising btree comparator for `eql_v2_encrypted`
+--!
+--! Three-way comparison (`-1` / `0` / `1`) used as FUNCTION 1 of
+--! `eql_v2.encrypted_operator_class`. Unlike `eql_v2.compare`, it never
+--! raises: a btree support function is invoked by `ANALYZE`, sort, and
+--! `GROUP BY` on every value, so raising is not an option.
+--!
+--! Comparison priority:
+--! 1. Both operands carry `ob` (Block ORE) — order-preserving comparison
+--! via `eql_v2.compare_ore_block_u64_8_256`.
+--! 2. Both operands carry `hm` (HMAC-256) — a total order on the hmac
+--! bytes. Not order-preserving on plaintext (hmac is not), but
+--! deterministic, total, and `= 0` exactly when the hmac terms match
+--! — consistent with the `=` operator, so `GROUP BY` / `DISTINCT`
+--! deduplicate correctly.
+--! 3. Otherwise — a deterministic order on the raw payload. Reached only
+--! for term-less / mixed payloads; present so the function stays total.
+--!
+--! @param a eql_v2_encrypted First value
+--! @param b eql_v2_encrypted Second value
+--! @return integer -1, 0, or 1
+--!
+--! @internal
+--! @see eql_v2.encrypted_operator_class
+--! @see eql_v2.compare
+CREATE FUNCTION eql_v2.encrypted_btree_compare(a eql_v2_encrypted, b eql_v2_encrypted)
+  RETURNS integer
+  IMMUTABLE STRICT PARALLEL SAFE
+  SET search_path = pg_catalog, extensions, public
Review thread on the `SET search_path` line:

Contributor: @coderdan I assume the `SET search_path = ...` is OK here.

Contributor, Author: Actually, I'm not 100% sure. Let me double check.

Contributor, Author: Oh, yes, it's fine. This function is PL/pgSQL, so it's not inlineable anyway!

+AS $$
+DECLARE
+  hm_a text;
+  hm_b text;
+BEGIN
+  -- Block ORE on both sides: order-preserving comparison.
+  IF eql_v2.has_ore_block_u64_8_256(a) AND eql_v2.has_ore_block_u64_8_256(b) THEN
+    RETURN eql_v2.compare_ore_block_u64_8_256(a, b);
+  END IF;
+
+  -- HMAC on both sides: total order on the hmac bytes. `= 0` iff the hmac
+  -- terms match, consistent with the `=` operator and the hash opclass.
+  hm_a := eql_v2.hmac_256(a)::text;
+  hm_b := eql_v2.hmac_256(b)::text;
+  IF hm_a IS NOT NULL AND hm_b IS NOT NULL THEN
+    RETURN CASE
+      WHEN hm_a < hm_b THEN -1
+      WHEN hm_a > hm_b THEN 1
+      ELSE 0
+    END;
+  END IF;
+
+  -- Fallback for term-less / mixed payloads: a deterministic, non-raising
+  -- total order on the raw payload. Not a normal column shape; this
+  -- branch only keeps the btree FUNCTION 1 contract (total, never raises).
+  RETURN CASE
+    WHEN (a).data::text < (b).data::text THEN -1
+    WHEN (a).data::text > (b).data::text THEN 1
+    ELSE 0
+  END;
+END;
+$$ LANGUAGE plpgsql;
+
+--------------------

@@ -34,30 +113,4 @@ CREATE OPERATOR CLASS eql_v2.encrypted_operator_class DEFAULT FOR TYPE eql_v2_en
   OPERATOR 3 =,
   OPERATOR 4 >=,
   OPERATOR 5 >,
-  FUNCTION 1 eql_v2.compare(a eql_v2_encrypted, b eql_v2_encrypted);
-
-
---------------------
-
--- CREATE OPERATOR FAMILY eql_v2.encrypted_operator_ordered USING btree;
-
--- CREATE OPERATOR CLASS eql_v2.encrypted_operator_ordered FOR TYPE eql_v2_encrypted USING btree FAMILY eql_v2.encrypted_operator_ordered AS
--- OPERATOR 1 <,
--- OPERATOR 2 <=,
--- OPERATOR 3 =,
--- OPERATOR 4 >=,
--- OPERATOR 5 >,
--- FUNCTION 1 eql_v2.compare_ore_block_u64_8_256(a eql_v2_encrypted, b eql_v2_encrypted);
-
---------------------
-
--- CREATE OPERATOR FAMILY eql_v2.encrypted_hmac_256_operator USING btree;
-
--- CREATE OPERATOR CLASS eql_v2.encrypted_hmac_256_operator FOR TYPE eql_v2_encrypted USING btree FAMILY eql_v2.encrypted_hmac_256_operator AS
--- OPERATOR 1 <,
--- OPERATOR 2 <=,
--- OPERATOR 3 =,
--- OPERATOR 4 >=,
--- OPERATOR 5 >,
--- FUNCTION 1 eql_v2.compare_hmac(a eql_v2_encrypted, b eql_v2_encrypted);
-
+  FUNCTION 1 eql_v2.encrypted_btree_compare(a eql_v2_encrypted, b eql_v2_encrypted);
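
Note on the comparator added in `operator_class.sql`: its three-step priority can be modeled outside the database. The following is a minimal sketch under stated assumptions, not the EQL implementation. The `Encrypted` struct and `btree_compare` are hypothetical; a plain `u64` stands in for the Block ORE `ob` term (real ORE terms are ciphertexts compared by `eql_v2.compare_ore_block_u64_8_256`), and Rust's bytewise string ordering stands in for the SQL `text` comparison, whose actual order depends on collation.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for an `eql_v2_encrypted` payload.
#[derive(Debug)]
struct Encrypted {
    /// Models the Block ORE `ob` term (assumption: a u64 with plaintext order).
    ob: Option<u64>,
    /// Models the `hm` term as text.
    hm: Option<String>,
    /// Models the raw payload rendered as text.
    data: String,
}

/// Map an `Ordering` to the -1 / 0 / 1 convention of a btree FUNCTION 1.
fn to_int(o: Ordering) -> i32 {
    match o {
        Ordering::Less => -1,
        Ordering::Equal => 0,
        Ordering::Greater => 1,
    }
}

/// Total, non-panicking three-way comparison following the documented priority.
fn btree_compare(a: &Encrypted, b: &Encrypted) -> i32 {
    // 1. Both sides carry `ob`: order-preserving comparison.
    if let (Some(x), Some(y)) = (a.ob, b.ob) {
        return to_int(x.cmp(&y));
    }
    // 2. Both sides carry `hm`: 0 exactly when the hm terms match.
    if let (Some(x), Some(y)) = (&a.hm, &b.hm) {
        return to_int(x.cmp(y));
    }
    // 3. Fallback: deterministic order on the raw payload.
    to_int(a.data.cmp(&b.data))
}

fn main() {
    // Two encryptions of the same plaintext: same hm, different ciphertext.
    let a = Encrypted { ob: None, hm: Some("aa".into()), data: "c1".into() };
    let b = Encrypted { ob: None, hm: Some("aa".into()), data: "c2".into() };
    println!("hm-only, same hm: {}", btree_compare(&a, &b));
}
```

The point of the second branch is that values with matching `hm` terms compare as equal even though their ciphertexts differ, which is what lets sort-based `GROUP BY` / `DISTINCT` collapse them.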
29 changes: 27 additions & 2 deletions tests/sqlx/tests/operator_class_tests.rs
@@ -96,9 +96,10 @@ async fn index_usage_with_explain_analyze(pool: PgPool) -> Result<()> {
 }
 
 #[sqlx::test]
-#[ignore = "Strict eql_v2.compare contract: raises on missing ORE term. This test builds a btree using eql_v2.encrypted_operator_class over hm-only payloads; the opclass calls compare() per row, which now raises. Equality on hm-only columns should use the inlined `=` operator (post-#193), not opclass-driven btree. Re-enable once the test is rewritten to use ORE-bearing payloads (or to assert raise-on-build-with-hm-only)."]
 async fn index_behavior_with_different_data_types(pool: PgPool) -> Result<()> {
-    // Test: Index behavior with various encrypted data types (37 assertions)
+    // Test: Index behavior with various encrypted data types. The opclass
+    // btree FUNCTION 1 is eql_v2.encrypted_btree_compare (total, non-raising),
+    // so building the index and ANALYZE over hm-only payloads both succeed.
 
     create_table_with_encrypted(&pool).await?;

@@ -221,3 +222,27 @@ async fn index_behavior_with_different_data_types(pool: PgPool) -> Result<()> {

     Ok(())
 }
+
+#[sqlx::test(fixtures(path = "../fixtures", scripts("encrypted_json")))]
+async fn analyze_on_hmac_only_column_does_not_raise(pool: PgPool) -> Result<()> {
+    // Regression: eql_v2_encrypted has a DEFAULT btree operator class, so
+    // ANALYZE invokes its FUNCTION 1 comparator to gather column statistics.
+    // That comparator must never raise: ANALYZE (autovacuum included) runs
+    // on every encrypted column. It previously pointed at the strict
+    // eql_v2.compare, which raises without a Block-ORE `ob` term, so ANALYZE
+    // failed on every equality-only (`hm`-only) encrypted column. FUNCTION 1
+    // is now eql_v2.encrypted_btree_compare (total, non-raising).
+
+    create_table_with_encrypted(&pool).await?;
+
+    for _ in 0..5 {
+        sqlx::query("INSERT INTO encrypted(e) VALUES (create_encrypted_json(1, 'hm'))")
+            .execute(&pool)
+            .await?;
+    }
+
+    // Must complete without raising.
+    sqlx::query("ANALYZE encrypted").execute(&pool).await?;
+
+    Ok(())
+}
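
Note on coverage: the regression test above asserts only that `ANALYZE` completes. The deduplication half of the fix (sort-based `GROUP BY` / `DISTINCT` collapsing rows with the same `hm`) can be illustrated with a small database-free model. This is a hedged sketch, not a replacement for a SQL-level test: `Row`, `count_groups`, `by_hm`, and `by_raw` are hypothetical names, and string tuples stand in for encrypted payloads.

```rust
use std::cmp::Ordering;

/// Hypothetical row: (hm term, ciphertext `c`).
type Row = (&'static str, &'static str);

/// Sort-based grouping: sort with `cmp`, then count runs of rows
/// that compare as equal under the same comparator.
fn count_groups(rows: &[Row], cmp: fn(&Row, &Row) -> Ordering) -> usize {
    let mut sorted = rows.to_vec();
    sorted.sort_by(cmp);
    let mut groups = 0;
    let mut prev: Option<&Row> = None;
    for row in &sorted {
        let same_group = prev.map_or(false, |p| cmp(p, row) == Ordering::Equal);
        if !same_group {
            groups += 1;
        }
        prev = Some(row);
    }
    groups
}

/// Models the dedicated comparator's hm branch: equal when hm terms match.
fn by_hm(a: &Row, b: &Row) -> Ordering {
    a.0.cmp(b.0)
}

/// Models record-wise comparison: the ciphertext participates too.
fn by_raw(a: &Row, b: &Row) -> Ordering {
    a.cmp(b)
}

fn main() {
    // Two encryptions of one plaintext (same hm, different c) plus another value.
    let rows: [Row; 3] = [("h1", "c-a"), ("h2", "c-c"), ("h1", "c-b")];
    println!("groups by hm:  {}", count_groups(&rows, by_hm));
    println!("groups by raw: {}", count_groups(&rows, by_raw));
}
```

Under the `hm`-based comparator the two encryptions of the same plaintext fall into one group; under the record-wise stand-in they remain separate, which mirrors the failure described in the CHANGELOG entry.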