feat(scalars): add eql_v3.text encrypted-domain family (eq / match / ord) #260
Open
tobyhede wants to merge 21 commits into
…ce fixtures + text discriminator)
…py->Clone, to_sql_literal &Self, hand-written ScalarType for String, register text)
…ventory discovery clean)
…t wired-kinds comments

Addresses CodeRabbit review on #260:
- Fixture::Zero now resolves to None for non-integer kinds, matching Min/Max and the function's documented contract (test-guaranteed never hit, but consistent).
- cast_for_kind/plaintext_sql_type_for_kind doc comments now list Text as wired.
…ven test, match coverage

- bloom_filter(jsonb): LANGUAGE sql (inlinable), drop the redundant RAISE. The match capability is tied to the text_match domain CHECK (which guarantees bf), so a missing key can only occur on raw jsonb; return NULL there, mirroring hmac_256. Add the pin_search_path inline-critical clause + splinter allowlist row so it stays unpinned/inlinable. has_bloom_filter unchanged (matches has_hmac_256). SEM test now asserts NULL instead of a raise.
- context.rs: table-driven operator-metadata test, so a new term's metadata is one table row, not another hand-rolled assertion block.
- eql-tests-macros: document the is_temporal -> is_hand_written rename (the predicate gates "diverges from the generated integer path", not "is a date").
- text_match: add a bare-operator (`col @> needle`) GIN index-engagement test; document the probabilistic / ngram-disjoint basis of the disjoint assertion.
- text_smoke: add empty-bloom set-semantics test (everything contains the empty filter; the empty filter contains nothing).
- rename string_to_plaintext_is_utf8 -> string_to_plaintext_is_text.
…x text "" pivot (#262)

The scalar matrix's third pivot was hardwired to `Default::default()`: `0` for int, the epoch for date, but `""` for text, which encrypts to an empty ORE term and broke ordering/aggregates/counts (83 CI failures on #260).

Introduce the real taxonomy as traits:
- `ScalarType` (base): identity, fixtures, literal rendering.
- `OrderedScalar: ScalarType`: min_pivot/max_pivot + an overridable interior mid_pivot (default Self::default()). int/date inherit (0/epoch); text overrides to a real median ("frank"), never the degenerate "".
- `SignedScalar: OrderedScalar`: origin() (numeric zero / sign boundary). int and date only; text is NOT SignedScalar (lexicographic order has no origin).

The pivot SWEEP stays uniform (min/mid/max) across every ordered type, so the single canonical matrix snapshot is preserved, with only a `_pivot_zero_` -> `_pivot_mid_` rename. The signed-only sign-boundary test (asserts ORE ordering is monotonic across the origin) is generic over `SignedScalar` and lives outside the `scalars::` namespace (like text_match), so a `text` instantiation is a compile error and it never enters the inventory snapshot: no per-capability snapshot, inventory, or macro branching.

Also: drop "" from TEXT_FIXTURES (text has no numeric origin, see #262); the proc macro emits OrderedScalar+SignedScalar for the generated integer impls; harden text_match::match_uses_functional_index to force enable_seqscan=off.

Verified: 640 text/int4/date/signed/text_match tests pass (prior 83 "" failures gone); matrix inventory matches the regenerated snapshot (5 types, no signed leak); codegen:parity unchanged. Empty-string behavioural decision tracked in #262.
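For readers skimming the taxonomy above, a minimal Rust sketch of how the three traits could layer. This is an illustration under assumptions, not the code in this PR: only the trait names, `min_pivot`/`max_pivot`/`mid_pivot`, `origin()`, `to_sql_literal(&Self)` and the `"frank"` median come from the commit message. `sql_type`, `pivots`, `crosses_origin` and the text min/max values are hypothetical.

```rust
/// Base trait: identity and literal rendering (fixtures omitted for brevity).
trait ScalarType: Clone + Default {
    /// Hypothetical helper: the plaintext SQL type name.
    fn sql_type() -> &'static str;
    /// Takes `&Self` so non-`Copy` plaintexts such as `String` work.
    fn to_sql_literal(value: &Self) -> String;
}

/// Ordered kinds expose a min/mid/max pivot sweep.
trait OrderedScalar: ScalarType + PartialOrd {
    fn min_pivot() -> Self;
    fn max_pivot() -> Self;
    /// Interior pivot; defaults to `Default::default()` and is overridable.
    fn mid_pivot() -> Self {
        Self::default()
    }
}

/// Only kinds with a numeric zero / sign boundary implement this.
trait SignedScalar: OrderedScalar {
    fn origin() -> Self;
}

impl ScalarType for i32 {
    fn sql_type() -> &'static str { "int4" }
    fn to_sql_literal(value: &Self) -> String { value.to_string() }
}
impl OrderedScalar for i32 {
    fn min_pivot() -> Self { i32::MIN }
    fn max_pivot() -> Self { i32::MAX }
    // mid_pivot is inherited: Default::default() == 0.
}
impl SignedScalar for i32 {
    fn origin() -> Self { 0 }
}

impl ScalarType for String {
    fn sql_type() -> &'static str { "text" }
    fn to_sql_literal(value: &Self) -> String {
        // Quote and double any embedded single quotes.
        format!("'{}'", value.replace('\'', "''"))
    }
}
impl OrderedScalar for String {
    // Hypothetical endpoints: text is unbounded, so these are just fixtures.
    fn min_pivot() -> Self { "alice".to_string() }
    fn max_pivot() -> Self { "zoe".to_string() }
    // Override the default so the pivot is never the degenerate "".
    fn mid_pivot() -> Self { "frank".to_string() }
}
// Deliberately no `impl SignedScalar for String`.

/// The uniform sweep works for every ordered kind.
fn pivots<T: OrderedScalar>() -> [T; 3] {
    [T::min_pivot(), T::mid_pivot(), T::max_pivot()]
}

/// Signed-only check: does the range straddle the origin?
fn crosses_origin<T: SignedScalar>(lo: &T, hi: &T) -> bool {
    let origin = T::origin();
    *lo < origin && origin < *hi
}
```

With this shape, `pivots::<String>()` compiles and yields three strictly increasing values, while `crosses_origin::<String>(..)` is rejected at compile time because `String` does not implement `SignedScalar`.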
…(STRICT)

The extractor is declared STRICT, so PostgreSQL returns NULL for a NULL argument without entering the body; the explicit guard was dead code. Matches the inlined bloom_filter extractor; behaviour unchanged (verified: ore_block(NULL::jsonb) IS NULL, missing-ob still RAISEs).
Adds the `eql_v3.text` scalar encrypted-domain family at parity with EQL v2 encrypted text: equality (HMAC), match (a new self-contained `eql_v3.bloom_filter` SEM index term), and ORE ordering. First scalar to add a new index `Term` (`Bloom`) and the first non-integer, unbounded ordered kind.

Supported text domains

| Domain | Operators | Term key |
|---|---|---|
| `eql_v3.text` | | |
| `eql_v3.text_eq` | `=` `<>` | `hm` |
| `eql_v3.text_match` | `@>` `<@` | `bf` |
| `eql_v3.text_ord` | `=` `<>` `<` `<=` `>` `>=`, `min`/`max` | `ob` |
| `eql_v3.text_ord_ore` | `text_ord` | `ob` |

A real encrypted text payload carries `hm` + `bf` + `ob`; callers cast per predicate. Match is bloom-filter containment on `text_match` (deliberately not SQL `LIKE`) and never backs equality (that always routes through `Hm`).

Notes

- `@>`/`<@` flip from blocker → inlinable wrappers only on `Bloom` domains, so the int4 golden is byte-identical (`codegen:parity` green).
- `eql_v3.bloom_filter` SEM type is self-contained (no `eql_v2` dependency; `test:self_contained_v3` green).
- Non-`Copy` (`String`) plaintext (`Copy` → `Clone`, `to_sql_literal(&Self)`); `[text]` harness marker mirrors `[temporal]`.

Stacked on `eql_v3`.
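To make "match is bloom-filter containment" concrete, here is a rough Rust sketch of the set semantics only. It is not the `eql_v3.bloom_filter` implementation: the real term is produced on encrypted data, whereas this uses an unkeyed standard-library hasher on plaintext. The trigram size, the 256-bit width, the three hash seeds and the function names are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical parameters; the PR does not state the real ones.
const WORDS: usize = 4; // 4 * 64 = 256 bits
const BITS: u64 = (WORDS as u64) * 64;
const HASHES: u64 = 3;

/// Split lowercased text into overlapping 3-character grams.
fn trigrams(s: &str) -> Vec<String> {
    let chars: Vec<char> = s.to_lowercase().chars().collect();
    chars.windows(3).map(|w| w.iter().collect()).collect()
}

/// Build a filter by setting HASHES bits per trigram.
/// Inputs shorter than three characters produce the empty filter.
fn bloom(s: &str) -> [u64; WORDS] {
    let mut bits = [0u64; WORDS];
    for gram in trigrams(s) {
        for seed in 0..HASHES {
            let mut hasher = DefaultHasher::new();
            seed.hash(&mut hasher);
            gram.hash(&mut hasher);
            let bit = hasher.finish() % BITS;
            bits[(bit / 64) as usize] |= 1u64 << (bit % 64);
        }
    }
    bits
}

/// Analogue of `haystack @> needle`: every bit set in `needle`
/// is also set in `haystack`.
fn contains(haystack: &[u64; WORDS], needle: &[u64; WORDS]) -> bool {
    haystack.iter().zip(needle.iter()).all(|(h, n)| (h & n) == *n)
}
```

Containment has no false negatives: if every trigram of the needle occurs in the haystack, `contains` returns true. It can return true for unrelated strings when bits collide, which is why a disjoint assertion can only hold probabilistically and why match never backs equality. The empty filter also behaves as in the text_smoke test: every filter contains it, and it contains no non-empty filter.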