Empty-string ("") handling for ordered encrypted text is undefined / inconsistent between eql_v2 and eql_v3 #262

## Summary

Encrypting the **empty string `""`** as ordered encrypted text produces an **empty ORE term (`ob: []`)** and an empty bloom filter (`bf: []`). The comparison behaviour of an empty ORE term is **undefined / inconsistent across EQL versions**. This needs a deliberate decision: is `""` a supported plaintext for *ordered* encrypted text, or is it out of scope?

Surfaced while adding `eql_v3.text` (#260), where `""` was used as the SQLx matrix "zero" pivot and broke ordering, aggregates, and comparison counts.

## Findings

**EQL v2 (`main`) — no coverage, but defensive handling exists**
- v2 has **zero test coverage of `""`** for encrypted text. The ORE-text fixture (`tests/sqlx/migrations/006_install_ore_text_data.sql`) is 100 real words; the smallest is `'aardvark'`. No fixture ever has an empty `ob`.
- v2 *does* defensively handle empty term arrays: `eql_v2.compare_ore_block_u64_8_256_terms` documents "empty arrays sort before non-empty arrays" and returns `-1` for empty-vs-non-empty. **This path is never exercised by any test.**
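
To make the documented v2 rule concrete, here is a minimal Rust model of a term comparator with an empty-array guard. It is a sketch, not the PL/pgSQL in `eql_v2.compare_ore_block_u64_8_256_terms`: the function name `compare_terms` is hypothetical, plain byte comparison stands in for the real ORE block comparison, and treating empty-vs-empty as equal is an assumption the issue does not state.

```rust
use std::cmp::Ordering;

/// Compare two term lists and return -1, 0, or 1, like a SQL comparator.
/// Each term is modelled as raw bytes; real `ob` entries are ciphertext blocks.
fn compare_terms(a: &[Vec<u8>], b: &[Vec<u8>]) -> i32 {
    match (a.is_empty(), b.is_empty()) {
        // Assumption: two empty term lists compare equal.
        (true, true) => 0,
        // Documented v2 rule: empty arrays sort before non-empty arrays.
        (true, false) => -1,
        // Mirror image of the rule above.
        (false, true) => 1,
        // Stand-in for the real ORE comparison: lexicographic byte order.
        (false, false) => match a.cmp(b) {
            Ordering::Less => -1,
            Ordering::Equal => 0,
            Ordering::Greater => 1,
        },
    }
}
```

Under this model, a `max` aggregate built on the comparator can never return the empty term while any non-empty term is present, which is the property currently untested in v2.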

**EQL v3 (`eql_v3.text`, #260) — diverges from v2**
- The v3 SEM ORE fork does **not** reproduce v2's empty-array guard. With `""` in the fixtures, empty `ob` orders as the **maximum**, not the minimum:
  - `eql_v3.max(eql_v3.text_ord)` returns the `""` payload instead of the real max (`"zzzz"`).
  - `payload::eql_v3.text_ord > ''` returns **0 rows** (expected: all non-empty values).
  - `'zzzz' > payload` counts are off by one (the `""` row is silently dropped).
  - `count_distinct` over `ord_term` hits `function … returned NULL` on the empty term.

So v2 documents "empty sorts first" but never tests it, while v3 effectively sorts the empty term last in `max` and the comparison counts, and errors in `count_distinct`. Neither behaviour is validated end-to-end.

## Decision needed

1. **Is `""` (and other degenerate/too-short-to-tokenize plaintext) a supported value for *ordered* encrypted text?**
   - **If yes:** the v3 SEM ORE comparison must define and implement empty-term ordering (mirror v2's "empty sorts first"), with explicit fixtures/tests covering it across `_eq` / `_ord` / `_ord_ore` and `min`/`max`. The match (`bf: []`) empty-set semantics should also be pinned (everything contains the empty filter; the empty filter contains nothing).
   - **If no:** document the constraint (minimum/at-least-one-ngram plaintext), and decide where it's enforced (proxy / client / EQL).

2. **Reconcile the v2↔v3 ORE empty-array divergence** regardless of (1), so the two schemas don't disagree on a payload either might receive.
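
The empty-set semantics suggested for the match term in (1) can be modelled as a subset check. This is a hedged sketch: representing a bloom filter as a set of `u16` bit positions, and the names `Bloom` and `bloom_contains`, are assumptions made for illustration rather than the EQL representation.

```rust
use std::collections::BTreeSet;

/// Model of a match term: the set of bit positions set in the filter.
type Bloom = BTreeSet<u16>;

/// `haystack` contains `needle` when every bit of `needle` is set in `haystack`.
fn bloom_contains(haystack: &Bloom, needle: &Bloom) -> bool {
    needle.is_subset(haystack)
}
```

With plain subset semantics, every filter contains the empty filter and the empty filter contains no non-empty filter. It does, however, contain itself, so the empty-vs-empty case is one more edge that the fixtures would need to pin explicitly.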

## Immediate mitigation (in #260)

PR #260 will **drop `""` from the `eql_v3.text` fixtures** and use real non-empty values, mirroring v2's "real word" convention with a short real token as the smallest value. It will also replace the matrix's `Default::default()` zero-pivot with an overridable `ScalarType::zero_pivot()` so that `text` supplies a real mid value. That unblocks the PR; **this issue tracks the underlying behavioural decision and the v2/v3 divergence**, which outlive #260.
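
One possible shape for the overridable pivot is a trait method with a default body. Only the names `ScalarType` and `zero_pivot` come from this issue; the trait bounds, the `Text` wrapper, and the pivot value `"mango"` are hypothetical and shown only to illustrate the idea.

```rust
/// Hypothetical shape of the matrix trait for scalar test types.
trait ScalarType: Default + PartialOrd {
    /// Pivot used by comparison-count tests.
    /// Falls back to `Default::default()` unless a type overrides it.
    fn zero_pivot() -> Self {
        Self::default()
    }
}

/// Numeric types keep the default: their zero is a meaningful pivot.
impl ScalarType for i64 {}

/// Stand-in for the plaintext side of `eql_v3.text`.
#[derive(Debug, Default, PartialEq, PartialOrd)]
struct Text(String);

impl ScalarType for Text {
    /// Override: avoid `""` and use a real, non-empty mid-range value.
    fn zero_pivot() -> Self {
        Text("mango".to_string()) // hypothetical fixture value
    }
}
```

The default method keeps existing scalar types unchanged, while `text` opts out of the empty-string pivot without special-casing it in the matrix itself.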

## References
- PR #260 (eql_v3.text encrypted-domain family)
- `src/ore_block_u64_8_256/functions.sql` — `eql_v2.compare_ore_block_u64_8_256_terms` (empty-array handling)
- `src/v3/sem/ore_block_u64_8_256/` — the v3 SEM fork
- `tests/sqlx/migrations/006_install_ore_text_data.sql` — v2 ORE-text fixtures (smallest = `aardvark`)
