Refactor code generator into Rust#251
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/test-eql.yml (1)
6-20:⚠️ Potential issue | 🟠 Major | ⚡ Quick winInclude
crates/**intest-eql.ymltriggerpathsso Rust generator-only PRs run the parity gate.
.github/workflows/test-eql.ymlis the only workflow containingmise run codegen:parity, and repo-wide workflow searches show nocratespath filters—so changes undercrates/eql-codegenwon’t trigger this workflow.Proposed change
paths: - ".github/workflows/test-eql.yml" - "src/**/*.sql" - "sql/**/*.sql" - "tests/**/*" - "tasks/**/*" + - "crates/**"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/test-eql.yml around lines 6 - 20, The workflow test-eql.yml doesn't include the crates path so changes in crates/eql-codegen won't trigger the parity gate; update the paths arrays in the test-eql.yml manifest (both the root push paths and the pull_request.paths block) to include "crates/**" so that files under the crates directory (e.g., crates/eql-codegen) will cause mise run codegen:parity to run.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/eql-codegen/tests/parity.rs`:
- Around line 100-134: The test rust_and_python_generators_are_byte_identical
currently performs no assertion and can mutate the repo by invoking the Python
generator; change it so it doesn't run by default and can't dirty the working
tree: add the #[ignore] attribute to the test fn
rust_and_python_generators_are_byte_identical (or remove the test entirely) so
cargo test won't run it by default, or alternatively make it a real test by
removing the early break/let _ and implementing a non-mutating assertion (e.g.,
run Python against a tempdir and assert parity) — reference the test function
name rust_and_python_generators_are_byte_identical and the Python invocation
block that shells out to "python3" for locating the code to change.
---
Outside diff comments:
In @.github/workflows/test-eql.yml:
- Around line 6-20: The workflow test-eql.yml doesn't include the crates path so
changes in crates/eql-codegen won't trigger the parity gate; update the paths
arrays in the test-eql.yml manifest (both the root push paths and the
pull_request.paths block) to include "crates/**" so that files under the crates
directory (e.g., crates/eql-codegen) will cause mise run codegen:parity to run.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dd504cd2-5ed5-42fd-890c-41908a5d8cfa
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
.github/workflows/test-eql.ymlcrates/eql-codegen/Cargo.tomlcrates/eql-codegen/src/consts.rscrates/eql-codegen/src/generate.rscrates/eql-codegen/src/lib.rscrates/eql-codegen/src/main.rscrates/eql-codegen/src/operator_surface.rscrates/eql-codegen/src/templates.rscrates/eql-codegen/src/writer.rscrates/eql-codegen/tests/parity.rsmise.tomltasks/codegen-parity.sh
74c1824 to
77312fb
Compare
f482823 to
7ba43a6
Compare
Port the scalar encrypted-domain SQL generator from the Python oracle to the `eql-codegen` Rust crate, achieving generator parity and wiring it into the build/test/CI flow. Generator: - Scaffold the eql-codegen crate: AUTO-GENERATED headers, schema consts, sql_str, operator-surface table, backing_function, role/brief/shape helpers, and the fixture-values renderer. - Port the domain-block, extractor, wrapper, three blocker renderers (LANGUAGE plpgsql, never STRICT), operator, aggregate (ord-capability), and types-file renderers; relocate derivation helpers + logic tests into the context layer. - Render the int4 SQL surface (types/functions/operators/aggregates) from minijinja templates; add the minijinja+serde template environment. - Port the ownership-guarded writer (std-only) and the generate_all orchestrator + CLI (generate / list-types). Tests & CI: - int4 golden-master reference tests; values.rs byte-exact, integration golden line-normalized; coarsened footgun + escaping guards as whole-file/context scans. - Retire the Python-oracle parity gate; codegen:parity now gates on golden + values. Drop the deprecated Python test:codegen drift check. Cleanup: - Standardize the generated-SQL marker on `-- AUTOMATICALLY GENERATED FILE` (the project-wide marker docs:validate greps on); regenerate int2/int4 values.rs fixtures and update the consts.rs assertion and CLAUDE.md. - Address review findings (operator-check rename, placeholder_payload comment, exit-code u8 clamp).
77312fb to
2737d5f
Compare
…ates The Rust port (2737d5f) re-headed every committed/golden artifact to the `AUTOMATICALLY GENERATED FILE.` marker and updated consts.rs + CLAUDE.md, but left the Python generator on its original `AUTO-GENERATED — DO NOT EDIT.` marker. CI still runs the Python generator (codegen:domain:all, build), so its ownership check mismatched the Rust-marked files and refused to overwrite them as hand-written (OwnershipError), failing the "Encrypted-domain codegen" job. Align tasks/codegen/templates.py (and its marker tests) to the canonical single-line Rust marker — the one docs:validate greps on and consts.rs asserts. Also fix the "Rust workspace crates" job: cargo fmt the eql-codegen sources (never formatted) and resolve two latent clippy manual_contains lints in generate.rs that fmt's fail-fast had masked. Verified locally: codegen:domain:all (+ git diff), codegen:parity, and test:crates (fmt --check + clippy -D warnings + cargo test) all pass.
- Point generated values.rs header at eql-scalars::CATALOG, not deleted TOML - Cover int2 in scalar_family_inlinable_operators_are_clean lint test - codegen-parity.sh: catch untracked generated values.rs via git status - main.rs: make non-zero Ok exit-code arm an explicit FAILURE - Fix stale fixture.rs breadcrumb in cipherstash.rs to int4.rs - FIXTURE_SCHEMA.md: int4 fixture is now 17 rows incl signed extremes + zero
…alog Replace the generic `Kind` enum (Symmetric/Path/BlockerOnly) and the parallel `SYMMETRIC_OPERATORS`/`PATH_OPERATORS`/`BLOCKER_ONLY_OPERATORS` slices + per-shape helpers with a single `OPERATORS` catalog where each operator carries strongly-typed PostgreSQL signatures and pure-data planner metadata. - operator_surface.rs: add `TypeSlot`, `OperatorSignature`, `RenderedSignature`, and `OperatorMetadata`; each `Operator` now owns its `signatures` and `metadata`. Rename `backing` -> `function_name` and `backing_function` -> `operator_function_name`. Delete `Kind` and the three category slices. - generate.rs: `render_functions_file`/`render_operators_file` iterate the catalog x signatures and choose wrapper vs unsupported-operator entries per domain from `Term::operators_for_terms`. Remove the category loops and `*_shapes` helpers; preserve the `selector` param name for `->`/`->>` via a small `arg_b_name` helper. - context.rs: `operator_entry` takes `&Operator` and renders metadata only when the domain supports the operator; rename `FnEntry::Blocker` -> `Unsupported` and `blocker_entry` -> `unsupported_entry`. Generated SQL is byte-identical (codegen:parity passes); blocking is now a per-domain emission decision rather than a static operator category. Adds direct unit tests for metadata gating, the selector naming, the jsonpath/text[] slots, and a non-empty-signatures catalog invariant. Spec: docs/development/operator-surface-generation-spec.md
The committed `<T>_values.rs` files carry the Rust generator's header
("Generated from the `<T>` row in `eql-scalars::CATALOG`"), but the
Python generator still emitted the older "tasks/codegen/types/<T>.toml"
wording. CI's "regenerate fixture-value consts" step runs the Python
generator (`codegen:domain:all`) and diffs the result, so the header
mismatch failed the build.
Re-align `templates.py` to the Rust header (the recurrence of the drift
fixed in e186464) and update the now-stale assertion in
`test_templates.py`. Verified: `codegen:domain:all` leaves
tests/sqlx/src/fixtures clean and `codegen:parity` still passes.
daf1709 to
2c08b8e
Compare
Replace the if/elif/else body in functions.sql.j2 with a dynamic
{% include %} driven off each entry's serde "kind" tag
(Extractor/Wrapper/Unsupported -> functions/<kind>.sql.j2). The parent
template keeps the once-per-file header, -- REQUIRE: loop, and entry loop;
each per-kind body lives in its own partial under templates/functions/.
Dispatch stays in the template layer off the existing #[serde(tag="kind")]
value, so FnEntry, generate.rs, and the Rust render path are unchanged.
Generated SQL is line-normalized-identical (golden test + codegen:parity).
Remove VERSION_KEY and ENVELOPE_VERSION — dead leftovers from the Python port never wired into the Rust generator (the version-pin clause is a literal 'v'='2' in types.sql.j2). Fold the always-present CIPHERTEXT_KEY into ENVELOPE_KEYS as ["v","i","c"]; context::domain_block keeps the same v,i,c ordering ahead of term keys, so generated SQL is byte-identical. Demote all consts.rs items from pub to pub(crate). Nothing outside the crate imports eql_codegen::consts, so this is safe and lets the built-in dead_code lint see them — the existing `clippy -- -D warnings` CI step (mise run test:crates) now fails on unused internal items, no workflow change needed. Verified: clippy -D warnings clean, 63 lib + 3 parity tests pass, codegen:parity OK (generated SQL unchanged).
Replace the \n\-continuation format string in render_fixture_values_rs with a flush-left raw string literal — same emitted bytes, far more legible. Kept as format! (not a minijinja template) deliberately: <T>_values.rs is validated byte-exact, unlike the line-normalized SQL surface, and the type-aware literal rendering stays in Rust regardless. Verified byte-identical via codegen:parity (rust_generator_matches_ committed_values_rs); fmt/clippy -D warnings clean, all tests pass.
The `eql_v2_int4`/`eql_v2_int2` fixtures are plain jsonb-payload test tables; the encrypted-domain types they exercise are correctly under eql_v3 (eql_v3.int4_eq/_ord), derived from the scalar type and applied via per-query cast. Reword matrix.rs's misleading "EQL domain type name" comment on `eql_type` to say fixture/table name, and point the fixtures mod.rs value-const docs at eql-scalars::CATALOG instead of the retired tasks/codegen Python path. Comment-only; no behaviour change.
…ixtures The domain_over_domain/domain_opclass lint-catalog header described the target as an eql_v2_* domain, but the checks match both eql_v3.* and legacy public.eql_v2_*. The int2/int4 fixture doc-comments called the fixture struct names a 'domain'; the actual domains are eql_v3.int2 / eql_v3.int4 (see 88ad559). Comment-only; no behaviour change.
|
|
||
| --! @file encrypted_domain/int4/int4_eq_operators.sql | ||
| --! @brief Equality-only domain of the int4 encrypted-domain family — operator declarations. | ||
| --! @brief Operators for eql_v3.int4_eq. |
There was a problem hiding this comment.
Majority of the reference changes are tightening of comments - reduce the surface area of required variables.
| --! @return boolean | ||
| CREATE FUNCTION eql_v3.lt(a eql_v3.int4_eq, b eql_v3.int4_eq) | ||
| RETURNS boolean IMMUTABLE PARALLEL SAFE | ||
| AS $$ BEGIN RETURN eql_v3.encrypted_domain_unsupported_bool('eql_v3.int4_eq', '<'); END; $$ |
There was a problem hiding this comment.
This is the only SQL code change - removing a level of indirection.
| --! @file encrypted_domain/{{ token }}/{{ name }}_functions.sql | ||
| --! @brief Functions for {{ dom }}. | ||
| {% for e in entries %} | ||
| {% include "functions/" ~ e.kind|lower ~ ".sql.j2" -%} |
There was a problem hiding this comment.
Assured that this is the canonical jinja approach.
Is fine for now.
Summary
Plan 2 of 3 of the Rust codegen migration. Adds
crates/eql-codegen— a std-only Rust binary that renders the scalar encrypted-domain SQL surface and the committed<T>_values.rsconsts from theeql-scalarscatalog (Plan 1). Stacked oneql_v3.What's here
crates/eql-codegen):consts,operator_surface,templates/generate, an ownership-guardedwriter, and amainCLI.CREATE OPERATORmetadata. No genericKindenum; Wrapper vs Unsupported is chosen per-domain fromTerm::operators_for_terms.functions.sql.j2dispatches tofunctions/{extractor,wrapper,unsupported}.sql.j2via dynamic{% include %}off the entrykind.tests/parity.rs— byte-exact<T>_values.rs+ line-normalized int4 golden master — andmise run codegen:parity, enforced in CI.Verification
cargo test -p eql-codegen— 60 lib + 3 parity green; clippy-D warningsandfmtclean.mise run codegen:parity→ PARITY OK (generated SQL + committed<T>_values.rsunchanged).🤖 Generated with Claude Code