Skip to content

refactor: unify As through the spec table#11

Merged
hakimjonas merged 4 commits into
fix/null-safe-pipe-ops-and-skill-flagfrom
refactor/unify-as-into-spec-table
Jun 1, 2026
Merged

refactor: unify As through the spec table#11
hakimjonas merged 4 commits into
fix/null-safe-pipe-ops-and-skill-flagfrom
refactor/unify-as-into-spec-table

Conversation

@hakimjonas
Copy link
Copy Markdown
Owner

Summary

Stacked on top of #10. Fixes the architectural smell flagged in PR #10's
audit: `as` was the lone pipe op that violated "spec table is the
single source of truth," forcing every per-op invariant (null
short-circuit, completer gating, trivial-warning detection) to be
re-implemented at the `As` callsite or risk silently disagreeing
with the rest of the system.

After this PR: every pipe op (BuiltinPipeOp and the typed-argument
`As`) flows through the same `pipeOpInfoFor → spec.eval / spec.infer`
dispatch. The `As` AST class survives because its argument is the
typed `OutputFormat` enum rather than a `LamExpr`, but it carries
data only — its behaviour lives in the spec.

The three commits

1. refactor(shape): hand-build curated remediation templates, drop parser import

`check.dart` imported the parser solely to validate four constant
remediation source strings into ASTs at construction time. Hand-built
`const LamExpr` values (`{items: .}`, `{value: .}`, `to_entries`,
`{value: .} | to_entries`) get the same result without dragging the
parser into the import graph.

The cycle `pipe_ops → synthesize → check → parser → pipe_ops` is
broken. `pipe_ops.dart` can now import `synthesize.dart`.

Side effect: removes the public `Remediation()` and
`Remediation.withDisplay()` factories that took a `source` string.
Internal callers exclusively used the private `Remediation._`. The
public type and its read-only fields stay exported. Pre-1.0 breaking
change.

2. refactor: unify As through the spec table; eval takes the AST node

The architectural change.

  • `_asSpec` becomes a real spec entry: `infer` and `eval` consult
    `canWriteAs` / `synthesize` directly. No more stub.
  • `PipeOpInfo.eval` signature changes from `(ctx, args, eval)` to
    `(ctx, op, eval)`. Specs that need arguments destructure the AST
    themselves: `(op as BuiltinPipeOp).args[0]` for the generic case,
    `(op as As).target` for the typed case. Nine specs updated
    mechanically; 19 zero-arg specs use `(_, _, _)` and need no body
    change.
  • `evalBuiltinPipeOp` is renamed to `evalPipeOp` and accepts any
    pipe-op AST. Dispatches via `pipeOpInfoFor`.
  • The evaluator's switch case becomes
    `BuiltinPipeOp() || As() => evalPipeOp(...)`.
  • `infer.dart` loses its `if (expr is As)` branch and `_asShape`
    helper.
  • `evaluator.dart` loses its `As(:final target) => _as(...)` case
    and `_as` helper.

3. docs: update ast.dart and pipe_ops.dart comments to match unified dispatch

Comment-only. The `As` class doc still pointed at `evaluator.dart's
_as` (now removed), and `PipeOpParseKind.custom` claimed the spec
provides shape metadata only.

Why it matters

Concrete payoff: PR #10 commit 5 had to fix the same null
short-circuit bug in two places — `inferPipeOpShape` and the
`As` branch in `infer.dart` — because they were parallel
implementations. After this PR there's one place. Future invariants
land once.

The spec table's design comment ("Single source of truth for pipe-op
metadata, runtime, and parsing") becomes accurate instead of
aspirational.

Test plan

Notes

…r import

`check.dart` previously imported the parser to validate four curated
remediation source strings into [LamExpr] templates at construction
time. Those four templates are constants in practice — `{items: .}`,
`{value: .}`, `to_entries`, and `{value: .} | to_entries` — and
hand-building them as `const LamExpr` values gives the same result
without dragging the parser into the import graph.

Why it matters: `pipe_ops.dart` is consumed by the parser, so the
chain `pipe_ops → synthesize → check → parser → pipe_ops` was
cyclic. That cycle prevented `pipe_ops.dart` from importing
`synthesize.dart`, which forced the spec table's `_asSpec` entry
to live as a stub while the real `as` inference was special-cased
in `infer.dart`. Breaking the cycle here is the precondition for
unifying `As` through the spec table.

Side effect: the public `Remediation()` and `Remediation.withDisplay()`
factories that took a `source` string are removed. Internal callers
exclusively used `Remediation._` (the private factory). The public
type and its read-only fields stay exported, so destructuring callers
in `bin/lam.dart` are unaffected. Pre-1.0 breaking change; if a
downstream consumer needs string-to-`Remediation` they can compose
`parseQuery` and a public hand-built constructor in their own code.

No behavior change. All 1673 tests pass.
The spec table in pipe_ops.dart was already declared as the single
source of truth for pipe-op metadata, runtime, and parsing — but `as`
violated this on three axes:

1. `infer.dart` had an `if (expr is As)` branch that bypassed
   `inferPipeOpShape` to call a private `_asShape` helper. Any
   invariant added to `inferPipeOpShape` (e.g. the null short-circuit
   added in PR #10 commit 5) had to be re-implemented at the As branch
   or it would not apply.

2. `evaluator.dart` had a similar `As(:final target) => _as(...)`
   branch with its own private `_as` runtime that duplicated the
   bridge-resolution logic.

3. The spec's `_asSpec` entry was a stub: `infer` returned SAny,
   `eval` threw, both with comments saying "the real logic lives in
   <other file>".

This commit makes `_asSpec` a real spec entry. Both `infer` and
`eval` now consult the synthesis table directly via the imports
unblocked by the previous commit's cycle break. The dispatch site
becomes uniform: every pipe op (BuiltinPipeOp and As) flows through
`evalPipeOp` and `inferPipeOpShape`, and per-op invariants apply
to every op without per-AST branches.

To make `_asSpec.eval` able to read its typed `As.target`
argument, the `PipeOpInfo.eval` signature changes from
`(ctx, args, eval)` to `(ctx, op, eval)`. Specs that need
arguments destructure the AST themselves: `(op as BuiltinPipeOp).args[0]`
for the generic case, `(op as As).target` for the typed case. Nine
existing specs are updated mechanically; the 19 zero-arg specs use
`(_, _, _)` and need no body change.

`evalBuiltinPipeOp` is renamed to `evalPipeOp` and now accepts
any LamExpr, dispatching via `pipeOpInfoFor`. The evaluator's
switch case becomes `BuiltinPipeOp() || As() => evalPipeOp(...)`.

Removed:
- `_as` helper and its imports in evaluator.dart
- `_asShape` helper and its imports in infer.dart
- The `if (expr is As)` branch in inferShape

No behavior change. All 1673 tests pass, including the nine
null-short-circuit explain tests added in PR #10 commit 5 — those
now pass via the unified spec path instead of the special-case As
branch in infer.dart.
…patch

The As class doc described the runtime as living in evaluator.dart's
`_as` (now removed). The PipeOpParseKind.custom enum doc said the
spec provides shape metadata only. Both reflected the pre-refactor
world. Updated to describe the actual dispatch model: As keeps a
dedicated AST class for its typed argument, but inference and runtime
flow through the spec table.
@hakimjonas
Copy link
Copy Markdown
Owner Author

Performance regression check

Built a one-shot eval benchmark, AOT-compiled it on both the PR #10 tip
(baseline) and this branch (refactor), ran 15 fresh-process samples per
(scenario, size) pair, and reported the median of medians. Bench scripts
were used locally only and not committed (per the agreed plan).

Methodology

  • 6 scenarios: filter, map, sort_by, group_by, unique, as_toml
  • 3 sizes: 1K, 10K, 100K records
  • 200/100/30 in-process iterations per scenario at each size
  • 15 fresh-process runs per (scenario, size); reported value is median-of-medians
  • AOT-compiled via dart compile exe
  • Same machine, same kernel, no other significant load during runs

Results

scenario/size baseline µs refactor µs Δ µs Δ %
filter/1000 116 117 1 +0.86%
filter/10000 1236 1277 41 +3.32%
filter/100000 24297 24722 425 +1.75%
map/1000 91 93 2 +2.20%
map/10000 999 1020 21 +2.10%
map/100000 21138 21739 601 +2.84%
sort_by/1000 595 608 13 +2.18%
sort_by/10000 7833 7859 26 +0.33%
sort_by/100000 117456 117877 421 +0.36%
group_by/1000 170 171 1 +0.59%
group_by/10000 1728 1731 3 +0.17%
group_by/100000 29069 29034 -35 -0.12%
unique/1000 268 270 2 +0.75%
unique/10000 2789 2763 -26 -0.93%
unique/100000 40715 41186 471 +1.16%
as_toml/1000 66 68 2 +3.03%
as_toml/10000 719 728 9 +1.25%
as_toml/100000 13756 14082 326 +2.37%

Read

Median delta across the matrix is +1.16%. All deltas land within ±3.3%.

For context on noise: a stable-workload scenario like sort_by/100000
shows a ~2% intra-branch spread across the 15 samples (baseline range
116083–118303 µs, refactor range 116191–119710 µs). For lighter
scenarios like filter/10000, the per-run spread is wider — baseline
range 1192–1420 µs (19% spread), refactor range 1242–1727 µs (39%
spread, with one outlier at 1727 likely from a GC pause).

The 1–3% deltas are at or below the per-scenario noise floor for most
sizes. The "+3.32% on filter/10000" reading sits inside the
intra-branch variance band, so it can't be cleanly attributed to the
refactor versus run-to-run scheduling noise. The same holds for the
small positive bias across most cells: it's consistent with the one
extra pipeOpInfoFor runtime type test per pipe-op invocation, but the
magnitude is too small to claim with confidence above noise.

Verdict

No measurable regression beyond what the new dispatch's extra type
test and downcast would predict — single-digit-nanosecond overhead per
op, lost in real-query GC noise. The architectural cleanup ships
without a perf cost worth flagging.

The BuiltinPipeOp round-trip test skipped `custom` parseKind ops, so
the As node's resolution through the unified spec-table dispatch was
unpinned — the central invariant of this PR. Add three cases:

- As(fmt) resolves through pipeOpInfoFor to the `as` spec for every
  OutputFormat (the missing half of the round-trip test).
- evalPipeOp runs an As node end-to-end via the spec's eval field.
- evalPipeOp rejects a non-pipe-op AST with the programmer-error
  QueryError, covering the reworded guard message.
@hakimjonas hakimjonas merged commit 40c55bd into fix/null-safe-pipe-ops-and-skill-flag Jun 1, 2026
hakimjonas added a commit that referenced this pull request Jun 1, 2026
… test (#10)

* fix(eval): null|type returns "null" string, not null (Pipe short-circuit)

`_pipe` in evaluator.dart unconditionally short-circuited on null
inputs: `if (input == null) return null`. That implements the "navigation
on null returns null" contract (correct for `.field`, `flatten`,
`length`, `map`, …) but breaks ops whose semantics are *defined* over
a null context. The canonical case is `type`, whose `--explain` shape
is `SString` and whose spec eval has a deliberate `null => 'null'`
branch — but that branch was unreachable from any `Pipe` AST node
because `_pipe` returned null before the spec was consulted.

Concretely on lambé 0.11.0:

  $ lam -n 'null | type'
  null                              # JSON null, not the string "null"

  $ lam -n '(null | type) == "null"'
  false

  $ lam -n 'null | type | length'
  null                              # short-circuit cascades

The fix adds `nullSafePipeOpNames`, a set of op names that opt out of
the short-circuit. `_pipe` consults it; if the right-hand op is a
BuiltinPipeOp whose name is in the set, the null is forwarded to its
`eval`. Currently the set contains only `type`. Adding an op is a
single-line change in pipe_ops.dart.

`has` is a related candidate but its spec eval throws on non-map/non-list
context rather than returning false. Promoting `has` to nullSafe would
change the user-visible behaviour from "returns null" to "throws
QueryError", which is a separate design call. Out of scope for this fix.

Tests pin the bug at the Pipe AST level: the existing `query('type', null)`
test passed because it bypasses `Pipe` entirely (top-level evaluation
of a bare `type` expression doesn't go through `_pipe`). Three new
tests in `test/to_number_type_test.dart` exercise the Pipe form
(`null | type`, `.missing | type`, and a chained `null | type | length`)
that would have caught this regression.

Verified: 1656 existing tests still pass; the 1 failing test
(`--ndjson stdin streaming`) is a pre-existing timing flake on this
machine, reproduces on plain main without this change.

* docs+parser: correct boolean-keyword claim; add `def` non-goal hint

Two small fixes that surface together:

1. `.agents/skills/lambe/SKILL.md` previously told agents not to write
   `and` / `or` / `not` and claimed the parser would reject them.
   That is correct for `not` but wrong for `and` and `or` — both are
   documented keyword aliases for `&&` / `||` (lib/src/parser.dart
   lines 16–20). Updated the skill to accurately describe which forms
   are aliased and which require the punctuation form.

2. `def` was the only common jq idiom without a tailored
   `_jqIdiomHint`. The query `def f: . + 1; f` produced the generic
   "expected ..." token list with no guidance. Added a hint that
   names the non-goal explicitly: lambé is a bounded tree transformer;
   no `def`, no recursion, no closures.

A pinned test in `test/parse_error_format_test.dart` exercises the
`def` redirect to keep it from regressing.

* feat(cli): add --skill flag printing the embedded SKILL.md

`lam --skill` writes the embedded `.agents/skills/lambe/SKILL.md` to
stdout and exits. Lets agent harnesses install the skill regardless of
how `lam` was acquired:

  lam --skill > .agents/skills/lambe/SKILL.md

This addresses an adoption gap: `dart pub global activate lambe` does
not ship the `.agents/` directory, so pub-installed users have no way
to discover the skill on disk. The release-binary path (install.sh
from the GitHub release) includes the binary but not the skill file
either. Embedding closes both gaps with a single command and a
filesystem-free implementation that works in any execution model
(AOT, JIT, future WASM CLI).

Mechanics mirror `lambeVersion`:

- New `tool/gen_skill.dart` reads `.agents/skills/lambe/SKILL.md` and
  emits `lib/src/_skill.dart` containing a `const String lambeSkill`.
- The generator is committed-output: `lib/src/_skill.dart` lives in
  the tree so `dart compile exe` works from a fresh checkout without
  a generator pre-step. Re-run after editing the skill source.
- Generator uses a raw triple-quoted Dart string (`r'''...'''`) and
  fails loudly if the skill ever contains `'''` so the embedding
  scheme stays simple.
- `bin/lam.dart` adds an `--skill` flag, handled before any
  input-acquisition logic so it works without stdin or a file
  argument.
- Three CLI integration tests pin the wiring: header shape,
  markdown round-trip via `lam -f markdown`, and byte-for-byte
  parity with the on-disk source (catches "regenerated _skill.dart
  was not committed" regressions).
- `doc/lam.1.md` documents the flag; `doc/lam.1` regenerated from it.

`lambeSkill` is also exported from `package:lambe` so library consumers
can read the same string without shelling out.

* test(cli): replace wall-clock streaming assertion with lockstep deadlock check

The --ndjson stdin streaming test asserted that the second output line
arrived ≥150 ms before EOF, then ran a parallel timer to prove the time
gap. Both halves were sensitive to host CPU scheduling and stdio
buffering, so the test was marked CI-skipped. Useful signal lost.

The replacement reframes the property: streaming means N inputs produce
N outputs in lockstep. The test now feeds one ndjson line, awaits its
output line, then feeds the next, etc. A buffered implementation never
produces the first output before EOF, so the per-line wait times out
and the test fails with a clear streaming-broken signal. No wall-clock
comparison; no CI skip.

The helper avoids `package:async` by hand-rolling a completer queue
front of the stdout LineSplitter.

* fix(shape): model Pipe null short-circuit in inference and explain

The runtime [Pipe] evaluator short-circuits on null: a null left-hand
side returns null without invoking the right-hand op (except for ops
listed in [nullSafePipeOpNames] like `type`). This is the documented
"navigation on null returns null" contract and is pinned by tests in
null_propagation_test.dart.

The static analyser did not model this, with two visible consequences:

1. `inferPipeOpShape` saw a null input and returned [SAny] (since
   most ops' `accepts` predicate rejects null), so `null | length`
   inferred to 'any' instead of 'null'.

2. `_analyzeRejection` then saw the not-accepted predicate and
   emitted "length rejects null; this will throw at runtime" — a
   warning that contradicts the documented contract and the actual
   runtime behaviour.

Same shape of bug in `infer.dart`: `as(target)` bypasses
`inferPipeOpShape` to call the synthesis table directly, so even
after fixing the spec-table path, `null | as(toml)` still inferred
to `map<value: null>` and falsely listed TOML/HCL as writable.

Fix: thread the null short-circuit through both paths.
- `inferPipeOpShape`: when input is [SNull] and the op is not in
  [nullSafePipeOpNames], return [SNull].
- `_analyzeRejection` in explain.dart: suppress the "will throw at
  runtime" warning under the same condition.
- `inferShape` in infer.dart: gate the `As` branch the same way
  before reaching `_asShape`.

Nine pinned tests in shape_explain_test.dart cover length / has /
keys / sum / type / type|length / as(toml) / as(csv) and a negative
test (non-null shape that the op rejects must still warn).

* ci: verify generated files are in sync with their sources

`lib/src/_version.dart` (from pubspec.yaml), `lib/src/_skill.dart`
(from .agents/skills/lambe/SKILL.md), and `doc/lam.1` (from
doc/lam.1.md) are checked-in artifacts produced by tools in tool/.
Without a CI guard, a SKILL.md edit can land without regenerating
the embedded copy and `lam --skill` ships stale content until the
next manual release-prep run. The runtime byte-parity test in
cli_integration_test.dart only catches drift after a regression
has already merged.

New `generated-files-in-sync` job re-runs the three generators
and fails the build if any of the three files differ from the
committed copies. Output includes the diff and the exact commands
to run locally.

* style: dart format two files left unformatted in earlier commits

`test/to_number_type_test.dart` (added in commit a70b8d1) and
`tool/gen_skill.dart` (added in commit cc20b68) were created without
running `dart format` so the project-wide format CI job fails on
this branch. Same content, just reflowed to dart-format conventions.

Catches up the format CI before this branch is reviewed; the new
`generated-files-in-sync` job verified earlier in the branch is
unrelated and was already passing.

* refactor: unify As through the spec table (#11)

* refactor(shape): hand-build curated remediation templates, drop parser import

`check.dart` previously imported the parser to validate four curated
remediation source strings into [LamExpr] templates at construction
time. Those four templates are constants in practice — `{items: .}`,
`{value: .}`, `to_entries`, and `{value: .} | to_entries` — and
hand-building them as `const LamExpr` values gives the same result
without dragging the parser into the import graph.

Why it matters: `pipe_ops.dart` is consumed by the parser, so the
chain `pipe_ops → synthesize → check → parser → pipe_ops` was
cyclic. That cycle prevented `pipe_ops.dart` from importing
`synthesize.dart`, which forced the spec table's `_asSpec` entry
to live as a stub while the real `as` inference was special-cased
in `infer.dart`. Breaking the cycle here is the precondition for
unifying `As` through the spec table.

Side effect: the public `Remediation()` and `Remediation.withDisplay()`
factories that took a `source` string are removed. Internal callers
exclusively used `Remediation._` (the private factory). The public
type and its read-only fields stay exported, so destructuring callers
in `bin/lam.dart` are unaffected. Pre-1.0 breaking change; if a
downstream consumer needs string-to-`Remediation` they can compose
`parseQuery` and a public hand-built constructor in their own code.

No behavior change. All 1673 tests pass.

* refactor: unify As through the spec table; eval takes the AST node

The spec table in pipe_ops.dart was already declared as the single
source of truth for pipe-op metadata, runtime, and parsing — but `as`
violated this on three axes:

1. `infer.dart` had an `if (expr is As)` branch that bypassed
   `inferPipeOpShape` to call a private `_asShape` helper. Any
   invariant added to `inferPipeOpShape` (e.g. the null short-circuit
   added in PR #10 commit 5) had to be re-implemented at the As branch
   or it would not apply.

2. `evaluator.dart` had a similar `As(:final target) => _as(...)`
   branch with its own private `_as` runtime that duplicated the
   bridge-resolution logic.

3. The spec's `_asSpec` entry was a stub: `infer` returned SAny,
   `eval` threw, both with comments saying "the real logic lives in
   <other file>".

This commit makes `_asSpec` a real spec entry. Both `infer` and
`eval` now consult the synthesis table directly via the imports
unblocked by the previous commit's cycle break. The dispatch site
becomes uniform: every pipe op (BuiltinPipeOp and As) flows through
`evalPipeOp` and `inferPipeOpShape`, and per-op invariants apply
to every op without per-AST branches.

To make `_asSpec.eval` able to read its typed `As.target`
argument, the `PipeOpInfo.eval` signature changes from
`(ctx, args, eval)` to `(ctx, op, eval)`. Specs that need
arguments destructure the AST themselves: `(op as BuiltinPipeOp).args[0]`
for the generic case, `(op as As).target` for the typed case. Nine
existing specs are updated mechanically; the 19 zero-arg specs use
`(_, _, _)` and need no body change.

`evalBuiltinPipeOp` is renamed to `evalPipeOp` and now accepts
any LamExpr, dispatching via `pipeOpInfoFor`. The evaluator's
switch case becomes `BuiltinPipeOp() || As() => evalPipeOp(...)`.

Removed:
- `_as` helper and its imports in evaluator.dart
- `_asShape` helper and its imports in infer.dart
- The `if (expr is As)` branch in inferShape

No behavior change. All 1673 tests pass, including the nine
null-short-circuit explain tests added in PR #10 commit 5 — those
now pass via the unified spec path instead of the special-case As
branch in infer.dart.

* docs: update ast.dart and pipe_ops.dart comments to match unified dispatch

The As class doc described the runtime as living in evaluator.dart's
`_as` (now removed). The PipeOpParseKind.custom enum doc said the
spec provides shape metadata only. Both reflected the pre-refactor
world. Updated to describe the actual dispatch model: As keeps a
dedicated AST class for its typed argument, but inference and runtime
flow through the spec table.

* test(shape): pin As through unified evalPipeOp dispatch

The BuiltinPipeOp round-trip test skipped `custom` parseKind ops, so
the As node's resolution through the unified spec-table dispatch was
unpinned — the central invariant of this PR. Add three cases:

- As(fmt) resolves through pipeOpInfoFor to the `as` spec for every
  OutputFormat (the missing half of the round-trip test).
- evalPipeOp runs an As node end-to-end via the spec's eval field.
- evalPipeOp rejects a non-pipe-op AST with the programmer-error
  QueryError, covering the reworded guard message.
@hakimjonas hakimjonas deleted the refactor/unify-as-into-spec-table branch June 1, 2026 07:26
@hakimjonas hakimjonas mentioned this pull request Jun 1, 2026
7 tasks
hakimjonas added a commit that referenced this pull request Jun 1, 2026
…#12)

PR #10 fixed six bugs against the 0.11.0 binary surfaced by an audit
against the published skill. PR #11 (stacked) addressed the underlying
smell: `as` was the lone pipe op outside the spec table, so per-op
invariants like the null short-circuit had to be reimplemented at its
callsite. Both PRs landed before the release tag; this commit bumps
versions, regenerates embedded artefacts, and corrects the diacritic
spelling of "Lambë" (the umlaut form is the Quenya original; both an
acute-form typo and a bare-ASCII typo had spread across docs and
prose).

Test count: 1676. Breaking: PipeOpInfo.eval signature and the public
Remediation source-string factories.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant