Lambé 0.9.0 #7

Merged
hakimjonas merged 67 commits into main from test/rumil-0.7
May 23, 2026

hakimjonas commented May 23, 2026

Summary

The schema-feedback-loop release. Declare a JSON Schema, check queries
against it, round-trip schemas with the ecosystem. Plus: a 27-class
pipe-op AST consolidation, a rumil_tokens-based REPL highlighter,
the text op for markdown prose extraction, a -n / --null-input
flag, richer --explain warnings, JSON-string keys in object
construction, and an end-to-end CLI ~3.3× faster than 0.8.0 on
parse-bound workloads.

Highlights:

  • Pipe-op AST consolidation. The 27 per-op AST classes
    (FilterOp, MapOp, SortOp, …) collapse into a single
    BuiltinPipeOp(name, args); pipe_ops.dart's spec table is now
    the only place per-op behaviour lives.
  • REPL highlighter on rumil_tokens. The 100-line hand-rolled
    tokenizer is gone; the highlighter consumes a typed token stream.
    Pipe op names colour as keywords; redraw on every keystroke.
  • Markdown text op. Walks a node tree and concatenates prose
    recursively, replacing the structurally broken .children[0].text
    pattern. Soft breaks become ' ', hard breaks become '\n'.
  • Schemas as a first-class contract. --schema <path> takes a
    JSON Schema; --print-shape emits one. lambe_check,
    lambe_explain, lambe_print_shape MCP tools.
  • Performance. lam --print-shape big.json (1.5 MB JSON):
    2.4 s → 732 ms (3.28×). Inherited from rumil 0.7's combinator
    work + rumil_parsers 0.8.0's JSON AST split + capture-based
    parsing.

Full release notes in CHANGELOG.md.

Test plan

  • dart test — 1653/1653 pass
  • dart analyze — clean
  • dart format --set-exit-if-changed . — clean
  • dart pub publish --dry-run — 0 warnings, 211 KB compressed
  • pana --no-warning --json — 160/160
  • tool/lint_changelog.sh — all invariants pass
  • CHANGELOG self-validation gated in CI
  • Live REPL highlighter smoke test — verified colours, completion,
    text op, :print-shape, history, JSON-string keys
  • CLI bench rerun — 3.28× / 3.37× on parse-bound workloads, no
    regressions
  • Resolves against published rumil_parsers ^0.8.0

Companion release

rumil_parsers 0.8.0, shipped on pub.dev today, carries the JSON AST
split (JsonNumber → JsonInt | JsonDouble), the HCL AST split
(HclNumber → HclInt | HclDouble), the HCL decoder N=1-vs-N≥2 fix,
capture-based number/string parsing, and the common.floatingPoint
precision fix that YAML inherits.

hakimjonas added 30 commits May 2, 2026 21:14
An opt-in escape hatch for CSV/TSV: non-scalar cells are encoded as
JSON strings inline instead of being refused. The default stays at
0.8.0's refuse behavior.

Core
- CellPolicy { refuse, json } enum in output_format.dart
- formatOutput, canWriteAs, canWriteShapeAs, requirementFor, explain
  all take an optional CellPolicy flattenCells = CellPolicy.refuse
- Under json, requirementFor(csv/tsv) widens MustBeFlatList to
  MustBeList; the writer JSON-encodes list- or map-valued cells via
  const JsonEncoder().convert(cell); the shape check accepts any list
  at the root
- _scalarCell renamed to _cell (no longer always-scalar)
- as(fmt) combinator deliberately does NOT read the CLI/REPL/MCP
  policy; stays a pipeline-level transform so queries remain portable

NotWritable.hints (new field)
- List<String> hints on NotWritable, default const []
- _hintsFor populates one hint when: format is csv/tsv, policy is
  refuse, and the root shape is already SList (so only the cells are
  the problem). Hint text names all three surfaces: --flatten-cells
  json (CLI), :flatten-cells json (REPL), flatten_cells=json (MCP)
- OutputShapeError.hints getter; _render appends each hint on its
  own line after the suggestion list
- Uniform channel means CLI, REPL, and MCP render the same guidance
  without re-deriving the condition

--explain
- explain() takes CellPolicy; threads into canWriteShapeAs for the
  writability lists
- ExplainReport.flattenCells field round-trips the policy
- renderExplain emits "Cell policy: json" footer only when non-default,
  so default output is byte-for-byte unchanged

CLI (bin/lam.dart)
- --flatten-cells option, allowed [refuse, json], defaults to refuse
- Threaded into _writeWithBridge and the --explain path

REPL (lib/src/repl.dart)
- :flatten-cells <policy> session command with validation
- Threaded through _formatResult, _encode, _handleShapeError
- :help entry

MCP (bin/mcp_server.dart)
- flatten_cells parameter on lambe_query inputSchema
- Threaded into formatOutput; JSON bypass path unchanged
- hints key in _renderShapeErrorPayload

Docs
- doc/lam.1.md: --flatten-cells option and :flatten-cells REPL command
- doc/lam.1: regenerated via tool/manpage.dart
- CHANGELOG.md: new 0.9.0-dev section
- README.md: non-scalar-cells subsection + CLI example

Tests (+106)
- csv_element_shape_test.dart: 5 hint tests
- shape_explain_test.dart: 4 CellPolicy threading tests
- shape_output_consistency_test.dart: 97-case hint matrix (every
  representative value × every format, verifying hints fire exactly
  for csv/tsv refuse + SList root)

Quality gates: dart analyze clean, 1256 tests pass (was 1150),
dart format clean, pana 160/160, manpage round-trip matches.
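
A minimal sketch of the json cell policy, for illustration only. The
CellPolicy enum and the const JsonEncoder().convert(cell) call come from
the notes above; encodeCell, quoteCsv, and writeCsvRow are hypothetical
names and are not the functions in output_format.dart.

```dart
import 'dart:convert';

/// Policy for non-scalar CSV/TSV cells, mirroring the enum named above.
enum CellPolicy { refuse, json }

/// Encodes one cell. Hypothetical helper; the real `_cell` may differ.
String encodeCell(Object? cell, CellPolicy policy) {
  if (cell is List || cell is Map) {
    if (policy == CellPolicy.refuse) {
      throw FormatException('non-scalar cell: $cell');
    }
    // Under `json`, list- and map-valued cells become inline JSON text.
    return quoteCsv(const JsonEncoder().convert(cell));
  }
  return quoteCsv(cell == null ? '' : '$cell');
}

/// Minimal CSV quoting: wrap the text when it holds a comma, a double
/// quote, or a line break, and double any embedded quotes.
String quoteCsv(String text) {
  if (!text.contains(RegExp(r'[",\n\r]'))) return text;
  final escaped = text.replaceAll('"', '""');
  return '"$escaped"';
}

String writeCsvRow(List<Object?> row, CellPolicy policy) =>
    row.map((cell) => encodeCell(cell, policy)).join(',');

void main() {
  print(writeCsvRow(['ada', [1, 2]], CellPolicy.json)); // ada,"[1,2]"
}
```

Under CellPolicy.refuse the same row throws, which matches the default
behaviour described above.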
…ring

Pre-commit audit on track D caught a real bug in NotWritable.hints
before track B cements the pattern.

The problem: hints were List<String> with CLI/REPL/MCP syntax baked
into a single string. An MCP agent receiving the error got
"--flatten-cells json (CLI) / :flatten-cells json (REPL) / flatten_cells=json (MCP)"
as an undifferentiated blob, and had to string-parse to find the
actionable parameter. A REPL user saw CLI flag syntax they could not
type; a CLI user saw REPL colon-commands.

The fix: structured Hint type in lib/src/shape/check.dart, exported
from package:lambe/lambe.dart. Each hint carries label, cliFlag,
replCommand, mcpParameter (a (String, String) record), and
explanation. Each surface renders only its native form:

- OutputShapeError.message: no hints baked in (stays surface-neutral).
- CLI (bin/lam.dart): writes "Or pass ${cliFlag}: ${explanation}" to
  stderr after the error message, via _writeHintsCli.
- REPL (lib/src/repl.dart): writes "Or run ${replCommand}: ${explanation}"
  in _handleShapeError, before the bridge prompt.
- MCP (bin/mcp_server.dart): emits structured JSON
  {label, parameter, value, explanation} in the payload.

Tests updated to match:
- csv_element_shape_test.dart hint tests now check Hint fields.
- shape_output_consistency_test.dart hint matrix pins cliFlag value.
- Added an explicit assertion that OutputShapeError.message does NOT
  bake any of the three surface syntax forms.

1256 tests pass, pana 160/160. Not tested: surface-level rendering
(that CLI stderr contains the hint line, that MCP payload shape
matches). Manually verified; a regression-proof test belongs in the
end-of-four-tracks audit.
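
A sketch of what the structured hint could look like. The field names
and the (String, String) record come from the message above; the
constructor shape, the renderCli/renderRepl/renderMcp helpers, and the
label and explanation strings are assumptions made for the example.

```dart
/// Surface-neutral hint: each surface picks its own field to render.
class Hint {
  const Hint({
    required this.label,
    required this.cliFlag,
    required this.replCommand,
    required this.mcpParameter,
    required this.explanation,
  });

  final String label;
  final String cliFlag;
  final String replCommand;
  final (String, String) mcpParameter; // (parameter name, value)
  final String explanation;
}

// Hypothetical renderers, one per surface.
String renderCli(Hint h) => 'Or pass ${h.cliFlag}: ${h.explanation}';

String renderRepl(Hint h) => 'Or run ${h.replCommand}: ${h.explanation}';

Map<String, String> renderMcp(Hint h) => {
      'label': h.label,
      'parameter': h.mcpParameter.$1,
      'value': h.mcpParameter.$2,
      'explanation': h.explanation,
    };

void main() {
  const hint = Hint(
    label: 'flatten cells', // illustrative text, not the shipped wording
    cliFlag: '--flatten-cells json',
    replCommand: ':flatten-cells json',
    mcpParameter: ('flatten_cells', 'json'),
    explanation: 'encode non-scalar cells as JSON strings',
  );
  print(renderCli(hint));
  print(renderRepl(hint));
  print(renderMcp(hint));
}
```

Because the MCP form is a map rather than a sentence, an agent can read
parameter and value directly instead of string-parsing a blob.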

Evaluate each line of ndjson/jsonl input as an independent JSON
document, no shared state between lines, one compact JSON result per
line out. Covers the "tail a log" use case at the CLI layer without
touching the core "AST over in-memory tree" model.

Library
- New `queryNdjson(Iterable<String> lines, LamExpr ast)` in
  lib/lambe.dart. Lazy via `sync*` so a caller (or a pipe into `take`)
  can pull only as many results as needed; fail-fast with a `line N:`
  prefix on the first parse or evaluation error. Empty and
  whitespace-only lines are skipped silently.

CLI
- New `--ndjson` flag in bin/lam.dart. Auto-enabled when the file
  extension is `.ndjson` or `.jsonl`, consistent with the existing
  auto-detection convention for .csv, .yaml, etc.
- File input reads all lines eagerly (bounded size). Stdin uses a
  lazy `sync*` iterator so `tail -f app.log | lam --ndjson '.level'`
  emits each result as the line arrives — verified with a
  time-stamped streaming test (line N emerges with N*0.5s delay).
- Rejects combining --ndjson with --interactive, --schema, --assert,
  --explain, or --to <non-json>. The mode is narrow on purpose;
  other output formats and non-execution modes don't combine
  sensibly with per-line eval.

Tests
- New test/ndjson_test.dart: 14 tests covering basic per-line
  evaluation, empty-line skipping, parse/eval error annotation with
  line numbers, lazy iteration (results yielded before later error),
  and complex pipe queries per line.

Docs
- doc/lam.1.md: --ndjson option block and a "line-delimited JSON"
  example.
- doc/lam.1: regenerated.
- CHANGELOG.md: new bullet under 0.9.0-dev Added.
- README.md: CLI example.

Quality gates: dart analyze clean, 1270 tests pass (was 1256, +14),
dart format clean, pana 160/160, manpage round-trip matches.
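
The lazy per-line evaluation can be sketched as below. This is not
queryNdjson itself: it uses dart:convert's jsonDecode instead of the
rumil parser, and a plain callback (the hypothetical Evaluate typedef)
stands in for evaluating a LamExpr.

```dart
import 'dart:convert';

/// Stand-in for evaluating a parsed query against one document.
typedef Evaluate = Object? Function(Object? document);

/// Parses and evaluates one line, prefixing failures with `line N:`.
String _evaluateLine(String line, int lineNumber, Evaluate evaluate) {
  try {
    return jsonEncode(evaluate(jsonDecode(line)));
  } on Object catch (e) {
    throw FormatException('line $lineNumber: $e');
  }
}

/// Lazily yields one compact JSON result per non-blank input line.
Iterable<String> queryNdjsonSketch(
  Iterable<String> lines,
  Evaluate evaluate,
) sync* {
  var lineNumber = 0;
  for (final line in lines) {
    lineNumber++;
    if (line.trim().isEmpty) continue; // blank lines are skipped silently
    yield _evaluateLine(line, lineNumber, evaluate);
  }
}

void main() {
  final lines = ['{"level":"info"}', '', '{"level":"warn"}', 'not json'];
  // take(2) stops pulling after two results, so the malformed fourth
  // line is never parsed.
  final results = queryNdjsonSketch(
    lines,
    (doc) => (doc as Map<String, dynamic>)['level'],
  ).take(2).toList();
  print(results); // ["info", "warn"]
}
```

Dropping the take(2) makes the same input fail on the fourth line with a
message that starts with "line 4:".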

Audit after track C found that track C and track D had strong
library-level unit tests but no coverage for the wiring that actually
exposes them to users: the CLI argument parsing, the ndjson file-
extension auto-detect, the mode-combination guards, the stdin
streaming claim, and the MCP payload shape. Manual smoke tests are
not regression-proof; a wiring regression would ship silently.

Changes:

lib/src/mcp_payload.dart (new, factored out of bin/mcp_server.dart):
  renderMcpShapeErrorPayload takes an OutputShapeError + expression
  and returns the JSON string an MCP agent receives. Pure function,
  no I/O, testable without starting the MCP server as a subprocess.

bin/mcp_server.dart: calls the library function; private method
  _renderShapeErrorPayload removed.

lib/lambe.dart: exports renderMcpShapeErrorPayload.

test/mcp_payload_test.dart (new, 5 tests):
  - Payload parses as JSON with all documented keys.
  - Suggestions carry 1-based ids and composed `apply_as` queries.
  - Hints carry structured {parameter, value} pairs.
  - Hints do NOT leak CLI or REPL syntax into the agent-facing JSON.
  - Empty hints still expose an empty list (key always present).

test/cli_integration_test.dart (new, 18 tests):
  Shells out to `dart bin/lam.dart` with Process.start. Coverage:
  - Explicit --ndjson flag produces per-line compact JSON.
  - .ndjson and .jsonl file extensions auto-enable the mode.
  - Stdin with --ndjson works via pipe.
  - Empty and whitespace-only lines skipped silently.
  - Malformed line exits 1 with "line N" in stderr.
  - File-not-found exits 1 with a clear error.
  - Five mode-combo guards: --ndjson rejects --interactive, --schema,
    --assert, --explain, --to yaml. Accepts --to json (redundant).
  - Streaming: four stdin lines with 500ms gaps, asserts the
    last two inter-output gaps are >= 300ms. A buffered
    implementation would deliver all four near EOF with near-zero
    gaps. Proves tail -f | lam --ndjson emits as lines arrive.
  - --flatten-cells refuse writes CLI-form hint (--flatten-cells
    json) to stderr, NOT REPL or MCP syntax. Regression guard for
    the surface-specific rendering chosen in the track D audit.
  - --flatten-cells json produces CSV with JSON-encoded cells.
  - --explain --flatten-cells json widens writable formats and
    prints "Cell policy: json" footer.
  - --explain without the flag: no footer, csv in "Not writable as".

What's deliberately not covered:
  - REPL I/O (ReadLine-driven, not testable without a real TTY).
  - Exact error message phrasing (substring assertions only, so
    phrasing can improve without breaking tests).
  - MCP server subprocess JSON-RPC (the payload function it calls is
    tested directly; the server wiring is a single dart_mcp method).

Quality gates: dart analyze clean, 1293 tests pass (was 1270, +23),
dart format clean, pana 160/160, manpage round-trip matches.

Three sub-features added to the 0.8.0 explain infrastructure:

Runtime-rejection warnings (always on)
  Pipe-op acceptance predicates in pipe_ops.dart already know which
  input shapes each op rejects. Explain now surfaces the mismatch
  statically: `.config | filter(.x)` on a known map produces
  "filter rejects map<...>; this will throw at runtime". SAny inputs
  are ignored (cannot prove); compatible inputs pass silently. The
  new _analyzeRejection helper runs alongside the existing
  _analyzePredicate in explain()'s per-stage loop.

Trivial-result warnings (opt-in)
  For sort_by, group_by, map, unique_by: when the argument references
  a field provably absent from the element shape, emit a warning
  saying "the result is trivial". Reuses _missingFieldPath (the
  helper that already powers empty-filter warnings) on the element
  shape of the input list. Opt-in via explain(..., includeTrivial:
  true) because legitimate uses exist (stable no-op sort, explicit
  null projection).

Structured JSON output
  renderExplainJson(ExplainReport) emits the full report as JSON with
  snake_case keys (stages, warnings, writable_as, not_writable_as,
  flatten_cells). Warning kinds serialize as empty_filter,
  runtime_rejection, trivial_result. Shapes render as strings via
  renderShape; agents that need structural shape access should call
  the lambe_schema MCP tool separately. Text output from
  renderExplain is unchanged byte-for-byte; JSON is pure-additive.

Supporting API changes
  - WarningKind enum: emptyFilter, runtimeRejection, trivialResult.
  - ExplainWarning.kind field (required at construction).
  - explain() gains `bool includeTrivial = false` parameter.

CLI wiring (bin/lam.dart)
  - --explain-trivial flag: implies --explain, enables trivial class.
  - --explain-json flag: implies --explain, switches to JSON renderer.
  - Both compose: --explain-trivial --explain-json emits JSON
    including trivial_result warnings.
  - --ndjson rejection of --explain remains correct (covers the
    implies cases via the existing guard).

Docs
  - doc/lam.1.md: two new option blocks, --explain description extended.
  - doc/lam.1: regenerated.
  - CHANGELOG.md: 0.9.0-dev bullet covering all three sub-features.
  - README.md: one paragraph added to the --explain section.

Tests (+24)
  shape_explain_test.dart (+17):
    - 4 runtime-rejection cases (filter/sum on map, SAny untouched,
      compatible input untouched).
    - 6 trivial-result cases (sort_by/group_by/map flagged when opt-in,
      NOT flagged by default, existing field untouched, SAny element
      cannot prove).
    - 6 JSON renderer cases (top-level shape, stage/warning/
      writability fields, snake_case kind names, flatten_cells).
  cli_integration_test.dart (+7): runtime-rejection in default output,
    trivial-result gated on --explain-trivial, --explain-json shape,
    the "implies --explain" behavior for both sub-flags, combined
    usage, and the --ndjson --explain-json rejection path.

Quality gates: dart analyze clean, 1317 tests pass (was 1293, +24),
dart format clean, pana 160/160, manpage round-trip matches.

Post-track-B audit caught two real gaps; one closed, one deferred
with documentation.

Structured shapes in --explain-json
  renderExplainJson previously emitted stage shapes as text strings
  via renderShape ("list<map<name: string>>"). Agents consuming the
  JSON had to re-parse that text to access structure, defeating the
  point of a JSON mode. Fixed by adding shapeToJson(Shape) in
  lib/src/shape/shape.dart, a sealed-ADT walk that produces
  {kind, ...} nested trees:
    {"kind": "list", "element": {"kind": "map", "fields": {...}}}
  renderExplainJson now uses this form. Text output from
  renderExplain is byte-for-byte unchanged. Exported from the
  library barrel.

Rejection cascade coverage
  New test in shape_explain_test.dart verifies the interaction
  between runtime-rejection warnings and inferShape's SAny widening:
  `. | filter(.a) | sort` starting from an SMap produces exactly one
  rejection warning (on filter, stage 1). Sort sees the post-filter
  ctx as SAny and does NOT emit its own rejection. Prevents
  double-warning regressions.

REPL surface verified manually
  :flatten-cells colon-command, session-state persistence, and the
  REPL-native hint rendering (":flatten-cells json" not
  "--flatten-cells json") all verified in a real session on a
  list-of-maps-with-lists fixture. A ReadLine-seam refactor would be
  needed for automated REPL tests; documented in memory as a known
  gap accepted for 0.9.0.

Tests (+9 total)
  shape_test.dart (+7): shapeToJson on every Shape constructor plus
    nested round-trips, empty map, empty list, and JSON
    round-trippability.
  shape_explain_test.dart (+1 new + 1 rewrite):
    - Rewrote the "stages carry shape" test to assert the structured
      {kind: list, element: {kind: string}} form instead of the old
      string.
    - New "rejection cascade" test pinning single-warning behavior.

Quality gates: dart analyze clean, 1325 tests pass (was 1317),
dart format clean, pana 160/160.
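
A cut-down sketch of the sealed-ADT walk behind the structured form. The
variant set is reduced to five classes for brevity, and the "any" and
"number" kind strings are assumptions; only "list", "map", and "string"
appear in the text above.

```dart
import 'dart:convert';

sealed class Shape {
  const Shape();
}

class SAny extends Shape {
  const SAny();
}

class SString extends Shape {
  const SString();
}

class SNumber extends Shape {
  const SNumber();
}

class SList extends Shape {
  const SList(this.element);
  final Shape element;
}

class SMap extends Shape {
  const SMap(this.fields);
  final Map<String, Shape> fields;
}

/// Exhaustive switch: adding a Shape variant without a case here is a
/// compile-time error.
Map<String, Object?> shapeToJson(Shape shape) => switch (shape) {
      SAny() => {'kind': 'any'},
      SString() => {'kind': 'string'},
      SNumber() => {'kind': 'number'},
      SList(:final element) => {
          'kind': 'list',
          'element': shapeToJson(element),
        },
      SMap(:final fields) => {
          'kind': 'map',
          'fields': {
            for (final entry in fields.entries)
              entry.key: shapeToJson(entry.value),
          },
        },
    };

void main() {
  const shape = SList(SMap({'name': SString()}));
  print(jsonEncode(shapeToJson(shape)));
  // {"kind":"list","element":{"kind":"map","fields":{"name":{"kind":"string"}}}}
}
```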

Decision record for the schema-as-contract feature. Resolves the
design questions from the handover plus several the handover didn't
raise.

Format: JSON Schema subset (not custom DSL). Subset is
type/properties/items/required. Value-level constraints
(minimum/pattern/enum/etc) are rejected at load time with
per-keyword errors. Structural combinators (allOf/oneOf/$ref/if-then
/dependencies) rejected. Unknown keywords ignored per JSON Schema
extensibility convention.

Key call: rumil_parsers.parseJson does the JSON parsing for free
with line-aware errors, so the parser collapses to ~50 lines of
exhaustive switch on JsonValue. My earlier "Lambe DSL is cheaper"
argument died once I accounted for that.

Shape ADT: SOptional(Shape) added. Required by JSON Schema's
`required` semantics — shipping without it would silently lie
whenever users have optional fields. Termination and the bounded-
language contract are preserved: SOptional lives in the static
analyzer, not the query language.

Disagreement: schema augments shapeOf(data); error on concrete-type
conflict. Keeps --explain honest. Structural validation falls out as
a side effect; no separate --validate command for 0.9.0.

CLI: rename --schema to --print-shape (first breaking change in
0.9.0); add --schema <path>. --print-shape output becomes JSON
Schema, round-trippable with --schema input. Sibling convention:
data.json paired with data.schema.json.

MCP: new schema parameter on lambe_query; rename lambe_schema to
lambe_print_shape; new lambe_check tool for on-demand validation.

Explicit non-goals called out: no runtime coercion, no value-level
constraints, no conditional schemas, no external $ref, no templating.
Lambe is not CUE and shouldn't try to be.

Implementation plan: SOptional first (compiler finds all the switch
sites), then parser/loader/merge, then CLI/REPL/MCP wiring, then
tests and docs. Estimated ~1 week.

Positioning sharpened via research: Lambe is "a query language for
structured data that shows you what you're working with" — use it
when you don't already know the data. Not "typed jq" (that market
never materialized in 10 years). Not "parity with CUE" (different
audience). The shape feedback loop is the actual win.

Add SOptional(Shape) to the sealed shape ADT. This is the variant
JSON Schema's `required` semantics demand and the shape the wider
"shape as feedback loop" positioning needs. Shipping the schema
feature without it would silently misrepresent optional fields.

Constructor semantics
  SOptional(SOptional(x)) collapses to SOptional(x) via the factory.
  Guarantees no stacked optionality anywhere, so downstream code
  never has to handle the degenerate case.

Acceptance semantics (op predicates)
  Optional unwraps for op acceptance: `filter` on SOptional<SList<T>>
  is accepted. The potential absence is surfaced by the explain
  runtime-rejection analyzer, not by the acceptance predicate.
  Helpers in pipe_ops.dart (_acceptsList, _acceptsMap, etc.) all
  unwrap via a shared _unwrap helper.

Root-requirement semantics (output formats)
  MustBeMap / MustBeList / MustBeFlatList do NOT unwrap. An optional
  at the root means "value may be absent entirely"; TOML/HCL/CSV
  cannot serialize an absence. Users must materialize a default
  before the --to step. The distinction between op-acceptance and
  root-requirement is deliberate: ops tolerate runtime null
  propagation, root serializers don't.

Inference propagation
  Field access on SMap with an optional field yields SOptional<T>.
  Field access on SOptional<SMap<...>> (null propagation) also
  yields SOptional<T>. The factory collapses nested cases so the
  result is never SOptional<SOptional<X>>.

Analyzer integration
  - Empty-filter check unwraps optional bool predicates; an optional
    bool may be true, so not "provably non-boolean."
  - Missing-field path check walks through optional wrappers to
    inspect the underlying SMap fields.
  - Runtime-rejection check does NOT unwrap: optional counts as a
    potential mismatch worth warning about.

Completer integration
  Tab completion unwraps optional for field enumeration and inner-
  expression resolution. An optional list still completes against
  its element shape in `.users | map(<TAB>)` contexts.

Serialization
  renderShape: optional<inner>.
  shapeToJson: {kind: optional, inner: ...}.

Tests (+16 across two files)
  shape_test.dart: render, serialize, equality, nested collapse,
    embedding in other shapes.
  shape_explain_test.dart: field propagation, access through
    optional wrapper, op acceptance, missing-field walk, optional
    bool predicate, root rejection by TOML, nested collapse via
    inference, JSON round-trip.

Quality gates: dart analyze clean, 1338 tests pass (was 1325; +13
net, i.e. the 16 above minus overlap with existing tests),
dart format clean, pana 160/160. Zero test regressions.

This is step 1 of the track A implementation plan in
doc/schema-design.md. Next: JSON Schema subset parser.
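
The collapsing constructor can be sketched with a factory that returns
its argument when it is already optional. The class layout and the
public unwrap helper are assumptions; the real ADT has more variants and
keeps _unwrap private to pipe_ops.dart.

```dart
sealed class Shape {
  const Shape();
}

class SString extends Shape {
  const SString();

  @override
  String toString() => 'string';
}

class SOptional extends Shape {
  const SOptional._(this.inner);

  /// Collapses SOptional(SOptional(x)) to SOptional(x), so stacked
  /// optionality can never be constructed.
  factory SOptional(Shape inner) =>
      inner is SOptional ? inner : SOptional._(inner);

  final Shape inner;

  @override
  String toString() => 'optional<$inner>';
}

/// Op acceptance looks through optionality; root requirements do not.
Shape unwrap(Shape shape) => shape is SOptional ? shape.inner : shape;

void main() {
  final stacked = SOptional(SOptional(const SString()));
  print(stacked); // optional<string>
  print(unwrap(stacked)); // string
}
```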

Add parseJsonSchema(String): Shape in lib/src/schema/parser.dart.
Walks the JsonValue output of rumil_parsers' parseJson, mapping
four keywords onto the shape ADT:

- type: string selects the target kind
- properties: nested field schemas for "object"
- items: element schema for "array"
- required: which properties stay concrete vs become SOptional

Rejected keywords (23 total) each produce a targeted error with a
JSON path pointing at the site. Rejections cover value-level
constraints (minimum/maximum/pattern/enum/format/minLength/maxLength
/minItems/maxItems/uniqueItems/const/multipleOf), structural
combinators (allOf/oneOf/anyOf/not), conditionals (if/then/else
/dependencies/dependentRequired/dependentSchemas), references
($ref/$defs/definitions), and extra object constraints
(additionalProperties/patternProperties/propertyNames).

Unknown keywords are tolerated per JSON Schema's extensibility
convention — $schema, $id, title, description all pass through as
ignored metadata.

Error diagnostics carry a JSON path ($.properties.a.properties.b)
so users can find the offending nested schema without scanning the
whole file.

Tests (41 new):
- 5 scalar types round-trip (null, bool, number, integer→number,
  string).
- 3 array variants (no items, scalar items, object items).
- 5 object + required combinations (empty, all required, no
  required, partial required, nested object with own required).
- 18 rejection tests (one per keyword class).
- 2 metadata-tolerance tests.
- 7 error-diagnostic tests (invalid JSON, non-object root, missing
  type, unsupported type, properties type error, required type
  error, nested error with path).
- 2 realistic round-trip scenarios (user record, list of records).

Exported parseJsonSchema from package:lambe/lambe.dart.

Quality gates: dart analyze clean, 1379 tests pass (was 1338, +41),
dart format clean, pana 160/160.

Step 2 of 9 in doc/schema-design.md's implementation plan. Next:
loader (file IO + sibling auto-detect) and mergeSchemaWithData
(disagreement-is-error semantics).
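
A rough sketch of the four-keyword walk, under several assumptions: it
decodes with dart:convert rather than rumil_parsers, it models scalars
with one hypothetical SScalar class, it rejects only a handful of the
keywords listed above, and parseSchema is a stand-in name rather than
the exported parseJsonSchema.

```dart
import 'dart:convert';

sealed class Shape {
  const Shape();
}

class SScalar extends Shape {
  const SScalar(this.name);
  final String name;
  @override
  String toString() => name;
}

class SOptional extends Shape {
  const SOptional(this.inner);
  final Shape inner;
  @override
  String toString() => 'optional<$inner>';
}

class SList extends Shape {
  const SList(this.element);
  final Shape element;
  @override
  String toString() => 'list<$element>';
}

class SMap extends Shape {
  const SMap(this.fields);
  final Map<String, Shape> fields;
  @override
  String toString() {
    final body =
        fields.entries.map((e) => '${e.key}: ${e.value}').join(', ');
    return 'map<$body>';
  }
}

// Illustrative subset of the rejected keywords.
const rejected = {'minimum', 'pattern', 'enum', 'allOf', 'oneOf', r'$ref'};

Shape parseSchema(String source) => _parse(jsonDecode(source), r'$');

Shape _parse(Object? node, String path) {
  if (node is! Map<String, dynamic>) {
    throw FormatException('$path: schema must be an object');
  }
  for (final key in node.keys) {
    if (rejected.contains(key)) {
      throw FormatException('$path: unsupported keyword "$key"');
    }
  }
  final type = node['type'];
  switch (type) {
    case 'null' || 'boolean' || 'string':
      return SScalar(type as String);
    case 'number' || 'integer':
      return const SScalar('number'); // integer widens to number
    case 'array':
      final items = node['items'];
      // 'any' is a placeholder for an unconstrained element shape.
      return SList(
        items == null ? const SScalar('any') : _parse(items, '$path.items'),
      );
    case 'object':
      final props = (node['properties'] as Map<String, dynamic>?) ?? const {};
      final required = ((node['required'] as List?) ?? const []).toSet();
      return SMap({
        for (final e in props.entries)
          e.key: required.contains(e.key)
              ? _parse(e.value, '$path.properties.${e.key}')
              : SOptional(_parse(e.value, '$path.properties.${e.key}')),
      });
    default:
      throw FormatException('$path: missing or unsupported type');
  }
}

void main() {
  const schema = '''
{"type": "object",
 "required": ["name"],
 "properties": {"name": {"type": "string"}, "email": {"type": "string"}}}''';
  print(parseSchema(schema)); // map<name: string, email: optional<string>>
}
```

Threading the path through the recursion is what lets a nested rejection
report a location such as $.properties.a instead of a bare keyword name.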

Add lib/src/schema/loader.dart with three functions:

loadSchemaFromFile(path)
  Reads and parses a schema file. QueryError on missing file or
  parser rejection.

loadSchemaForData({explicitSchemaPath, dataPath})
  Explicit path wins. Otherwise auto-detects a <dataPath>.schema.json
  sibling. Returns null when neither exists. Handles extension
  rewriting (data.json -> data.schema.json, events.ndjson ->
  events.schema.json).

mergeSchemaWithData(schema, data)
  Schema-augments-data merge per doc/schema-design.md:

  - SAny on either side: the other wins.
  - SOptional + present data: strip optional, merge inners (field is
    concretely there for this run).
  - SOptional + absent data: keep optional (field may be absent in
    other runs).
  - SOptional + null data: keep optional (Lambe-style null
    propagation: null ~ absent).
  - Schema-only fields: preserved.
  - Data-only fields: preserved (schema is a partial description).
  - Lists and maps recurse.
  - Concrete-type disagreement at any path: QueryError naming path
    ($.user.age, $[*]).

  Path format uses JSON Path-ish notation: $ for root, .field for
  map descent, [*] for list element.

  Same rule throughout: agreement passes, schema fills in gaps, data
  fills in extras, concrete disagreement is an error. Keeps --explain
  honest in the schema-agrees-with-data case and loud in the
  schema-contradicts-data case.

Null-data policy
  The stance "schema optional + data null keeps optional" is a
  deliberate choice: JSON Schema users commonly use null for absent
  fields, and Lambe's null-propagation semantics treat null similarly
  to absent. Being strict here would produce friction with real-world
  JSON Schemas. Documented in the test that pins this behavior.

Tests (+25)
  - 3 loadSchemaFromFile tests (success, missing file, parser error
    propagation).
  - 5 sibling auto-detect tests (no sibling, with sibling, .ndjson
    extension, explicit beats sibling, explicit only).
  - 3 agreement tests (equal scalars, SAny on either side, both
    SAny).
  - 5 disagreement tests (scalar vs scalar with path, map vs
    non-map, list vs non-list, nested path, list element path).
  - 4 SOptional handling tests (present strips, absent keeps, null
    keeps, disagreement on inner).
  - 5 augmentation tests (schema-only field, data-only field, empty
    list uses schema element, non-empty merges element, recursive
    merge).

Exported loadSchemaFromFile, loadSchemaForData, mergeSchemaWithData
from package:lambe/lambe.dart.

Quality gates: dart analyze clean, 1404 tests pass (was 1379, +25),
dart format clean, pana 160/160.

Step 3 of 9 in doc/schema-design.md. Next: CLI wiring with the
--schema rename.
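
The merge rules above can be sketched as one recursive function. This is
an illustration under assumptions, not the shipped mergeSchemaWithData:
the shape classes are a reduced stand-in, null data is modelled as
SScalar('null'), and the function throws FormatException where the real
code raises QueryError.

```dart
sealed class Shape {
  const Shape();
}

class SAny extends Shape {
  const SAny();
  @override
  String toString() => 'any';
}

class SScalar extends Shape {
  const SScalar(this.name);
  final String name;
  @override
  String toString() => name;
}

class SOptional extends Shape {
  const SOptional(this.inner);
  final Shape inner;
  @override
  String toString() => 'optional<$inner>';
}

class SList extends Shape {
  const SList(this.element);
  final Shape element;
  @override
  String toString() => 'list<$element>';
}

class SMap extends Shape {
  const SMap(this.fields);
  final Map<String, Shape> fields;
  @override
  String toString() {
    final body =
        fields.entries.map((e) => '${e.key}: ${e.value}').join(', ');
    return 'map<$body>';
  }
}

Shape merge(Shape schema, Shape data, [String path = r'$']) {
  if (schema is SAny) return data; // SAny on either side: the other wins
  if (data is SAny) return schema;
  if (schema is SOptional) {
    // Null data is treated like absent data, so optionality is kept.
    if (data is SScalar && data.name == 'null') return schema;
    // Present data strips the optional and merges the inner shapes.
    return merge(schema.inner, data, path);
  }
  if (schema is SScalar && data is SScalar && schema.name == data.name) {
    return schema;
  }
  if (schema is SList && data is SList) {
    return SList(merge(schema.element, data.element, '$path[*]'));
  }
  if (schema is SMap && data is SMap) {
    final fields = <String, Shape>{};
    for (final key in {...schema.fields.keys, ...data.fields.keys}) {
      final s = schema.fields[key];
      final d = data.fields[key];
      // Schema-only and data-only fields are preserved as they are.
      fields[key] = s == null
          ? d!
          : d == null
              ? s
              : merge(s, d, '$path.$key');
    }
    return SMap(fields);
  }
  throw FormatException(
    'schema disagreement at $path: schema says $schema, data is $data',
  );
}

void main() {
  const schema = SList(SMap({
    'name': SScalar('string'),
    'email': SOptional(SScalar('string')),
  }));
  const data = SList(SMap({
    'name': SScalar('string'),
    'id': SScalar('number'),
  }));
  print(merge(schema, data));
  // list<map<name: string, email: optional<string>, id: number>>
}
```

Merging the same schema with data whose name field is a number throws
with the path $[*].name, which is the "loud in the contradicting case"
half of the rule.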

Add renderJsonSchema(Shape, {pretty}): String in
lib/src/schema/renderer.dart. Walks the shape ADT and emits a JSON
Schema subset document that parseJsonSchema accepts.

Main decisions:

SOptional handling
  SOptional inside SMap becomes a non-required property: the inner
  shape goes into `properties`, and the field name is omitted from
  `required`. This is JSON Schema's standard way to express "this
  field may be absent," and it's the only position where Lambe can
  round-trip optionality.

  SOptional elsewhere (top-level, inside SList, etc.) has no
  standard JSON Schema spelling in our subset. Renderer flattens to
  the inner shape — it's a one-way drop for these positions. The
  round-trip is preserved for every shape the parser can produce,
  which is the only invariant we promise.

SAny handling
  Renders as the empty object {}. Parser treats an empty object as
  SAny (the "empty schema accepts anything" JSON Schema convention).
  Round-trip preserved. Added to parser: an empty object with no
  `type` is now SAny instead of a "missing type" error.

Pretty vs compact
  Default `pretty: true` emits 2-space-indented JSON for human
  reading (print-shape output). `pretty: false` for embedding in
  other JSON payloads (future MCP responses).

Round-trip invariant
  parseJsonSchema(renderJsonSchema(s)) == s for every shape the
  parser can emit. 12 representative cases pin this in the test
  file, plus two complex-shape tests (optional field in a nested
  list, four-deep nested maps).

Tests (+32)
  - 5 scalar renderings.
  - 5 container renderings (list with items, list of any, map all
    required, map no required, empty map).
  - 1 mixed-required round-trip.
  - 3 SOptional positions (top, inside list, inside map).
  - 3 pretty/compact checks.
  - 12 explicit round-trip cases covering every parser-reachable
    shape plus 3 complex scenarios.

Exported renderJsonSchema from package:lambe/lambe.dart.

Quality gates: dart analyze clean, 1436 tests pass (was 1404, +32),
dart format clean, pana 160/160.

Step 4 of 9. Next: CLI wiring — rename --schema to --print-shape,
add --schema <path> option, thread through evaluation and explain.

First user-visible breaking change in 0.9.0: rename the existing
--schema flag to --print-shape, add a new --schema <path> option
that takes a JSON Schema file.

--schema <path>
  New option on `lam`. Threads the declared shape through both
  --explain inference (via mergeSchemaWithData) and normal evaluation
  (validation-as-side-effect — a concrete-type disagreement between
  schema and data errors at load time).

  Auto-detection: if --schema is omitted and a sibling
  <datafile>.schema.json exists, it's used implicitly. Same
  convention as 0.9.0's .ndjson auto-detect.

--print-shape
  Replaces the 0.8.0 --schema flag. Emits the inferred shape as a
  JSON Schema subset document, round-trippable with --schema input.

  Output format is now JSON Schema (second breaking change): 0.8.0's
  type-name-string JSON is replaced with the canonical schema form
  so that `lam --print-shape data.json > data.schema.json` followed
  by `lam --schema data.schema.json ...` round-trips cleanly.

Mode combination guards
  --print-shape + --schema is rejected: --print-shape prints the
  inferred shape from data, which a schema would only second-guess.
  --ndjson + --schema is rejected (added to existing ndjson guards).
  --ndjson + --print-shape is rejected.

Help text updates documented in doc/lam.1.md; regenerated doc/lam.1.

CLI flow (when --schema is active):
  --explain path: shape = mergeSchemaWithData(schema, shapeOf(data))
                  (or just schema when data is absent);
                  fed to explain() as inputShape.
  Normal eval: mergeSchemaWithData is invoked purely for its
               side-effect validation (throws on disagreement).
               Evaluation runs on raw data as usual.
  --print-shape: schema is rejected (see above).

Smoke-tested end to end with:
  * --print-shape emits JSON Schema (verified by eye).
  * --explain with sibling .schema.json auto-loads, surfaces
    SOptional from the `required` semantics, shows it in the shape
    trace ("list<map<name: string, age: number, email: optional<string>>>").
  * --schema api.json '.' response.json where schema says age:string
    but data has age:number errors cleanly with
    "schema disagreement at $[*].age: schema says string, data is number"
    and exits 1.

Existing legacy inferSchema function stays referenced in REPL and
MCP (updated in steps 6 and 7).

Quality gates: dart analyze clean, 1436 tests pass (no changes; no
new tests yet — step 8 adds CLI integration coverage), dart format
clean, pana 160/160, manpage round-trip matches.

Step 5 of 9. Next: REPL integration.

Self-review of steps 1-5 caught two honesty gaps:

renderJsonSchema: lossy positions documented
  The round-trip invariant holds only for shapes parseJsonSchema can
  produce (SOptional inside SMap fields). Callers composing shapes
  outside that path — e.g., an inference result where SOptional
  lands at the root or inside a list — hit a silent flatten. The
  previous docstring said "no standard JSON Schema representation"
  which was true but terse; now explicit that optionality is
  **dropped** in those positions, so the user knows the output isn't
  lossless for arbitrary shapes.

inferSchema: deprecated for 1.0 removal
  inferSchema emits type-names-as-strings (e.g. `{"age": "number"}`),
  a format that doesn't round-trip with any parser we ship. With
  renderJsonSchema as the canonical JSON Schema emitter and shapeOf
  for the Shape ADT, inferSchema is vestigial. Marked @deprecated
  with a migration pointer to renderJsonSchema(shapeOf(value)).
  Removal scheduled for 1.0 per the "freeze the shape API" target.
  REPL and MCP callsites migrate in steps 6 and 7.

Also verified via exploratory tests (not committed, cleanup-only):

- SOptional(SOptional(x)) collapses at the factory level AND
  through _lookupField's recursion-then-factory-wrap, so stacked
  optionals cannot exist from inference.
- mergeSchemaWithData never produces stacked optionals either: the
  data-side optional branch unwraps before merging inners.

Other self-review findings deferred:

- CLI guard matrix (7 mode-combo rejections) is accreting. Noted in
  project_lambe_cli_test_matrix memory as a post-4-tracks refactor.
- Validation errors aren't structured like OutputShapeError.
  Deliberately not forcing them into that mold; they're a
  different class of problem (input validation vs output
  serialization).
- CLI integration tests for --schema / --print-shape deferred to
  step 8.

Quality gates: dart analyze clean, 1436 tests still pass, dart
format clean, pana 160/160.
Migrate REPL's :schema command from inferSchema-based output to the
0.9.0 schema infrastructure, and add :print-shape.

Session state
  New `Shape? activeSchema` variable in runRepl. Loaded by :schema
  <path>, queried by :schema (no arg), used to validate future data
  loads.

:schema [path]
  With a path: loads the schema via loadSchemaFromFile, stores it on
  the session. If data is currently loaded, runs
  mergeSchemaWithData on the fly as a structural validation check;
  reports "Schema loaded (agrees with current data)" or "Schema
  loaded, but disagrees with current data: <path>: ...".
  No path: prints the active schema via renderJsonSchema, or the
  no-schema-loaded message.

:print-shape
  New command. Prints shapeOf(currentData) as JSON Schema. The REPL
  analog of the CLI --print-shape; replaces the old :schema (no
  arg) behavior.

:load <file> re-validates against the active schema
  When a schema is loaded and the user switches data via :load,
  runs mergeSchemaWithData again and warns on disagreement. Keeps
  the REPL session honest across data changes.

Completer
  Added flatten-cells and print-shape to the _replCommands list in
  completer.dart so tab completion on bare `:` offers the new
  commands alongside the old ones. 11 total now (was 9).

:help updated to document both :schema forms and :print-shape.

inferSchema callsite removed from REPL. The legacy function stays
in lib/src/output.dart as @deprecated; MCP migrates in step 7.

Manual REPL verification (interactive, can't be automated without a
TTY seam):
  * :print-shape emits JSON Schema for the data.
  * :schema <path> loads, reports agreement / disagreement vs data.
  * :schema (no arg) prints the active schema.
  * :load <file> re-validates against the active schema.
  * :help lists the new commands.
  * Tab completion on bare `:` offers all 11 commands.

Test update: completer_test.dart "all commands on bare colon"
updated from expecting 9 to expecting 11, plus explicit checks for
flatten-cells and print-shape.

Quality gates: dart analyze clean, 1436 tests pass, dart format
clean, pana 160/160.

Step 6 of 9. Next: MCP server.
Three MCP surface changes aligning with the CLI schema work.

lambe_query: new schema parameter
  Optional inline JSON Schema string. When provided, data is parsed
  and validated against the schema before the query runs; a
  structural disagreement returns an error with the path. Agents
  wanting to fail-fast on unexpected shapes now have a first-class
  way to do it. Threaded through _handleQuery via parseJsonSchema +
  mergeSchemaWithData.

lambe_schema renamed to lambe_print_shape
  Tool rename aligning with the CLI rename (--schema -> --print-shape).
  Output format changed from type-name-string JSON (e.g.
  `{"age": "number"}`) to canonical JSON Schema (e.g.
  `{"type": "object", "properties": {"age": {"type": "number"}}, ...}`).
  The new output round-trips with lambe_query's schema parameter,
  lambe_check, and the parseJsonSchema library function. This is a
  breaking change for agents that hardcoded the old tool name; the
  description calls it out explicitly.

lambe_check: new tool
  Validates data against a JSON Schema subset without running a
  query. Returns `{"ok": true}` on agreement or
  `{"ok": false, "error": "..."}` with the disagreement path.
  Intended for API-contract checks, CI gates, and agents that want
  to verify fixtures before running queries.

Server instructions updated
  The initial MCP instructions string now lists all four tools by
  name with one-line descriptions of when to use each. Helps agents
  pick the right tool without having to call tools/list.

AGENTS.md updated
  Tool list in the top-level agent guide mirrors the new surface.

Smoke-tested end-to-end via JSON-RPC:
  - tools/list returns [lambe_query, lambe_print_shape, lambe_check,
    lambe_assert].
  - lambe_print_shape on a users object emits valid JSON Schema with
    required set from the data's concrete keys.
  - lambe_check with matching schema returns {"ok": true}.
  - lambe_check with mismatched schema returns
    {"ok": false, "error": "schema disagreement at $.age: ..."}.
  - lambe_query with a schema parameter that disagrees with data
    returns isError=true before running the query.

inferSchema is no longer referenced from bin/mcp_server.dart. The
legacy function remains in lib/src/output.dart marked @deprecated;
all repo callsites have now migrated.

Quality gates: dart analyze clean, 1436 tests pass, dart format
clean, pana 160/160.

Step 7 of 9. Next: CLI integration tests for --schema / --print-shape.
Nine new end-to-end tests in test/cli_integration_test.dart pin the
schema surface at the CLI layer. Each spawns `dart bin/lam.dart` and
asserts on exit code, stdout, stderr.

--print-shape (3 tests)
  1. Emits valid JSON Schema for a typical object (parses as JSON,
     carries type/properties/required).
  2. Round-trip: print-shape data.json > data.schema.json, then
     --schema data.schema.json '.' data.json succeeds. Proves the
     renderer + parser agree end-to-end via a real subprocess,
     closing the loop the library-level round-trip tests opened.
  3. --print-shape + --schema is rejected (redundant combination).

--schema (6 tests)
  4. Explicit --schema threads into --explain inputShape; the shape
     trace surfaces schema-declared optional fields (email:
     optional<string>) that don't exist in data.
  5. Sibling <data>.schema.json is auto-detected when --schema is
     omitted. Verifies the same schema information flows through.
  6. Schema disagreement (data.age is number, schema says string)
     exits 1 with a path-annotated stderr message
     ("$.age", "string", "number" all present).
  7. Schema parse error on rejected keyword (allOf) surfaces a
     clear diagnostic (contains "allOf" and "unsupported").
  8. Missing schema file exits 1 with "schema file not found".
  9. --ndjson + --schema is rejected.

These exercise the full wiring added in step 5 (CLI) plus the
parser/loader/renderer library layer from steps 2-4. Library tests
stay the foundation; integration tests here pin the glue.
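
The round-trip in test 2 can be reproduced by hand. The following is a
minimal sketch, assuming `lam` is on PATH and that --print-shape and
--schema behave as described in this PR; `roundtrip_schema` is a
hypothetical helper name, not something the repo ships.

```sh
# Hypothetical helper: derive a schema from a data file, then check the
# same file against it. Only flags named in this PR are used.
roundtrip_schema() {
  data=$1
  schema="${data%.json}.schema.json"   # data.json -> data.schema.json
  # Emit the inferred shape as JSON Schema.
  lam --print-shape "$data" > "$schema" || return 1
  # Re-run the identity query with the schema enforced; the exit status
  # of this call is the result of the round-trip.
  lam --schema "$schema" '.' "$data" > /dev/null
}
```

Calling `roundtrip_schema data.json` should return 0 if the renderer
and parser agree, which is the property the subprocess test pins.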

Quality gates: dart analyze clean, 1445 tests pass (was 1436, +9),
dart format clean, pana 160/160.

Step 8 of 9 in doc/schema-design.md. Next: docs polish — CHANGELOG
0.9.0 entry, README reframe, doc/schema.md user guide, man page
examples.
Self-review of the full 0.9.0 before the docs polish surfaced a real
gap: track B shipped --explain-json at the CLI but never surfaced
--explain to MCP agents. The positioning pitch ("shows you what
you're working with") specifically targets agents; leaving them
without structured explain output undermines the track B deliverable.

Framing this as "future" was reflexive, not reasoned. 40 lines of
tool wiring calling existing library functions is not a future
feature; it's an unfinished track.

lambe_explain tool
  Parameters:
    expression (required): the query to analyze.
    data (optional): when provided, shape seeds from shapeOf(data).
    format (optional): input format for data; auto-detected if not
      given.
    schema (optional inline string): merges with shapeOf(data) for a
      more precise initial shape. With no data and no schema, starts
      from SAny.
    include_trivial (optional bool): surfaces trivial-result
      warnings (--explain-trivial equivalent).
    flatten_cells (optional enum): affects the writable_as summary.

  Returns renderExplainJson(report) — the exact same payload the
  CLI's --explain-json emits, with snake_case keys and nested-kind
  shape trees. Agents get one structured contract across surfaces.
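
For orientation, a request to this tool might look like the sketch
below. The parameter names come from the list above and the envelope
is the standard MCP `tools/call` shape; passing `data` as a
JSON-encoded string is an assumption, and `explain_request` is a
hypothetical helper name.

```sh
# Print a single-line JSON-RPC request for lambe_explain.
# printf '%s' leaves the backslashes in the arguments untouched.
explain_request() {
  printf '%s' '{"jsonrpc":"2.0","id":1,"method":"tools/call",'
  printf '%s' '"params":{"name":"lambe_explain","arguments":{'
  printf '%s' '"expression":".users | map(.name)",'
  printf '%s' '"data":"{\"users\":[{\"name\":\"Ada\"}]}",'
  printf '%s\n' '"include_trivial":true}}}'
}
```

An MCP server generally expects an `initialize` exchange before it
accepts `tools/call`, so this payload is only the middle of a session.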

Updated the MCP server instructions to list all five tools.
Updated AGENTS.md tool inventory.

Smoke-tested end-to-end via JSON-RPC:
  - tools/list returns five tools including lambe_explain.
  - lambe_explain with data + expression returns a trace where
    .users shape is list<map<...>> and |map(.name) is list<string>.
  - lambe_explain with data + schema (schema declares email as
    optional) produces list<optional<string>> when .email is
    accessed — agent-advantage use case proven.
  - lambe_explain with include_trivial: true surfaces trivial_result
    warnings for sort_by(.missing).
  - lambe_explain with no data (expression-only) still produces a
    meaningful trace (length on unknown input infers SNum).

Existing library-level tests cover the underlying renderExplainJson
and explain functions; the new MCP tool is a thin wrapper. CLI
subprocess tests for MCP are consistently deferred across all
server tools.

Quality gates: dart analyze clean, 1445 tests pass, dart format
clean, pana 160/160.

REPL still lacks :explain. Leaving as genuine future work: REPL
users can already run queries live (sub-100ms), so the
"see-before-run" need is weaker there than it is for agents.

Clears the track-A step-9 prerequisite: MCP surface is now
coherently covered.
Ship the 0.9.0 documentation pass: reframe the pitch to match what
shipped, consolidate the scattered 0.9.0-dev CHANGELOG entries into
a single coherent release section, add a user-guide for schemas,
and fix stale references to the pre-rename CLI flags, deprecated
library symbols, and old MCP tool names.

pubspec.yaml
  - Version: 0.9.0 (regenerated lib/src/_version.dart).
  - Description reframed to the "shows you what you're working with"
    pitch, trimmed to fit pana's 180-char limit.
  - Added `schema` to topics.

CHANGELOG.md
  - New 0.9.0 section organized by theme, not by track. Opens with
    the shape-feedback-loop framing. Sections: schemas as a
    first-class contract, SOptional in the shape ADT, richer
    --explain, --ndjson, --flatten-cells, cross-surface Hint type.
  - Breaking changes called out explicitly: --schema renamed to
    --print-shape; --print-shape output format changed; MCP tool
    lambe_schema renamed to lambe_print_shape; Shape gains
    SOptional variant; ExplainWarning gains required kind param.
  - Deprecated section notes inferSchema scheduled for 1.0 removal.

README.md
  - New lead: "a query language for structured data that shows you
    what you're working with." Drops the jq comparison from the
    pitch and names the actual use case ("when you don't already
    know the data").
  - New --schema section after --explain, showing both threaded-
    into-explain and validation-on-load examples, plus round-trip
    via --print-shape.
  - CLI examples: --schema and --print-shape replace the stale
    --schema data.json (which now means something different).
  - Library example: shapeOf/renderJsonSchema/parseJsonSchema/
    mergeSchemaWithData replace the deprecated inferSchema.
  - MCP tool list: five tools with their feedback-loop roles.
  - Docs index: added doc/schema.md.
  - REPL banner version bumped.

DESIGN.md
  - MCP tool list updated to five tools.

doc/schema.md (new)
  - Complete user guide for the schema feature: why-use, accepted
    keywords, rejected keywords, CLI/REPL/MCP/library surface,
    disagreement semantics, round-trip, what schemas don't do.
  - Clarifies the shapeOf-vs-schema division of labor.

doc/lam.1.md
  - Added schema-checked query and schema-seeded explain examples
    to the EXAMPLES section.
  - Regenerated doc/lam.1 via tool/manpage.dart.

AGENTS.md was already updated in step 7 (MCP).

Quality gates: dart analyze clean, 1445 tests pass, dart format
clean, pana 160/160 (description length was over 180 chars on
first pass; trimmed).

Completes track A. Release-ready from a code/docs perspective. What
remains outside track A: install.sh + Homebrew tap for 1.0, the
downstream rem/arda-web commits still unpushed, and the push of
the 0.9.0-dev branch itself.
Ship the one-line installer the 0.8.0 handover called out as "the
biggest single 1.0 ergonomic win." Users no longer need to know
their architecture, run three curl commands, or use sudo.

install.sh
  curl -fsSL https://raw.githubusercontent.com/hakimjonas/lambe/main/install.sh | sh

  Detects OS (Linux/macOS) and arch (x64/arm64). Resolves the latest
  release via the GitHub API (no auth, no JSON parser — grep+sed).
  Downloads lam and lam-mcp binaries into ~/.local/bin/. Verifies
  SHA256 against a published checksums.txt before installing;
  refuses to install on mismatch. Honors LAMBE_VERSION to pin a
  tag, LAMBE_PREFIX to change the install dir, LAMBE_BASE_URL to
  override the release base URL (useful for mirrors and testing),
  LAMBE_NO_MAN to skip the man page.

  Does NOT modify shell rc files. Prints a PATH reminder if the
  target bin dir isn't on PATH, showing the exact export line the
  user would add if they choose.

  Man page install is best-effort: if the release has a lam.1
  asset (current releases do not — placeholder for a future bump),
  it's installed to ~/.local/share/man/man1/. Silently skipped
  otherwise.
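
The detection and lookup steps described above could be sketched as
follows. This is an illustration, not the contents of install.sh: the
function names are hypothetical, and the `macos` suffix is an
assumption (only `lam-linux-x64` is named in this PR).

```sh
# Map `uname -s` / `uname -m` output to a release asset suffix.
# Inputs are passed as arguments so the mapping can be exercised directly.
platform_suffix() {
  os=$1 arch=$2
  case "$os" in
    Linux)  os=linux ;;
    Darwin) os=macos ;;   # assumed spelling of the macOS suffix
    *) echo "unsupported OS: $os" >&2; return 1 ;;
  esac
  case "$arch" in
    x86_64|amd64)  arch=x64 ;;
    arm64|aarch64) arch=arm64 ;;
    *) echo "unsupported arch: $arch" >&2; return 1 ;;
  esac
  echo "$os-$arch"
}

# Extract "tag_name" from a GitHub release JSON document on stdin,
# using grep+sed instead of a JSON parser.
latest_tag() {
  grep '"tag_name"' | head -n 1 |
    sed 's/.*"tag_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/'
}

# Succeed when the given directory is already an entry in PATH.
on_path() {
  case ":$PATH:" in *":$1:"*) return 0 ;; *) return 1 ;; esac
}
```

A caller would combine these as
`platform_suffix "$(uname -s)" "$(uname -m)"` and print the export
line only when `on_path "$HOME/.local/bin"` fails.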

Release workflow: checksums.txt
  .github/workflows/release.yml now runs `sha256sum lam-* >
  checksums.txt` over the collected artifacts and uploads the
  manifest as a release asset. install.sh fetches this before any
  binary, and every binary is verified against it before install.
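
A verification step of this kind might look like the sketch below. It
assumes the manifest uses the usual `sha256sum` line format
(`<hash>  <name>`); `verify_asset` is a hypothetical name and the
`shasum` fallback for macOS is an assumption about how the script
handles hosts without `sha256sum`.

```sh
# Verify one downloaded file against a sha256sum-style manifest.
# Returns non-zero when the entry is missing or the hash differs.
verify_asset() {
  file=$1 manifest=$2
  name=$(basename "$file")
  # Accept both "hash  name" and binary-mode "hash *name" lines.
  expected=$(awk -v n="$name" '$2 == n || $2 == ("*" n) { print $1; exit }' "$manifest")
  [ -n "$expected" ] || { echo "no checksum listed for $name" >&2; return 1; }
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{ print $1 }')
  else
    actual=$(shasum -a 256 "$file" | awk '{ print $1 }')
  fi
  [ "$expected" = "$actual" ] || { echo "checksum mismatch for $name" >&2; return 1; }
}
```

Running the check before anything is copied into the install prefix
is what lets a mismatch leave the prefix untouched.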

Smoke-tested end to end with a local python HTTP server and fake
artifacts:
  - Platform detection correctly identified linux-x64.
  - LAMBE_BASE_URL override worked (needed for the test).
  - checksums.txt parsed, expected hashes looked up per asset.
  - Correctly matched hashes: binaries installed with 0755 perms.
  - Corrupted lam-linux-x64 (hash mismatch): refused install,
    exited 1, wrote no files to the install prefix.
  - PATH reminder rendered correctly when target wasn't on PATH.

README: new Installation section leads with the one-liner, keeps
pub.dev / library / source-build options below for Dart users.
CHANGELOG: new "Install ergonomics" section under 0.9.0.

Still deferred: Homebrew tap (noted in handover, independent work,
can be added post-0.9.0 without breaking the install story).

Quality gates: dart analyze clean, 1445 tests pass, pana 160/160,
install.sh `sh -n` syntax check clean.
Full audit of 0.9.0 before release. Findings and fixes:

doc/lam.1.md frontmatter
  `source: Lambë 0.8.0` -> `0.9.0`. Not auto-generated; no CI check
  caught it. Regenerated doc/lam.1.

pubspec.yaml
  Stray blank line in the dev_dependencies section removed
  (cosmetic; pana had no opinion).

server.json + .github/workflows/release.yml
  MCP registry description was still the 0.8.0 "Query JSON, YAML,
  TOML, HCL, CSV, TSV, and Markdown" pitch. Updated both to the
  0.9.0 "A query language for structured data that shows you what
  you're working with" framing so the MCP registry entry matches
  pubspec and README. The workflow's hardcoded description in the
  publish-mcp step now also reflects 0.9.0.

tool/release_prep.sh (new)
  Scriptable release gate. Runs the full check matrix before
  tagging:
    * Version consistency (pubspec, _version.dart, man page
      frontmatter, CHANGELOG section, README banner).
    * File hygiene (nothing tracked that matches .gitignore patterns
      for secrets/benchmarks/session notes).
    * Dependencies (pubspec_overrides.yaml not tracked, dart pub get).
    * Quality gates (analyze, format, test, pana 160/160).
    * Documentation (doc/lam.1 synced with .md source, dart doc
      produces zero errors).
    * Release workflow (.yml present, all per-platform assets
      referenced, checksums.txt step present, server.json
      description matches pubspec).
    * Git state (clean working tree, tag doesn't exist yet, branch
      check).

  Exit 0 means ready to tag. Non-zero collects and reports all
  issues at once rather than failing on the first one — so you fix
  the whole list and re-run, not whack-a-mole.

  Usage: bash tool/release_prep.sh [version]
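
  The collect-then-report behaviour can be sketched like this. It is a
  reduced illustration rather than the script itself; `check` and
  `summarize` are hypothetical names.

```sh
issues=0

# Run one named check; record a failure instead of aborting the run.
check() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "ok    $label"
  else
    echo "FAIL  $label"
    issues=$((issues + 1))
  fi
}

# Report the total; non-zero when anything failed.
summarize() {
  [ "$issues" -eq 0 ] && { echo "ready to tag"; return 0; }
  echo "$issues issue(s) found"
  return 1
}
```

  A gate would then read as a flat list, for example
  `check "analyze" dart analyze` followed by
  `check "format" dart format --set-exit-if-changed .` and a final
  `summarize`.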

The script flagged the doc/lam.1.md frontmatter on first run — so
it's already paying for itself. The README banner check initially
had a shell-word-splitting bug (grep output tokenized by whitespace
meant `lambe` and `v0.9.0` became separate tokens); fixed with a
while-read loop over a here-doc.
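
The bug class is easy to reproduce in isolation. The sketch below is a
reduced reproduction, not the code from release_prep.sh (the actual
grep pattern is not shown in this PR); both function names are
hypothetical.

```sh
# Buggy shape: an unquoted command substitution is split on whitespace,
# so a line such as "lambe v0.9.0" is visited as two tokens.
count_tokens() {
  n=0
  for tok in $(printf '%s\n' "$1"); do n=$((n + 1)); done
  echo "$n"
}

# Fixed shape: read whole lines from a here-doc; each line stays intact.
# The loop is not in a pipeline, so $n survives after `done`.
count_lines() {
  n=0
  while IFS= read -r line; do
    n=$((n + 1))
  done <<EOF
$1
EOF
  echo "$n"
}
```

With the input `lambe v0.9.0`, the first function sees two items and
the second sees one, which is the difference the fix relies on.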

What the script does NOT do:
  * Tag, push, or publish — those stay manual. This is the
    "am I ready?" audit, not the release itself.
  * Verify install.sh against a live release. Checked manually
    against a staged HTTP server during install.sh development;
    post-tag verification with LAMBE_VERSION=v0.9.0 is noted in
    the "Next steps" output.

Post-audit state: dart analyze clean, 1445 tests pass, dart format
clean, pana 160/160, man page round-trip matches. Ready to tag
after the remaining uncommitted state (this commit) lands.
…gn.md

Cleanup pass before pushing 0.9.0 to make sure the public repo
state is free of internal-development-only content.

.gitignore: add the local AI-tool session cache directory
  Mirrors how .idea/ and .vscode/ are already ignored — local
  tooling state belongs with the checkout, not the public repo.

doc/schema-design.md: reframe as rationale, not internal plan
  The file was written as a track-A design doc in plan mode,
  using internal vocabulary ("Track A", "approved, ready for
  implementation"). That framing is meaningful mid-release
  but noise to a public reader: "Track A" is not documented
  anywhere users would see.

  Retitled as "Schema-typed queries — design rationale" with a
  pointer to doc/schema.md for user-facing content. Removed the
  "Tracks B/C/D" reference from the Context section. The file's
  value is preserved: it records why a JSON Schema subset was
  chosen over a Lambé DSL, why SOptional was added, and why
  disagreement is an error rather than schema-wins.

Audit confirmed nothing else tracked reads as internal dev content:
  * AGENTS.md, AI.md, DESIGN.md, ROADMAP.md — all public by intent.
  * No HANDOVER_*.md tracked (commit 613803e removed it; .gitignore
    prevents re-adding).
  * No bench-results-*.json tracked (.gitignore + .pubignore both
    catch them).
  * No secrets (.mcpregistry_* ignored).
  * No stale binaries (lam-mcp ignored).
  * No local dep overrides (pubspec_overrides.yaml ignored).

The 0.8.0 handover plan is still in git history (commit 93271aa,
removed in 613803e). Not cleaning history: the content is planning
notes from a committed-then-removed workflow, not secrets, and a
rewrite would break existing clones. The removal commit itself
documents the intent going forward.
Three new language features for Lambé queries:

1. **List literals** (`[expr, expr, ...]`)
   - New `ListConstruct` AST node holding a `List<LamExpr>` of member
     expressions, evaluated against the current context.
   - Parsed at atom level so it never shadows postfix indexing
     (`expr[i]`, which requires a prior atom on the left of `[`).
   - Plus list concatenation: `+` on two lists produces concatenation.
     Mixed list/scalar `+` is a type error (Lambé strictness over
     silent lifting); evaluator wrapper `_binaryOp` intercepts before
     delegating to `applyBinaryOp` for the scalar dispatcher.

2. **`//` alternative operator** (jq-style fallback)
   - New `Alternative` AST node. `a // b` returns `a`'s value if
     non-null, else `b`'s. `b` is only evaluated on fallback.
   - Lambé semantics differ from jq deliberately: jq fires on "null
     or false"; Lambé fires only on `null`. Genuine `false`/`0`/`""`
     pass through.
   - Right-associative, so `a // b // c` means `a // (b // c)`. It
     sits one level above `||` in the grammar, i.e. it binds more
     loosely than `||`. Built by hand because Lambé's parser
     combinators ship `chainl1` (left-associative) only.
   - Doubles as missing-key fallback via null-propagation:
     `.user.email // .user.contact.email // "unknown"`.
   - The `/` binary op gets a `notFollowedBy('/')` guard so it
     doesn't shadow `//`.

3. **Keyword aliases** (`and`/`or`/`tonumber`)
   - `and` parses as `&&`, `or` as `||`. Both keep word-boundary
     semantics so `.andy` and `.orbit` still tokenize as fields.
     The resulting `BinaryOp` node carries the canonical symbol so
     shape/eval don't see the alias.
   - `tonumber` parses as the canonical `to_number` pipe op.
     Registered as a jq-ism alias at the parser layer; shape and
     evaluator stay alias-unaware.

Shape inference (`shape/infer.dart`) and rendering (`shape/explain.dart`)
updated for both new AST nodes.

All 1,496+ tests pass: 7 new tests for `//` (eval + parser), 5 for
list literals (parser), 5 for list literals (eval), 1 for `+` list
concat, 6 for jq-ism aliases.
Recognises common jq idioms that Lambé does not support and surfaces
a targeted hint instead of the generic "expected ..." fallback. Keeps
error messages short and actionable for agents trained on jq priors.

Recognised idioms:

- `.users[]` (jq array iteration) → hint to use `map(...)` for
  per-element work.
- `.foo?` (jq error suppression) → hint to use `has()` or a shape
  check.
- `..` (jq recursive descent) → hint to use explicit paths.
- `| select(pred)` (jq filter) → hint to use `filter(...)`.
- `map(select(...))` (jq filter idiom) → hint to use `filter(...)`.
- `| empty` (jq drop stage) → hint to use `filter(...)` for the
  intended drop semantics.
- `if/then/else/end` with `empty` (jq conditional drop) → same
  filter-based hint.
- `| if ... then ... else ... end` (jq if-as-pipe-stage) → explains
  Lambé's expression-only `if/then/else` rule.

Two integration points in `lib/lambe.dart`:

- `_jqIdiomHint(expression, offset)` — pattern-matches the input and
  returns a `String?` hint. Wired into `_formatParseErrors` before
  the verbose "expected ..." fallback, and into `_describeLeftover`
  for unparsed-remainder context.
- `_jqPipeOpHint(word)` — fires when the user writes `.x | <jq-name>`
  for a name Lambé doesn't have, mapping to the Lambé equivalent.

Plain typos still fall through to the existing did-you-mean
(closest-match) suggestion. The hint short-circuits only when the
jq idiom is recognised.

10 tests in `test/parse_error_format_test.dart` cover each idiom,
including the falls-through case where did-you-mean still fires.
Replace the six layered chainl1 calls plus the recursive _unary
definition with a single pratt<LamExpr>(_postfix, [...]) call covering
prefix unary -/!, the six binary precedence levels, and the
right-associative // alternative. The if/then/else conditional stays
inside _atom rather than as a Pratt operator because its three-branch
shape doesn't fit infix dispatch.

Binding powers (low to high):
  // alternative (right-assoc)  5
  ||, or                       10
  &&, and                      20
  ==, !=                       30
  <=, >=, <, >                 40
  +, -                         50
  *, /, %                      60
  prefix -, !                  70

The `/` operator keeps its notFollowedBy('/') guard so it doesn't
shadow the `//` alternative. The keyword aliases `and` and `or` use
_kw(...) (word boundary) so .andy and .orbit don't tokenize as
'and y' and 'or bit'.

Bench numbers (tool/bench/run.dart --aot --runs 5, completer scenarios
across 5 shapes x 4 sizes):

  vs rumil 0.6 + chainl1 baseline   mean +7.1%, median +5.1%
  vs rumil 0.7 + chainl1 (just rumil bump)
  vs rumil 0.7 + Pratt (this commit)  mean -10.1%, median -8.7%

  Net change for Lambé queries     ~17% faster on the completer hot
                                   path vs the chainl1 baseline that
                                   shipped with rumil 0.6.

The win comes from collapsing six chainl1 dispatch layers into one
Pratt loop plus eliminating the defer(() => _unary) recursion via the
explicit Prefix descriptor. The opTable fast path is not engaged here
because operators are wrapped with _sym/_kw; the gain is structural.

All 1,496 lambe tests pass unchanged. No public API change
(parseQuery / parsePartial signatures untouched).
Replaces the inline 13-operator infix ladder with a single call to
rumil's new cFamilyPrecedence preset. Lambé-specific operators stay
inline:

- The right-associative `//` alternative (no C-family analogue)
- Keyword aliases `and` / `or` for `&&` / `||` (Lambé extension)
- The `/` notFollowedBy(/) guard, supplied via sym dispatch

Functionally equivalent to the previous hand-rolled list; bench
numbers within noise of pre-preset (mean -9.7% vs -10.1% on the
completer matrix). All 1,496 lambe tests pass unchanged.
rumil 0.7.0 (and rumil_parsers, rumil_expressions) published to
pub.dev. Lambé now resolves these from pub.dev directly rather than
via the local-path override that carried us through the 0.7
development cycle.

Constraints:
- rumil: ^0.6.0 -> ^0.7.0
- rumil_parsers: ^0.6.0 -> ^0.7.0
- rumil_expressions: ^0.6.0 -> ^0.7.0

pubspec_overrides.yaml is removed (it's gitignored, so this is a
local-file deletion only). Future contributors clone and `dart pub
get` resolves real published packages.

All 1,496 lambe tests pass against the published rumil 0.7 family.
Mirrors the same convention added to rumil-dart's .gitignore. Lets
release-planning notes, status snapshots, and similar working-memory
documents live in the repo for discoverability without ever getting
committed.
…A followups

Bundles steps 1–8 of LAMBE_0.9.0_PLAN with Tier A items A1–A7. The
27 per-op AST classes collapse into a single BuiltinPipeOp(name, args)
backed by an extended pipe_ops.dart spec table that owns acceptance,
shape inference, runtime evaluation, and parse arity on one record.
Adding or renaming a pipe op is now a one-file change. As(target)
keeps a dedicated AST class for its typed OutputFormat argument.

REPL highlighter migrated from a 100-line hand-rolled tokenizer to a
rumil_tokens LangGrammar defined in lib/src/highlight_grammar.dart.
New runtime dependency: rumil_tokens ^0.1.0.

Other 0.9.0 wins:
- _normalize short-circuits canonical inputs (identity-pass for
  Map<String,Object?> / List<Object?> / scalars)
- queryNdjsonString(lines, expression) convenience added
- Six doc-precision fixes inlined into pipe_ops.dart and
  evaluator.dart (// is null-fallback; empty-list policy; unique
  distinguishes int/double; duplicate-key behaviour; from_entries
  rejects non-map / non-string-key entries explicitly; type rejects
  non-JSON values with a hint)
- inferSchema @deprecated annotation already in place

Tier A followups from the discovery session:
- TSV input honors header rows the same way CSV does (input.dart
  now runs detectDialect with the tab delimiter forced)
- String single-char indexing: .name[0] returns "a" instead of
  erroring (mirrors slice semantics; out-of-range returns null)
- jq alias: add → sum
- Stale // line removed from _jqIdiomHint doc
- doc/getting-started.md pubspec snippet bumped to ^0.9.0
- doc/syntax.md bare-literal examples rewritten as runnable
  echo/lam invocations (every rewritten example verified)
- CHANGELOG appended for both batches

1516 tests pass (1500 baseline + 16 new). pana 160/160. dart analyze
clean (one pre-existing test warning at evaluator_test.dart:646).
The `lam` AOT binary built into the repo root was showing up as an
untracked file. It is now ignored, matching the `lam-mcp` entry just
below.

The to_entries example in doc/syntax.md showed compact single-line
output (`-> [{"key": ...}]`) but real `lam` defaults to pretty-printed
JSON. Rewrote the example as a runnable echo/lam invocation matching
the real output, consistent with the Tier A doc rewrites.

Implementer surfaced both as out-of-scope-but-flagged after Tier A
landed.
`lam --print-shape '.users' data.json` now prints the schema of the
result of evaluating `.users` rather than the schema of the whole
document. Pre-0.9.0 the expression was silently ignored. Composes
with the existing inferShape / renderJsonSchema machinery and
mirrors --explain's no-data fallback (infer against SAny when no
data is available).

The single-positional case is disambiguated by file existence: if
rest[0] is an existing file, treat it as the file (legacy form);
otherwise treat it as an expression. Plain identifier filenames
aren't valid lambé queries either, so the collision case is
vanishingly unlikely.
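
The rule itself is small. A shell model of it is sketched below; the
real check lives in the Dart CLI, and `classify_positional` is a
hypothetical name used only for illustration.

```sh
# Decide how a lone positional argument is interpreted:
# an existing file wins, anything else is treated as an expression.
classify_positional() {
  if [ -f "$1" ]; then echo file; else echo expression; fi
}
```

Under this rule `data.json` is read as a file when it exists, while
`.users` falls through to the expression branch.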

Empty stdin in static-analysis modes (--explain, --print-shape) is
now treated as "no data" rather than triggering a JSON parse error.

Tests: 4 new cases pinning compose, legacy, no-data, and null-result
behaviours. 1516 -> 1520 tests pass.
hakimjonas added 25 commits May 22, 2026 09:07
`{name: .x}` was the only spelling for object construction; `{"name":
.x}` errored with a confusing "unexpected" message. Lambé's data model
accepts any string as a key — the construction grammar should match.

Both spellings now produce identical maps. Bare identifiers stay the
canonical form for keys that are valid identifiers; JSON-string
literals are the way to construct keys that are not (`{"x-axis": .a}`,
`{"Content-Type": "application/json"}`, `{"my key": 1}`). Mirrors
`_stringLit` minus interpolation: `\(...)` in key position is
rejected with a clear message because key position is structurally
not an expression position. Shorthand `{name}` continues to require a
bare identifier — `{"name"}` alone would conflict with treating the
JSON-string as a value-with-defaulted-key, which we don't support.

Adds C6a regression test pinning `{name, tags: ["x", "y"]}` parses
correctly (discovery 4.1 reported broken on 0.8.0; works post-Pratt
migration; this guards against future breakage). Adds full C6b test
coverage: AST equivalence with bare form, hyphenated keys, keys with
spaces, mixed forms, escapes inside JSON-string keys, interpolation
rejection, end-to-end query roundtrip.
`-r` / `--raw` is a silent no-op on structured output (objects, arrays,
numbers, booleans, null) — only top-level string scalars get unquoted.
The previous wording ("Output strings without quotes") read as a
pretty-print toggle and surprised users on non-string values. Rebuild
of `doc/lam.1` follows.
CHANGELOG additions span the Tier C surface: HCL N=1 uniformity (Bug
fixes), markdown `text` op (new Markdown text extraction subsection),
JSON-string keys in object construction (Bug fixes), -r raw semantics
and the new non-goals page (Documentation precision), and the load-
bearing precedent comment for the `text` op.

`tool/bench/cli_bench.sh` is the C4 fact-finding harness — three
cases drawn from the discovery report (50k --print-shape, filter +
length, group_by) on synthetic inputs, AOT binary, min/median/max
across N runs. The user runs it on their workstation; cherry-pick
wins land as separate commits with measured numbers in the message.
`rumil_parsers 0.8.0` ships the JSON AST split (`JsonNumber` →
`JsonInt | JsonDouble`) along with the HCL decoder fix originally
adopted under the local 0.7.1 override. Lambé's constraint moves from
`^0.7.0` to `^0.8.0`.

Single consumer: `lib/src/schema/parser.dart`'s `_kindOf` switch case
maps both `JsonInt()` and `JsonDouble()` to `'number'` — preserves the
JSON Schema `type: number` semantics where lambé's schema layer reads
the JSON AST directly, while letting downstream type-flow analysis
specialize when it cares about the discrimination.
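
A minimal sketch of what such a switch might look like, assuming a sealed JSON AST. The class shapes and the `kindOf` name below are stand-ins for illustration; the real types live in `rumil_parsers` and the real function is the private `_kindOf`, neither of which is reproduced here.

```dart
// Stand-in JSON AST. The real rumil_parsers types may be shaped differently.
sealed class JsonValue {
  const JsonValue();
}

final class JsonNull extends JsonValue {
  const JsonNull();
}

final class JsonInt extends JsonValue {
  const JsonInt(this.value);
  final int value;
}

final class JsonDouble extends JsonValue {
  const JsonDouble(this.value);
  final double value;
}

final class JsonString extends JsonValue {
  const JsonString(this.value);
  final String value;
}

// Both numeric variants collapse to the JSON Schema kind 'number', so the
// schema layer behaves as it did before the split.
String kindOf(JsonValue v) => switch (v) {
      JsonNull() => 'null',
      JsonInt() || JsonDouble() => 'number',
      JsonString() => 'string',
    };

void main() {
  print(kindOf(const JsonInt(1))); // number
  print(kindOf(const JsonDouble(1.5))); // number
}
```

Because the class is sealed, the compiler checks the switch for exhaustiveness, so a future AST variant would surface as a compile error rather than a silent fallthrough.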

No user-visible behavior change at the lambé surface. Tests: 1639 pass
unchanged (the bump touches a single switch case that had no
behavior-level dependency on the flattened representation).

End-to-end CLI is 3.3× faster on parse-bound workloads, measured
against the discovery report's 0.8.0 baselines on a Linux x86_64
workstation. Most of the win is inherited from rumil 0.7's combinator
work; rumil_parsers 0.8.0's JSON AST split and capture-based parsing
contribute ~11% on the parse-bound cases and ~13% on group_by.

Each invariant runs as a separate `lam --assert` and reports the
failing invariant by name; failures don't short-circuit, so all
problems surface in one run. Picks up `./lam` if compiled, falls
back to `dart run bin/lam.dart`.

Invariants:
- at least one H2 release entry exists
- no duplicate H2 release entries
- the first heading is an H2
- the latest H2 matches `pubspec.yaml`'s version
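
The four invariants above can be sketched in plain Dart. This is a stand-in technique, not the script's implementation: the real script runs each invariant as a separate `lam --assert` query. The function name, the failure labels, and the assumption that an H2 heading's text starts with the version string are all hypothetical.

```dart
/// Returns the names of the invariants that fail; empty means all pass.
/// Every invariant is evaluated, so one failure does not hide another.
List<String> lintChangelog(String changelog, String pubspecVersion) {
  // Naive heading scan: it would also pick up '#' lines inside code fences.
  final headings = changelog
      .split('\n')
      .where((line) => RegExp(r'^#{1,6} ').hasMatch(line))
      .toList();
  final h2s = [
    for (final h in headings)
      if (h.startsWith('## ')) h.substring(3).trim(),
  ];

  final failures = <String>[];
  if (h2s.isEmpty) failures.add('at-least-one-release');
  if (h2s.toSet().length != h2s.length) failures.add('no-duplicate-releases');
  if (headings.isEmpty || !headings.first.startsWith('## ')) {
    failures.add('first-heading-is-h2');
  }
  if (h2s.isEmpty || !h2s.first.startsWith(pubspecVersion)) {
    failures.add('latest-matches-pubspec');
  }
  return failures;
}

void main() {
  const changelog = '## 0.9.0\n\n### Added\n\n## 0.8.0\n';
  print(lintChangelog(changelog, '0.9.0')); // []
  print(lintChangelog(changelog, '0.9.1')); // [latest-matches-pubspec]
}
```
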
New `lint-changelog` job compiles the AOT `lam` binary and runs the
script. Reuses `dart-lang/setup-dart` like the existing jobs.

A new subsection under Markdown shows extracting every release
version, the latest version, every subsection title, and the
no-duplicate-H2 invariant. Closes with a pointer to
`tool/lint_changelog.sh` as the in-repo example of these queries
gated by `--assert` in CI.

Adds a Tooling subsection mentioning `tool/lint_changelog.sh` and the
four invariants it gates.

`expect(query('[]', {}), [])` had no element type for the actual list,
which `dart analyze` rightly flagged as
`inference_failure_on_collection_literal`. Annotated as `<Object?>[]` to
match what `query` returns.

This was the last analyzer warning the project carried; lambé is now
analyze-clean across the board.
…output

doc/syntax.md was carrying compact-JSON output drift in 28 examples
plus a few outright lies (commentary-form output like "[Alice (25),
Bob (35), Carol (42)]" presented as if it were lam's output, the
"32" arithmetic result that's actually "32.0", "expected bool" that's
"expected boolean"). The pre-Tier-A `query → -> result` form was
itself a teaching abstraction that diverged from real CLI behavior.

Now every example is a `$ lam ...` invocation with the output captured
by actually running it. Examples that don't reference data use `lam
-n`; examples that do reference data use `data.json` (declared at the
top of the doc as a save-this-file block). Copy-paste works
end-to-end.

Doc grew from 599 to 737 lines (+138, ~23%). The growth is from
multi-line pretty-printed output that matches what users see; the
compact `-> [...]` form was hiding that cost from readers.
… follow-on

Re-ran tool/bench/cli_bench.sh against the new rumil_parsers (HCL AST
split + common.dart capture rewrites + YAML overflow fix on top of the
JSON pass). Numbers shifted modestly in the right direction:

- --print-shape big.json: 744 ms → 732 ms
- filter | length (50k):  747 ms → 742 ms
- group_by (1k records):   34 ms →  33 ms

The HCL fold-in doesn't directly help JSON workloads but the
common.dart precision/capture rewrites contribute marginal gains
through cleaner shared-helper paths. Total speedup vs 0.8.0 stays at
~3.3× — the headline is unchanged; the tail just got a touch tighter.
…roke

Two highlighter bugs surfaced during the 0.9.0 live REPL smoke test:

1. Pipe op names (`filter`, `map`, `text`, etc.) rendered uncoloured
   because `lambeGrammar` only listed the language keywords (`if`,
   `then`, `else`, `true`, `false`, `null`, `and`, `or`). The
   tokenizer correctly classified `filter` as a plain identifier;
   the highlighter had no rule to colour it.

   Fixed by routing pipe op names through `LangGrammar.types` (the
   semantically appropriate field — they're not language keywords,
   they're language-defined identifiers from the user's perspective)
   and adding a `TypeName() => _hMagenta` case in `_colorFor`. The
   list is sourced from `pipe_ops.dart`'s spec table so adding an
   op picks up colouring automatically.

   `lambeGrammar` is now `final` instead of `const` because
   `pipeOpNames` is built at runtime from the spec table. The const
   was incidental; nothing depended on it.

2. Forward-typing skipped re-tokenisation: appending a character at
   end-of-line took a fast path (`stdout.writeCharCode`) that wrote
   the new character verbatim without re-running the tokenizer over
   the buffer. Result: typing `filter` left it plain until a
   subsequent edit triggered a full redraw. The fast path made sense
   when the highlighter was hand-rolled and per-keystroke
   tokenisation was expensive; with `rumil_tokens` actually being
   fast, the fast path was a UX bug.

   Fixed by always going through `_redraw` on character insertion.
   Keywords and pipe op names now colour as soon as the trigger
   character is typed.

`map(t<TAB>` now offers `text`, `to_entries`, `type` (whichever accept
the element shape) instead of doing nothing useful. Inside `map(...)`
and `filter(...)`, bare pipe-op names like `text`, `length`, `to_entries`
are legal expressions in lambé (sugar for `. | op`), so the completer
should offer them as candidates when the user is typing a partial name
without a leading `.`.

Implementation: a third remainder context parser (`_bareIdentCtx`)
matches a partial identifier with no leading `.` or `|`. When the
parsed AST is a `Pipe` with a parameterised op and the remainder
matches a non-empty bare identifier, candidates are pipe ops accepted
on the element shape of the pipe input.
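
A rough sketch of the bare-identifier rule, with the shape filtering left out. The `pipeOpNames` subset and the `bareOpCandidates` helper are hypothetical; the real completer matches the remainder with a parser (`_bareIdentCtx`) and also checks which ops accept the element shape.

```dart
// Hypothetical subset of pipe-op names; the real list comes from the
// spec table in pipe_ops.dart.
const pipeOpNames = ['filter', 'length', 'map', 'text', 'to_entries', 'type'];

// A non-empty identifier with no leading '.' or '|'.
final _bareIdent = RegExp(r'^[A-Za-z_][A-Za-z0-9_]*$');

/// Candidates for a partial typed inside `map(` or `filter(`.
/// An empty partial, or one starting with '.' or '|', yields nothing so
/// that field completion can take over.
List<String> bareOpCandidates(String partial) {
  if (!_bareIdent.hasMatch(partial)) return const [];
  return [
    for (final name in pipeOpNames)
      if (name.startsWith(partial)) name,
  ];
}

void main() {
  print(bareOpCandidates('t')); // [text, to_entries, type]
  print(bareOpCandidates('le')); // [length]
  print(bareOpCandidates('.t')); // []
  print(bareOpCandidates('')); // []
}
```
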

Surfaced during the 0.9.0 live REPL smoke test: the new `text` op makes
`map(text)` a useful and discoverable pattern, but the completer
couldn't help users find it. Five new tests pin the behaviour:

- `map(t` → t-prefix pipe ops accepted on element shape
- `filter(le` → `length` (accepts list element shape)
- `map(.t` → field completion takes precedence (dot present)
- `map(` → field completion (empty bare partial doesn't trigger)
- bare `t` at top level → no pipe-op completion (not in pipe ctx)

100/100 completer tests pass (was 95).
…ke test

Two REPL-related fixes landed during the 0.9.0 live smoke test:
keyword-colouring for pipe op names (highlighter), and Tab completion
for bare pipe ops inside `map(...)` / `filter(...)` (completer). Both
are user-visible behaviour changes worth documenting under their
existing or new REPL subsections.

Surfaced during the 0.9.0 live REPL smoke test. CHANGELOG paragraphs
have soft line wraps in source ("queries\nagainst it"), and the prior
empty-string-on-break behaviour produced "queriesagainst it" — words
ran together at every wrap. mdast-util-to-string has the same default
and is widely cited as awkward for this reason.

New behaviour:
- soft_break (single newline in source, paragraph continuation) →
  ' '. Preserves word boundaries without imposing line structure.
- hard_break (`\` at end of line or two trailing spaces, explicit
  break) → '\n'. Preserves authorial intent. Users who want a fully
  flat string can post-process with a whitespace collapser.

Deliberate divergence from mdast-util-to-string. The op's docstring
documents the choice; new tests pin both behaviours.

CommonMark parsers emit hard_break + soft_break in sequence for
"hello  \nworld" (the explicit break followed by the line wrap), so
the hard-break test asserts the relevant invariants (newline present,
words present) rather than a literal expected-string match.
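
The break handling can be illustrated with a small recursive walk. This is a sketch, not the op's source: it assumes nodes are maps with `type`, `children`, and `text` keys, and the `extractText` name and the `text` node type in the sample data are assumptions for illustration.

```dart
/// Concatenates prose from a tree of `{type, children, text}` maps.
String extractText(Object? node) => switch (node) {
      // A soft break keeps the word boundary without imposing a line break.
      {'type': 'soft_break'} => ' ',
      // A hard break preserves the author's explicit line break.
      {'type': 'hard_break'} => '\n',
      // Leaf node carrying literal text.
      {'text': final String text} => text,
      // Container node: recurse into the children and join the results.
      {'children': final List<Object?> children} =>
        children.map(extractText).join(),
      _ => '',
    };

void main() {
  final paragraph = {
    'type': 'paragraph',
    'children': [
      {'type': 'text', 'text': 'queries'},
      {'type': 'soft_break'},
      {'type': 'text', 'text': 'against it'},
    ],
  };
  print(extractText(paragraph)); // queries against it
}
```

With the previous empty-string behaviour the same input would have produced "queriesagainst it".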
Surfaced during the 0.9.0 live REPL smoke test:
`.children | map(.<TAB>` on a real markdown CHANGELOG returned no
candidates, even though every visible child is a {type, level,
children} map. The static shape system collapses heterogeneous lists
to `SList<SAny>` (correct, conservative), which gives completion no
field hints to offer.

Fix: when the static element shape resolves to `SAny` and we have the
underlying `data`, navigate the actual values along the pipe's input
AST and shape-of the first element. The recovered shape feeds back
into the existing field-completion path. No structural change to the
Shape ADT.

`_navigate` is deliberately restricted to a small AST shape
(Identity / Field / Access / Index, plus shape-preserving pipe ops
filter / sort / sort_by / unique / unique_by / reverse). Per-element
ops like map / group_by / to_entries change the element family and
are excluded — better to give up than to guess at the new shape.

Completion never runs the user's query — only structural navigation —
so cost stays bounded (one `[0]` access per Pipe step).
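
A simplified sketch of the sampling idea, assuming the pipe input is a plain chain of field accesses. `navigate` and `sampleFieldNames` here are hypothetical helpers; the real `_navigate` walks the AST node kinds listed above and produces a Shape, not a list of key names.

```dart
/// Follows a chain of field names through nested maps; returns null as
/// soon as a step cannot be taken.
Object? navigate(Object? data, List<String> fields) {
  var current = data;
  for (final f in fields) {
    current = current is Map ? current[f] : null;
  }
  return current;
}

/// Field names sampled from the first element of a list, or empty when
/// there is nothing usable to sample (empty list, null, non-map element).
List<String> sampleFieldNames(Object? data) => switch (data) {
      [final Map<String, Object?> first, ...] => first.keys.toList(),
      _ => const [],
    };

void main() {
  final doc = {
    'type': 'document',
    'children': [
      {'type': 'heading', 'level': 2, 'children': <Object?>[]},
      {'type': 'paragraph', 'children': <Object?>[]},
    ],
  };
  // Roughly what `.children | map(.<TAB>` needs: the keys of element [0].
  print(sampleFieldNames(navigate(doc, ['children'])));
  // [type, level, children]
  print(sampleFieldNames(<Object?>[])); // []
  print(sampleFieldNames(null)); // []
}
```

Only the first element is inspected, which is what keeps the cost at one access per step.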

Eight new tests pin: positive cases on a heterogeneous-by-`type` list
(map/filter/sort_by/reverse threading); empty list and null data
fall back gracefully to no candidates. 107/107 completer tests pass.
…tion fixes

Two more fixes from the 0.9.0 live REPL smoke test: `text` op's
break-handling change (soft → space, hard → newline) and the
completer's data-sampling fallback for heterogeneous lists. Both are
real user-visible behaviour changes.

`dart pub publish --dry-run` revealed three classes of dev-only
content shipping to pub.dev:

1. The local `lam` AOT binary (7MB, Linux x86_64). Useless to pub
   consumers — they'll either `dart run` from source or
   `dart pub global activate` which rebuilds. Now in `.pubignore`.

2. Scratch planning docs (`*.scratch.md` ~75KB). Already gitignored
   for the working tree, but `.pubignore` doesn't inherit from
   `.gitignore`, so they were leaking into published payloads. Added
   the matching `.pubignore` rule (with a comment noting why we
   repeat ourselves).

3. `tool/probe_completer.dart` — manual exploratory probe used during
   completer development. Test coverage in `test/completer_test.dart`
   has long since superseded it. Deleted.

Net: 3MB compressed → 214KB compressed. The package now contains only
what users need: source, tests, docs, executables (`bin/lam.dart`,
`bin/mcp_server.dart`, `tool/release_prep.sh`, `tool/manpage.dart`,
`tool/gen_version.dart`, `tool/lint_changelog.sh`, `install.sh`).

server.json is the MCP registry manifest template, regenerated by
.github/workflows/release.yml at release time and consumed by the
MCP registry publish flow. It carries a placeholder version
(`0.0.0-placeholder`) that would be actively misleading if shipped to
pub.dev. No runtime code references it; only `tool/release_prep.sh`
(release-time validation) and the GitHub workflow itself.

Pub clients never use it; exclude it.

The audit pass flagged that lambé's typed-ADT walks (Shape, LamExpr,
JsonValue) consistently use switch expressions, but four spots that
walk the untyped Object? JSON model still used is-cascade if-else
chains. Each had the same null/bool/num/String/List/Map shape; each
fits cleanly into a Dart 3 switch with type patterns.

Touched:
- evaluator.dart#_index — list/map/string indexing
- evaluator.dart#_slice — list/string slicing
- output.dart#_describeCellKind — list/map/runtimeType labelling
- repl.dart#_colorJson — null/bool/num/string/list/map JSON colorizer

Each conversion is length-equivalent or shorter; no behavior change;
exhaustively typed against the inhabited Object? cases. The pattern
"is-cascades belong at the typed/untyped boundary, not in the typed
core" now holds throughout.
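
For illustration, the kind of conversion described might look like this on a generic value describer. The function names and labels are hypothetical and are not taken from the four touched functions.

```dart
// Before: an is-cascade over the untyped JSON model.
String describeBefore(Object? v) {
  if (v == null) return 'null';
  if (v is bool) return 'boolean';
  if (v is num) return 'number';
  if (v is String) return 'string';
  if (v is List) return 'array';
  if (v is Map) return 'object';
  return v.runtimeType.toString();
}

// After: the same logic as a Dart 3 switch expression with type patterns.
String describe(Object? v) => switch (v) {
      null => 'null',
      bool() => 'boolean',
      num() => 'number',
      String() => 'string',
      List() => 'array',
      Map() => 'object',
      _ => v.runtimeType.toString(),
    };

void main() {
  print(describe(1.5)); // number
  print(describe([1, 2])); // array
  print(describe(null)); // null
}
```
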

1652/1652 tests pass.
…mendations

Two cleanups in bin/mcp_server.dart surfaced by the audit pass plus a
real find while reading the file:

1. The error path of every handler built `CallToolResult(content:
   [TextContent(text: ...)], isError: true)` inline — eight repetitions
   of the same boilerplate. Extracted to `_errorResult(message)`. The
   non-error result builder stays inline since it's rare and doesn't
   benefit from the abstraction.

2. The MCP server's `instructions` and `Markdown query patterns`
   blocks still recommended `.children[0].text` for heading text
   extraction, which is structurally wrong for non-trivial markdown
   (nested emphasis, links, inline code). The 0.9.0 `text` op was
   created exactly to replace this pattern; AGENTS.md and the recipes
   were updated in earlier commits, but the MCP instructions text
   hadn't been. Now uses `text` everywhere, with an explicit note
   about why `.children[0].text` was wrong.

The `_handleCheck` `{"ok": false, "error": ...}` JSON-shaped error
path stays inline — it's structurally different from the standard
"Error: ..." prefixed `isError: true` shape and shouldn't share the
helper.

1652/1652 tests pass.
…t switch

`parseAst` and `queryString` shared a near-identical
`switch (result) { Success() => value, Partial() | Failure() => throw ... }`
pattern. Audit flagged the duplication. queryString now calls parseAst
to get the AST, removing the second copy of the parse-error
rendering. The shared parse path means the error message for
queryString and parseAst is guaranteed to match exactly.
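
A toy version of the deduplication, assuming a sealed result type with `Success`, `Partial`, and `Failure` cases. Everything here (the toy `parse`, the string-typed AST, the error text) is a stand-in; only the three case names and the two function names come from the commit.

```dart
sealed class ParseResult {}

final class Success extends ParseResult {
  Success(this.ast);
  final String ast;
}

final class Partial extends ParseResult {}

final class Failure extends ParseResult {}

// Toy parser standing in for the real one.
ParseResult parse(String source) {
  if (source.isEmpty) return Failure();
  if (source.endsWith('|')) return Partial();
  return Success('ast($source)');
}

/// The single place where a non-success result becomes an exception.
String parseAst(String source) => switch (parse(source)) {
      Success(:final ast) => ast,
      Partial() || Failure() =>
        throw FormatException('Parse error in query', source),
    };

/// Reuses parseAst, so its parse errors match parseAst's exactly.
String queryString(String source) => 'eval(${parseAst(source)})';

void main() {
  print(queryString('.name')); // eval(ast(.name))
  try {
    queryString('.name |');
  } on FormatException catch (e) {
    print(e.message); // Parse error in query
  }
}
```
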

1652/1652 tests pass; no API surface change.
…pub.dev

The pre-0.9.0 audit pass + cross-vendor research found that lambé's
agent-facing docs were split across AGENTS.md (CLI cheat sheet) and
AI.md (natural-language → query table, syntax reference, markdown
data model) — but each agent platform reads only one of them, so no
agent saw the full picture.

Consolidates both into a single AGENTS.md focused on pure tool-use
guidance:
- "When / when not to" decision aid (from AI.md)
- Natural language → query table (the highest-leverage single artifact)
- Syntax reference (property access, pipeline ops, expressions)
- CLI flags worth knowing (-n, --null-input; --no-pretty; --explain;
  --print-shape; --schema; --assert; --ndjson; --flatten-cells)
- Markdown data model with the `text` op recommendation
  (replaces the broken `.children[0].text` pattern)
- Error patterns (null vs throw, OutputShapeError, parse error)
- Format auto-detection rules
- Library API one-liners + lambe_test matchers
- MCP server framed as the "if shell access isn't available"
  fallback, not the primary distribution

Excluded from the pub.dev publish payload via `.pubignore` because
pub.dev consumers are Dart developers, not AI agents working in a
checked-out repo. Same exclusion applies to `.claude/` and `.agents/`
(skill directories that may land later). The README.md is the
Dart-developer-facing surface; AGENTS.md is the agent-facing surface
on GitHub.

Updated `test/doc_examples_test.dart` to parse and evaluate every
`lam '...'` example in the consolidated AGENTS.md against a fixture,
guarding against doc drift the same way it did for AI.md before.
38/38 doc-example tests pass.

Adds a SKILL.md following the Agent Skills open standard
(agentskills.io / agentskills.so) — the cross-vendor format adopted by
Anthropic, Microsoft (Microsoft Agent Framework), Vercel, and others.

Key facts about the format:
- Discovery cost is ~100 tokens per skill (name + description in the
  system prompt at session start). Activation loads the full body only
  when the agent identifies a matching task.
- Format is byte-identical across vendors: YAML frontmatter
  (name, description, optional license/compatibility/metadata) +
  Markdown body, recommended ≤500 lines.
- Cross-tool: Claude (Code, .ai, API), Microsoft Agent Framework,
  agentskills.so registry. Gemini CLI also reads the format but is
  being sunset June 18 and replaced with Antigravity CLI; treat the
  Google angle as wobbly.

The skill body is a tighter subset of AGENTS.md — focused on "core
moves" plus the markdown data model (lambé's most distinctive
feature) plus the common gotchas. AGENTS.md remains the broader
reference loaded by repo-rooting agents (Cursor, Copilot, Claude
Code project mode); the skill is the focused entry point loaded
on-demand when the agent identifies a structured-data-query task.

`.gitignore` adjusted: `.claude/*` (not `.claude/`) so the negation
re-including `.claude/skills/` actually fires. Without this, git
won't descend into a directory ignored by name, so `!` patterns
can't reach inside.

`.pubignore` already excludes `.claude/` from the pub.dev publish
payload; the skill is for GitHub / Agent-Skills-compatible clients,
not for Dart developers running `dart pub global activate lambe`.

@hakimjonas hakimjonas changed the title Test/rumil 0.7 Lambé 0.9.0 May 23, 2026

CI was using `dart-lang/setup-dart@v1.7.2` without an `sdk:` parameter,
which floats to whatever the stable channel's latest is at job time.
Locally we were on 3.11.4 and CI fetched 3.12.0, so the formatter
disagreed: 3.12.0 wraps a long `test('...', () => fail(...))` call
differently than 3.11.4 did, and the format job correctly flagged the
drift.

Pin all four jobs (analyze, format, test, lint-changelog) plus the
release workflow to `sdk: 3.12.0` so local and CI agree. Reformat the
single affected file under 3.12.0. Local Dart bumped to 3.12.0 to
match.

The same pin needs to land in rumil-dart's CI to keep the family
coherent; that lands as a separate PR there.

GitHub Actions runners batch a child process's stdout in a way that
defeats the timing assertions: the parent test's `process.stdout.forEach`
receives all four lines together at EOF, even though `lam` itself
emits them line-by-line as they arrive (verified locally against TTY
and file-redirected stdout). The test is a useful local smoke check
that lambé's --ndjson actually flushes per line, but it isn't
reliable under CI's stdio plumbing.

Skip when CI=true is set; keep the assertion local. The test still
runs in every developer environment.

@hakimjonas hakimjonas merged commit ae41947 into main May 23, 2026
4 checks passed
@hakimjonas hakimjonas deleted the test/rumil-0.7 branch May 23, 2026 21:54