feat(res-to-affine): tree-sitter AST walker for side-effect-import (Phase 2b, Refs #57)#322
Merged
Merged
Conversation
…a, Refs #57) Phase 2 of the `.res → .affine` migration assistant (#57) replaces the Phase-1 line-regex scanner with a tree-sitter AST walker. The grammar itself is manifest-vendored in `editors/tree-sitter-rescript/` (since #314), but the actual build pipeline that turns the manifest into a loadable parser had not been wired up. This commit is that wiring. - `justfile` gains an `install-grammar` recipe that wraps the existing `editors/tree-sitter-rescript/scripts/install.sh` so the bootstrap step is a discoverable `just` recipe alongside `build`/`test`/etc. - `editors/tree-sitter-rescript/scripts/install.sh` updates its error message when the `tree-sitter` CLI is missing to point at both the Rust-native install (`cargo install tree-sitter-cli`, the repo's preferred path for CLI tooling per CLAUDE.md) and the existing Node-based one (`npm install -g tree-sitter-cli`). The script behaviour is unchanged when the CLI is present. - `.github/workflows/ci.yml` gains a `migration-assistant` job that installs the tree-sitter CLI via npm (the fast CI path — a pre-built binary, ~5 s vs. ~5 min for `cargo install` from source), runs `just install-grammar`, verifies `tools/vendor/tree-sitter-rescript/ src/parser.c` was produced, and smoke-parses the existing `tools/res-to-affine/test/fixtures/sample.res` fixture. The job sits alongside `vscode-smoke` as a Node-using carve-out under the same reasoning (manifest dep on an npm-distributed tool whose binary output is what we actually consume; no new TS source in this repo). - `editors/tree-sitter-rescript/README.md` documents the dual install path, the new justfile recipe, and the CI gate. Phase 2b (the actual walker — `walker.ml`, AST-based detection of `side-effect-import`, `--engine` CLI flag, snapshot tests vs. the regex scanner) stacks on top of this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hase 2b, Refs #57) Lands the first AST-based detection pass on top of the Phase-2a grammar build pipeline (#321). New module `tools/res-to-affine/ walker.ml` shells out to the vendored `tree-sitter` CLI, parses its default s-expression output (with embedded `[row, col]` byte ranges), and walks the AST to find the `side-effect-import` anti-pattern structurally rather than via the column-0-anchored regex the Phase-1 scanner uses. Detection upgrade vs. Phase 1 ----------------------------- The walker flags `let _ = Mod.value` only when the binding sits at module top level — direct child of `source_file`, or direct child of a `block` that is the body of a `module_declaration`. The Phase-1 regex matches any column-0 occurrence of the same shape. On the existing `sample.res` fixture the walker reports 1 finding (the intended top-level `let _ = Pixi.Sound.register` on line 8); the scanner reports the same 1 plus would also have reported the same pattern nested inside a function body. The regex pattern that #319 band-aided with column-0 anchoring is eliminated structurally by the walker. CLI --- New `--engine={scanner,walker}` flag on the CLI (default: scanner, preserving Phase-1 behaviour); `--grammar-dir DIR` overrides the walker's parser location (default: `tools/vendor/tree-sitter-rescript`, matching `install.sh` output). When the walker hits any error (grammar missing, tree-sitter not on PATH, s-exp parse failure), it prints a one-line reason to stderr and falls back to the scanner — the CLI never bails because of walker problems. Tests ----- `test/test_walker.ml` adds two end-to-end tests under a new `res-to-affine-walker` alcotest run. Both auto-skip if the `tree-sitter` CLI is missing or the vendored grammar has not been built, so `dune runtest` on a fresh clone (no bootstrap) still passes. The repo-root discovery walks up from cwd looking for `dune-project` to find the source-tree grammar path; the dune sandbox otherwise hides it. CI -- The existing `build` job now installs `tree-sitter-cli` (npm, fast) and runs `install.sh` before `dune runtest`, so the walker tests actually execute rather than auto-skip. The migration-assistant gate added in #321 stays — it remains the dedicated first-signal job for grammar-pin drift. Scope discipline ---------------- `raw-js`, `untyped-exception`, `mutable-global` remain scanner-only; their AST counterparts land in Phase 2c. The walker exposes a stable surface (`Walker.scan : grammar_dir:string -> path:string -> source:string -> Scanner.finding list`) that 2c extends rather than re-architects. Stack: #321 (Phase 2a) → this PR (Phase 2b) → Phase 2c. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Lands the first AST-based detection pass on top of the Phase-2a grammar build pipeline (#321). New module
tools/res-to-affine/walker.mlshells out to the vendoredtree-sitterCLI, parses its default s-expression output (with embedded[row, col]byte ranges), and walks the AST to find theside-effect-importanti-pattern structurally rather than via the column-0-anchored regex Phase 1 uses.Detection upgrade vs. Phase 1
The walker flags
let _ = Mod.valueonly when the binding sits at module top level — direct child ofsource_file, or direct child of ablockthat is the body of amodule_declaration. The Phase-1 regex matches any column-0 occurrence of the same shape.On the existing
sample.resfixture the walker reports 1 finding (the intended top-levellet _ = Pixi.Sound.registeron line 8); the scanner reports the same 1 plus the same shape nested inside function bodies. The regex pattern that #319 band-aided with column-0 anchoring is eliminated structurally by the walker.(Walker covers only
side-effect-importin 2b — the other three anti-patterns are scanner-only until 2c.)CLI
--engine={scanner,walker}— defaultscanner, preserves Phase-1 behaviour.--grammar-dir DIR— override walker's parser location (default:tools/vendor/tree-sitter-rescript, matchesinstall.shoutput).tree-sitternot on PATH, s-exp parse failure), the binary prints a one-line reason to stderr and falls back to the scanner — never bails for walker problems.Walker design
[row, col]byte ranges and optionalfield:prefixes per child.cd grammar_dirtrick, not its config file). The walker redirects stderr to/dev/null; errors that matter still surface via the exit code.Walker.scan : grammar_dir:string -> path:string -> source:string -> Scanner.finding list. Phase 2c extends rather than re-architects.Tests
test/test_walker.mladds two end-to-end tests under a newres-to-affine-walkeralcotest run. Both auto-skip iftree-sitteris missing or the vendored grammar hasn't been built, sodune runteston a fresh clone (no bootstrap) still passes.Repo-root discovery walks up from the test cwd looking for
dune-project— necessary because the dune sandbox hides the source-treetools/vendor/from naive relative-path math.CI
buildjob now installstree-sitter-cli(npm, fast — pre-built binary, ~5 s) and runsinstall.shbeforedune runtest, so the walker tests actually execute rather than auto-skip.migration-assistantjob added in feat(res-to-affine): land tree-sitter grammar build pipeline (Phase 2a, Refs #57) #321 stays — it remains the dedicated first-signal job for grammar-pin drift, independent of the OCaml build.What is NOT in this PR
raw-js,untyped-exception,mutable-globalwalker ports → Phase 2c.inline lambda callback record,oversized function(the two anti-patterns deferred from Phase 1 entirely) → Phase 2c.--engine=walkerto default → Phase 2c, after the four anti-patterns reach parity.Stack
#321 (Phase 2a) → this PR (Phase 2b) → Phase 2c
Base is
mainper the orphan-trap guidance, so the diff is cumulative until #321 lands, then this rebases.Test plan
buildjob now runs walker tests; themigration-assistantjob still validates the grammar build.just install-grammar && dune runtest tools/res-to-affine/reports both walker tests[OK](verified).--engine=walkeronsample.resproduces exactly the line-8side-effect-importfinding (verified).🤖 Generated with Claude Code