feat(res-to-affine): Phase-1 migration assistant skeleton (Refs #57) #314

Merged: hyperpolymath merged 1 commit into main from feat-57-res-to-affine-tree-sitter on May 20, 2026

@hyperpolymath

Summary

Lands the .res→.affine migration assistant proposed in #57 as a tight Phase-1 cut:

  • tools/res-to-affine/ — OCaml CLI built by the repo's dune toolchain. Text-scans a .res source for 4 of the 6 anti-patterns from idaptik's Wave 3 pilot (side-effect imports, %raw blocks, untyped exceptions / Promise.catch, mutable globals via :=) and emits a .affine skeleton with migration markers + the quoted original at the bottom for reference.
  • editors/tree-sitter-rescript/ — manifest-only vendoring of the canonical rescript-lang/tree-sitter-rescript grammar, pinned at commit 990214a (v6.0.0, MIT). Not used yet — wired up in Phase 2.
  • docs/MIGRATION-ASSISTANT.adoc — architectural decision: canonical grammar, three-phase plan, alternatives considered.
  • tools/res-to-affine/test/ — alcotest snapshot suite. Synthetic fixture exercises all four Phase-1 patterns; spot-checked against gitbot-fleet/bots/sustainabot/bot-integration/src/*.res (correctly finds 0 in Config.res/Webhook.res/Oikos.res, 2 in Main.res, 3 in GitHubAPI.res).
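
As a rough illustration of the text-scan approach in the first bullet (not the actual OCaml implementation; sketched in Python for brevity), a line-oriented scanner plus skeleton emitter might look like the following. The regexes, the `Finding` type, the `MIGRATE[...]` marker format and the `//` comment syntax are all assumptions made for this sketch; only the four kind names are taken from this PR series.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    kind: str   # one of the four Phase-1 anti-pattern kinds
    line: int   # 1-based line number in the .res source
    text: str   # the offending line, stripped


# Hypothetical patterns: the real scanner's regexes are not shown in this PR.
PATTERNS = [
    ("side-effect-import", re.compile(r"^let\s+_\s*=\s*[A-Z][\w.]*")),
    ("raw-js", re.compile(r"%raw\b")),
    ("untyped-exception", re.compile(r"\btry\b|Js\.Exn\b|Promise\.catch\b")),
    ("mutable-global", re.compile(r"^\S.*:=")),
]


def scan(source: str) -> list[Finding]:
    """Return one finding per (line, kind) match, in source order."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in PATTERNS:
            if pattern.search(line):
                findings.append(Finding(kind, lineno, line.strip()))
    return findings


def emit_skeleton(module_name: str, source: str) -> str:
    """Emit migration markers first, then the quoted original at the bottom."""
    out = [f"// module {module_name} (migration skeleton)"]
    for f in scan(source):
        out.append(f"// MIGRATE[{f.kind}] line {f.line}: {f.text}")
    out.append("// ---- original source ----")
    out.extend(f"// {line}" for line in source.splitlines())
    return "\n".join(out) + "\n"
```

A pure line scan like this cannot see nesting or scope, which is consistent with the remaining two anti-patterns being deferred to the AST-based phase.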

This PR uses Refs not Closes for #57 — the proposal is multi-phase and this is Phase 1 only. Phase 2 (tree-sitter AST walker) and Phase 3 (partial translation of pure-structural forms) are scoped in tools/res-to-affine/README.md and the ADR.

Cross-references: hyperpolymath/gitbot-fleet#148 (consumer — 2,133 LOC ReScript subtree blocked on #57).

Why Phase-1 is text-scan, not tree-sitter

The user picked tree-sitter integration as the long-run direction (the right call per LESSONS.md, "tree-sitter as canonical grammar source"), but landing all of it in one PR would leave no shippable artefact if Phase 2 stalls. Phase 1 is deliberately small and useful in isolation:

  • Runs against any .res file today.
  • Detects 4 of 6 anti-patterns reliably (regex-based; the other two need real AST).
  • Same Emitter interface Phase 2 will use, so the swap is local.
  • Gates the architectural commitment behind something that already pays its way.

See docs/MIGRATION-ASSISTANT.adoc for the full rationale and the alternatives considered (bs-tools typed AST, hand-rolled OCaml parser, pattern-detector-only).

Test plan

  • dune build clean repo-wide (no regression to existing 80+ modules under lib/).
  • dune test tools/res-to-affine/test/ — 3/3 OK (snapshot, scanner kinds, module-name derivation).
  • Run against gitbot-fleet .res files end-to-end: scanner output matches expectations (clean files report 0, files with try/Js.Exn/Promise.catch report the right line numbers).
  • CI (this PR) — first run on push.

Files

docs/MIGRATION-ASSISTANT.adoc                          NEW  130 lines  ADR
editors/tree-sitter-rescript/README.md                 NEW   46 lines  vendoring manifest
editors/tree-sitter-rescript/package.json              NEW   17 lines
editors/tree-sitter-rescript/scripts/install.sh        NEW   38 lines  +x
tools/res-to-affine/README.md                          NEW  117 lines  usage + Phase plan
tools/res-to-affine/dune                               NEW   11 lines
tools/res-to-affine/main.ml                            NEW   77 lines  cmdliner CLI
tools/res-to-affine/scanner.{ml,mli}                   NEW  ~120 lines
tools/res-to-affine/emitter.{ml,mli}                   NEW  ~100 lines
tools/res-to-affine/test/{dune,test_emit.ml}           NEW   80 lines
tools/res-to-affine/test/fixtures/sample.res           NEW   30 lines
tools/res-to-affine/test/expected/sample.affine        NEW   60 lines  snapshot

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lands the .res→.affine migration assistant proposed in #57 as a tight
Phase-1 cut: an OCaml CLI under tools/res-to-affine/ that text-scans a
ReScript source for four of the six anti-patterns from idaptik's Wave 3
pilot (side-effect imports, %raw blocks, untyped exceptions / Promise.catch,
mutable globals via :=) and emits a .affine skeleton with migration
markers + the quoted original.

Architecture decision recorded in docs/MIGRATION-ASSISTANT.adoc:
canonical .res grammar is rescript-lang/tree-sitter-rescript, vendored
manifest-only under editors/tree-sitter-rescript/ (pinned commit
990214a, MIT, compatible with this repo's MPL-2.0). Phase 2 swaps the
text scanner for a tree-sitter AST walker; Phase 3 does partial
translation of pure-structural forms. See tools/res-to-affine/README.md
for the full plan and tools/res-to-affine/test/ for snapshot tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath marked this pull request as ready for review May 20, 2026 23:18
@hyperpolymath hyperpolymath merged commit 7e9f181 into main May 20, 2026
12 of 13 checks passed
@hyperpolymath hyperpolymath deleted the feat-57-res-to-affine-tree-sitter branch May 20, 2026 23:18
hyperpolymath added a commit that referenced this pull request May 21, 2026
…a, Refs #57) (#321)

* feat(res-to-affine): land tree-sitter grammar build pipeline (Phase 2a, Refs #57)

Phase 2 of the `.res → .affine` migration assistant (#57) replaces the
Phase-1 line-regex scanner with a tree-sitter AST walker. The grammar
itself is manifest-vendored in `editors/tree-sitter-rescript/` (since
#314), but the actual build pipeline that turns the manifest into a
loadable parser had not been wired up. This commit is that wiring.

- `justfile` gains an `install-grammar` recipe that wraps the existing
  `editors/tree-sitter-rescript/scripts/install.sh` so the bootstrap
  step is a discoverable `just` recipe alongside `build`/`test`/etc.
- `editors/tree-sitter-rescript/scripts/install.sh` updates its error
  message when the `tree-sitter` CLI is missing to point at both the
  Rust-native install (`cargo install tree-sitter-cli`, the repo's
  preferred path for CLI tooling per CLAUDE.md) and the existing
  Node-based one (`npm install -g tree-sitter-cli`). The script
  behaviour is unchanged when the CLI is present.
- `.github/workflows/ci.yml` gains a `migration-assistant` job that
  installs the tree-sitter CLI via npm (the fast CI path — a pre-built
  binary, ~5 s vs. ~5 min for `cargo install` from source), runs
  `just install-grammar`, verifies
  `tools/vendor/tree-sitter-rescript/src/parser.c` was produced, and
  smoke-parses the existing
  `tools/res-to-affine/test/fixtures/sample.res` fixture. The job sits
  alongside `vscode-smoke` as a Node-using carve-out under the same
  reasoning (manifest dep on an npm-distributed tool whose binary
  output is what we actually consume; no new TS source in this repo).
- `editors/tree-sitter-rescript/README.md` documents the dual install
  path, the new justfile recipe, and the CI gate.

Phase 2b (the actual walker — `walker.ml`, AST-based detection of
`side-effect-import`, `--engine` CLI flag, snapshot tests vs. the
regex scanner) stacks on top of this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): drop `just` from migration-assistant job (not preinstalled)

The 2a `migration-assistant` job ran `just install-grammar`, but
GitHub Actions ubuntu-latest runners do not ship the `just` task
runner preinstalled. Direct script invocation
`./editors/tree-sitter-rescript/scripts/install.sh` is equivalent
and removes the implicit dependency on a tool nothing else in this
workflow uses. The justfile recipe stays for local developer
ergonomics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 21, 2026
…hase 2b, Refs #57) (#322)

* feat(res-to-affine): tree-sitter AST walker for side-effect-import (Phase 2b, Refs #57)

Lands the first AST-based detection pass on top of the Phase-2a
grammar build pipeline (#321). New module
`tools/res-to-affine/walker.ml` shells out to the vendored
`tree-sitter` CLI, parses its default s-expression output (with
embedded `[row, col]` position ranges), and walks the AST to find the
`side-effect-import` anti-pattern structurally rather than via the
column-0-anchored regex the Phase-1 scanner uses.
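
To make the s-expression step concrete, here is a minimal sketch (in Python, not the OCaml of `walker.ml`) of turning `tree-sitter parse`-style output into a tree with zero-based `[row, col]` start and end positions. The `Node` type, the tokenizer and the node names in `EXAMPLE` are illustrative assumptions; the real grammar's node names and the walker's actual parser may differ.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    start: tuple[int, int] = (0, 0)   # zero-based (row, col)
    end: tuple[int, int] = (0, 0)
    children: list["Node"] = field(default_factory=list)


# Tokens: parens, "[r, c] - [r, c]" ranges, "field:" labels, node names.
TOKEN = re.compile(
    r"(?P<open>\()"
    r"|(?P<close>\))"
    r"|(?P<range>\[\d+,\s*\d+\]\s*-\s*\[\d+,\s*\d+\])"
    r"|(?P<field>[A-Za-z_]\w*:)"
    r"|(?P<name>[A-Za-z_]\w*)"
)


def parse_sexp(text: str) -> Node:
    """Build a Node tree; raise ValueError on unbalanced or empty input."""
    stack: list[Node] = []
    root = None
    for m in TOKEN.finditer(text):
        tok = m.lastgroup
        if tok == "name":
            stack.append(Node(m.group()))
        elif tok == "range":
            if not stack:
                raise ValueError("range before any node name")
            r0, c0, r1, c1 = map(int, re.findall(r"\d+", m.group()))
            stack[-1].start, stack[-1].end = (r0, c0), (r1, c1)
        elif tok == "close":
            if not stack:
                raise ValueError("unbalanced ')'")
            node = stack.pop()
            if stack:
                stack[-1].children.append(node)
            else:
                root = node
        # "open" and "field" tokens carry nothing this sketch needs.
    if root is None or stack:
        raise ValueError("unbalanced or empty s-expression")
    return root


# Illustrative input only; node names are not taken from the real grammar.
EXAMPLE = """\
(source_file [0, 0] - [1, 0]
  (let_declaration [0, 0] - [0, 27]
    (let_binding [0, 4] - [0, 27]
      pattern: (value_identifier [0, 4] - [0, 5]))))
"""
```

Rows are zero-based in this output, so a reported line number would be `node.start[0] + 1`.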

Detection upgrade vs. Phase 1
-----------------------------

The walker flags `let _ = Mod.value` only when the binding sits at
module top level — direct child of `source_file`, or direct child of
a `block` that is the body of a `module_declaration`. The Phase-1
regex matches any column-0 occurrence of the same shape. On the
existing `sample.res` fixture the walker reports 1 finding (the
intended top-level `let _ = Pixi.Sound.register` on line 8); the
scanner reports the same 1, but would also flag the same pattern
nested inside a function body if it happened to start at column 0.
The false-positive class that #319 band-aided with column-0
anchoring is eliminated structurally by the walker.
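
The positional rule above can be sketched as a small tree walk. This is a Python illustration under stated assumptions, not the code in `walker.ml`: the `Node` type and the `let_declaration` kind are hypothetical, "body of a `module_declaration`" is modelled as a `block` that is a direct child of it (the real grammar may nest differently), and the check that the binding has the `let _ = Mod.value` shape is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    row: int = 0                       # zero-based start row
    children: list["Node"] = field(default_factory=list)


def top_level_nodes(root: Node, target_kind: str = "let_declaration") -> list[Node]:
    """Collect target nodes that sit at module top level.

    Top level means: direct child of `source_file`, or direct child of a
    `block` whose parent is a `module_declaration`.
    """
    found: list[Node] = []

    def visit(node, parent, grandparent):
        at_top = parent is not None and (
            parent.kind == "source_file"
            or (
                parent.kind == "block"
                and grandparent is not None
                and grandparent.kind == "module_declaration"
            )
        )
        if node.kind == target_kind and at_top:
            found.append(node)
        for child in node.children:
            visit(child, node, parent)

    visit(root, None, None)
    return found
```

A binding inside a function body has a `block` parent whose own parent is not a `module_declaration`, so it is skipped regardless of which column it starts in.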

CLI
---

New `--engine={scanner,walker}` flag on the CLI (default: scanner,
preserving Phase-1 behaviour); `--grammar-dir DIR` overrides the
walker's parser location (default: `tools/vendor/tree-sitter-rescript`,
matching `install.sh` output). When the walker hits any error
(grammar missing, tree-sitter not on PATH, s-exp parse failure), it
prints a one-line reason to stderr and falls back to the scanner —
the CLI never bails because of walker problems.
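
The fallback behaviour described here amounts to a small dispatch wrapper. The sketch below is Python for illustration only; `run_engine`, its parameters and the message wording are hypothetical, and the real CLI is an OCaml program built with cmdliner.

```python
import sys
from typing import Callable, TextIO


def run_engine(
    engine: str,
    source: str,
    scanner: Callable[[str], list],
    walker: Callable[[str], list],
    err: TextIO = sys.stderr,
) -> list:
    """Run the requested engine; on any walker failure, warn and use the scanner."""
    if engine == "walker":
        try:
            return walker(source)
        except Exception as exc:  # grammar missing, CLI not on PATH, parse failure...
            # One-line reason on stderr, then fall through to the scanner.
            print(f"walker unavailable ({exc}); falling back to scanner", file=err)
    return scanner(source)
```

Keeping the scanner as the unconditional last step is what guarantees the caller always gets a findings list back, whichever engine was requested.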

Tests
-----

`test/test_walker.ml` adds two end-to-end tests under a new
`res-to-affine-walker` alcotest run. Both auto-skip if the
`tree-sitter` CLI is missing or the vendored grammar has not been
built, so `dune runtest` on a fresh clone (no bootstrap) still
passes. The repo-root discovery walks up from cwd looking for
`dune-project` to find the source-tree grammar path; the dune sandbox
otherwise hides it.
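
The repo-root discovery is a standard walk up the directory tree. A minimal Python sketch of the idea follows (the test itself is OCaml; `find_repo_root` is a hypothetical name used only here):

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) containing dune-project."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / "dune-project").is_file():
            return candidate
    return None  # caller can treat this as "skip the test"
```

With the root in hand, the source-tree grammar path would be `root / "tools" / "vendor" / "tree-sitter-rescript"`, matching the default location named in the CLI section.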

CI
--

The existing `build` job now installs `tree-sitter-cli` (npm, fast)
and runs `install.sh` before `dune runtest`, so the walker tests
actually execute rather than auto-skip. The migration-assistant
gate added in #321 stays — it remains the dedicated first-signal
job for grammar-pin drift.

Scope discipline
----------------

`raw-js`, `untyped-exception`, `mutable-global` remain
scanner-only; their AST counterparts land in Phase 2c. The walker
exposes a stable surface (`Walker.scan : grammar_dir:string ->
path:string -> source:string -> Scanner.finding list`) that 2c
extends rather than re-architects.

Stack: #321 (Phase 2a) → this PR (Phase 2b) → Phase 2c.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔍 Hypatia Security Scan

Findings: 42 issues detected

Severity Count
🔴 Critical 13
🟠 High 17
🟡 Medium 12

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Stray AI.a2ml in root -- use 0-AI-MANIFEST.a2ml only",
    "type": "banned",
    "file": "AI.a2ml",
    "action": "delete",
    "rule_module": "root_hygiene",
    "severity": "high"
  },
  {
    "reason": "Superseded by 0-AI-MANIFEST.a2ml",
    "type": "banned",
    "file": "AI.djot",
    "action": "delete",
    "rule_module": "root_hygiene",
    "severity": "high"
  },
  {
    "reason": "Issue in quality.yml",
    "type": "missing_workflow",
    "file": "quality.yml",
    "action": "create",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Issue in security-policy.yml",
    "type": "missing_workflow",
    "file": "security-policy.yml",
    "action": "create",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Action actions/checkout@v4 needs attention",
    "type": "unpinned_action",
    "file": "publish-jsr.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action denoland/setup-deno@v2 needs attention",
    "type": "unpinned_action",
    "file": "publish-jsr.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/affinescript/affinescript/affinescript-deno-test/example/smoke_driver.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/affinescript/affinescript/affinescript-deno-test/cli.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/affinescript/affinescript/affinescript-deno-test/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence
