Skip to content

Prove full completeness of the TextMate generator #51

@johnsoncodehk

Description

@johnsoncodehk

Summary

Prove the full completeness of src/gen-tm.ts: for every grammar expressible through the finite public combinators in src/api.ts, the generated TextMate grammar should recognize and scope every construct that the source grammar requires the TextMate layer to represent, subject only to explicitly proven limitations of the TextMate execution model.

This is broader than a soundness ledger. The target claim is completeness of the derivation from the DSL grammar to the emitted TextMate grammar.

Target Claim

For any grammar G built through the public API combinators and lowered through defineGrammar() into the closed RuleExpr algebra, every grammar-derived highlighting obligation that is representable in TextMate is emitted by generateTmLanguage(G) and is reachable in the generated repository/root patterns.

Equivalently:

  • no supported API combinator shape is silently ignored by gen-tm.ts;
  • no representable grammar-derived token/region/scope obligation is omitted from the emitted TextMate grammar;
  • every detector/emitter path has a completeness theorem, not only a soundness theorem;
  • any remaining gap must be justified by a proved TextMate-model limitation, not by an implementation shortcut.

Work Items

  1. Define the formal completeness statement for src/gen-tm.ts.

    • Specify the universe: grammars built only from src/api.ts public combinators.
    • Specify the target: TextMate-representable highlighting obligations.
    • Separate parser completeness, highlighter completeness, and TextMate engine expressiveness.
  2. Formalize the finite combinator algebra.

    • Prove API lowering closure from src/api.ts to the RuleExpr union in src/types.ts.
    • Define which RuleExpr constructors are transparent, zero-width, parser-only, highlighter-relevant, or TextMate-unrepresentable.
  3. Define the highlighting-obligation semantics.

    • For each token, rule shape, markup construct, interpolation, raw-text embed, indentation region, JSX construct, generic/cast disambiguator, regex literal, and contextual keyword family, define what TextMate output is required.
    • Make the obligation set finite and enumerable from a grammar.
  4. Prove detector completeness.

    • For each detector in src/gen-tm.ts, prove that every grammar shape requiring the corresponding TextMate construct is detected.
    • Start with small detectors, then cover the high-risk families: JSX/TSX, angle brackets, regex literals, markup/raw-text embeds, YAML indentation, declarations, contextual operators/modifiers, and expression-only embeds.
  5. Prove emitter completeness.

    • For each emitted repository entry/root pattern family, prove that the generated TextMate patterns cover every detected obligation.
    • Prove repository reachability: every required emitted entry is included from the root or from another reachable entry.
  6. Prove TextMate stack/line completeness under stated assumptions.

    • Model begin/end, while, captures, includes, $self, and repository recursion well enough to state what coverage the emitted grammar guarantees.
    • For limitations of line-local regex, Oniguruma recursion depth, or missing semantic state, either prove a workaround covers the obligation or record a precise impossibility lemma.
  7. Add a proof ledger.

    • Track every proof obligation by function family.
    • Include status, theorem statement, proof sketch or mechanized check, and test witness.
    • The ledger should make a future completeness percentage meaningful by using a fixed denominator.
  8. Connect proofs to tests.

    • Add witness tests for each completed lemma.
    • Ensure existing gates (npm run check, npm run ledger:check) remain empirical validation, not substitutes for the proof.

Non-goals

  • Do not replace existing README corpus metrics with this proof. The README metrics measure empirical agreement with external oracles; this issue concerns formal derivation completeness of gen-tm.ts.
  • Do not weaken the target to soundness-only proof. Soundness lemmas are useful prerequisites, but the issue is complete coverage of all TextMate-representable obligations.

Deliverable

A proof document and ledger, likely SOUNDNESS.md or COMPLETENESS.md, that states and tracks the full completeness proof for src/gen-tm.ts, plus any supporting tests or generated obligation reports needed to keep the proof auditable.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions