This project is positioned at the intersection of:
- multi-frontend compilation into a shared core
- syntax extensibility and macro-style surface design
- controlled natural-language-inspired frontend design
It does not claim novelty for the general idea of many languages mapping to one core. The contribution is a coordinated multilingual frontend family with grammar-sensitive normalization targeting one formal core.
Established systems already map multiple source languages to shared internal representations. Examples include LLVM-style IR ecosystems, GCC's internal representations, and, in language theory, core calculi such as the lambda calculus and System F.
Alignment with this project:
- many-to-one mapping from surface language to shared core
- separation of frontend concerns from backend/codegen
Difference in focus:
- this project coordinates multiple natural-language-inspired frontend variants within one family, rather than unrelated general-purpose languages.
Work on pluggable syntax and macro systems demonstrates that one semantic core
can support many surface forms (for example Racket #lang, macro-centric
language extension work, and syntax-skin style ideas).
In Racket specifically, #lang selects a language that controls module parsing
at both reader and expander levels, and supports packaging/installing new
languages. This is a strong precedent for architecture where language identity
is explicit at the module boundary while implementation is shared.
Alignment with this project:
- core-first architecture
- syntactic variation layered on stable semantics
Difference in focus:
- this project emphasizes multilingual frontends and data-driven language onboarding, not a general macro metaprogramming framework.
Community feedback correctly highlights that once syntax approaches natural language, the hard problems shift from classic parsing to linguistic ambiguity, morphology, and intent extraction.
Project stance:
- explicitly controlled subsets per language (CNL-style)
- declarative and test-backed normalization rules
- deterministic parsing and forward compilation
- no claim of full natural-language understanding
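The stance above can be illustrated with a minimal sketch of a declarative normalization table. Everything in it is an assumption made for illustration: the `RULES` table, the language codes, the surface keywords, and the `normalize` function are hypothetical and not taken from the project. The point is only that each (language, token) pair maps to exactly one core keyword, and anything outside the controlled subset is rejected rather than guessed.

```python
# Hypothetical rule table: (language, surface token) -> canonical core keyword.
# Languages and tokens are illustrative, not the project's actual data.
RULES = {
    ("en", "if"): "IF",
    ("en", "print"): "PRINT",
    ("fr", "si"): "IF",
    ("fr", "afficher"): "PRINT",
    ("es", "si"): "IF",
    ("es", "imprimir"): "PRINT",
}


class NormalizationError(ValueError):
    """Raised when a token is outside the controlled subset."""


def normalize(lang: str, tokens: list[str]) -> list[str]:
    """Map surface keywords to core keywords; reject unknown words."""
    out = []
    for tok in tokens:
        # Integer and string literals pass through unchanged.
        if tok.isdigit() or tok.startswith('"'):
            out.append(tok)
            continue
        key = (lang, tok.lower())
        if key not in RULES:
            # Deterministic behaviour: no fuzzy matching, no intent guessing.
            raise NormalizationError(
                f"{tok!r} is not in the controlled subset for {lang!r}"
            )
        out.append(RULES[key])
    return out
```

Because the rules are plain data, onboarding a language in this sketch means adding rows and tests, not parser code. A word from another language's subset, such as `normalize("en", ["afficher", "42"])`, raises `NormalizationError` instead of being interpreted.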
The architecture draws the boundary as:
- Parsing/frontends: CS_lang -> CoreAST
- Core typing boundary: CoreAST -> CoreIRProgram
- Codegen/runtime: CoreIRProgram -> Python -> execution
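The three boundaries can be sketched as three small functions over a toy core that supports a single print statement. Only the stage names `CoreAST` and `CoreIRProgram` come from the text above. The field layouts, the keyword table, and the functions `parse`, `lower`, `emit_python`, and `run` are assumptions made for this sketch, not the project's implementation.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreAST:
    """Untyped core node (toy core: print one literal)."""
    op: str
    arg: str


@dataclass(frozen=True)
class CoreIRProgram:
    """Checked form: the argument has been validated as an integer."""
    op: str
    arg: int


def parse(lang: str, source: str) -> CoreAST:
    # Frontend stage: surface text -> CoreAST (hypothetical keyword table).
    keywords = {"en": "print", "fr": "afficher"}
    word, _, rest = source.strip().partition(" ")
    if word != keywords.get(lang):
        raise SyntaxError(f"unexpected token {word!r} for language {lang!r}")
    return CoreAST(op="PRINT", arg=rest.strip())


def lower(ast: CoreAST) -> CoreIRProgram:
    # Core typing boundary: reject anything that is not an integer literal.
    if not ast.arg.isdigit():
        raise TypeError(f"expected integer literal, got {ast.arg!r}")
    return CoreIRProgram(op=ast.op, arg=int(ast.arg))


def emit_python(ir: CoreIRProgram) -> str:
    # Codegen stage: CoreIRProgram -> Python source text.
    return f"print({ir.arg})"


def run(lang: str, source: str) -> str:
    # Runtime stage: execute the generated Python and capture its stdout.
    py = emit_python(lower(parse(lang, source)))
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(py, {})
    return buf.getvalue()
```

In this sketch only `parse` knows about surface languages. `lower` and `emit_python` see the same input whether the program was written as `print 7` or `afficher 7`, which is the separation the boundary list describes.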
The project deliberately does not promise lossless round-tripping from core to original surface syntax.
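A small sketch shows why round-tripping is lossy once normalization is many-to-one. The alias tables and the functions `to_core` and `render` are hypothetical, and the project may not ship a core-to-surface renderer at all. The sketch assumes two English surface spellings that collapse to one core keyword.

```python
# Hypothetical alias tables: several surface spellings per core keyword.
ALIASES = {
    "en": {"print": "PRINT", "show": "PRINT"},
    "fr": {"afficher": "PRINT"},
}

# A renderer has to pick one canonical spelling per language.
CANONICAL = {
    "en": {"PRINT": "print"},
    "fr": {"PRINT": "afficher"},
}


def to_core(lang: str, source: str) -> tuple[str, str]:
    """Normalize 'keyword argument' into a (core keyword, argument) pair."""
    word, _, rest = source.strip().partition(" ")
    return (ALIASES[lang][word], rest.strip())


def render(lang: str, core: tuple[str, str]) -> str:
    """Render a core pair back to a canonical surface form."""
    op, arg = core
    return f"{CANONICAL[lang][op]} {arg}"
```

Here `to_core("en", "show 42")` and `to_core("en", "print 42")` produce the same core value, so `render` can only return the canonical `print 42`. The original choice of `show` is not stored anywhere in the core and cannot be recovered.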