This document defines the concrete implementation path for migrating native.js from a direct TypeScript-to-C compiler to a multi-phase architecture with an IR and multiple backends.
TypeScript Source
↓
Frontend (TypeScript → Typed AST)
↓
HIR (High-level IR)
↓
MIR (Mid-level IR with explicit memory)
↓
Backend (C or C++)
↓
Target Runtime
packages/
├── nativejs-compiler # Existing - stays as entry point
├── nativejs-ir # NEW: IR types and transformations
├── nativejs-backend-c # NEW: C code generation
├── nativejs-backend-cpp # NEW: C++ code generation (future)
├── nativejs-runtime-core # NEW: Target-independent runtime
├── nativejs-runtime-c # NEW: C runtime implementation
├── nativejs-runtime-cpp # NEW: C++ runtime implementation (future)
├── nativejs-preset-standard # Existing - migrates to new backend contract
├── nativejs-preset-arduino # Existing - migrates to new backend contract
└── nativejs-preset-rp2040 # Existing - first to use new backend
- Define IR type system
- Create new packages
- Build minimal end-to-end path
- Create packages/nativejs-ir:
  - src/types.ts: Core IR types (HIR, MIR)
  - src/hir.ts: High-level IR definitions
  - src/mir.ts: Mid-level IR definitions
  - src/lower.ts: HIR → MIR lowering
- Core IR Types (initial):
// HIR: High-level IR (TypeScript-like)
export interface HIRProgram {
  functions: HIRFunction[];
  globals: HIRGlobal[];
}

export interface HIRFunction {
  name: string;
  params: HIRParam[];
  returnType: Type;
  body: HIRStatement[];
}

export type HIRStatement =
  | { kind: 'var_decl'; name: string; type: Type; init?: HIRExpr }
  | { kind: 'assign'; target: HIRTarget; value: HIRExpr }
  | { kind: 'if'; condition: HIRExpr; then: HIRStatement[]; else?: HIRStatement[] }
  | { kind: 'while'; condition: HIRExpr; body: HIRStatement[] }
  | { kind: 'return'; value?: HIRExpr }
  | { kind: 'expr_stmt'; expr: HIRExpr };

export type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'string'; value: string }
  | { kind: 'bool'; value: boolean }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: string; left: HIRExpr; right: HIRExpr }
  | { kind: 'call'; func: string; args: HIRExpr[] }
  | { kind: 'array_new'; elementType: Type; size?: HIRExpr }
  | { kind: 'array_get'; arr: HIRExpr; index: HIRExpr }
  | { kind: 'array_set'; arr: HIRExpr; index: HIRExpr; value: HIRExpr }
  | { kind: 'struct_new'; type: string; fields: Record<string, HIRExpr> }
  | { kind: 'struct_get'; obj: HIRExpr; field: string }
  | { kind: 'struct_set'; obj: HIRExpr; field: string; value: HIRExpr };

// Types
export type Type =
  | { kind: 'void' }
  | { kind: 'number' }
  | { kind: 'string' }
  | { kind: 'bool' }
  | { kind: 'array'; element: Type }
  | { kind: 'struct'; name: string };
- Create packages/nativejs-ir/src/frontend.ts
  - Convert TS AST → HIR
  - Handle: numbers, strings, bools, variables, binary ops, functions
- Create packages/nativejs-ir/src/lower.ts
  - HIR → MIR lowering
  - Convert high-level constructs to explicit memory operations
- Create packages/nativejs-backend-c
  - src/generator.ts: MIR → C code
  - Handle basic types and control flow
- Integration test:
  - Input: function add(a: number, b: number): number { return a + b; }
  - Output: Valid C that compiles and runs
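The lowering and generation steps listed above could be exercised with something as small as the following. This is a minimal sketch, not the project's actual IR: the plan does not define MIR, so the `MIROperand` and `MIRInstr` shapes, the helper names (`lowerExpr`, `lowerReturn`, `emitC`), and the `t0, t1, ...` temporary naming are all assumptions made for illustration. Mapping `number` to `int16_t` mirrors the expectation in this plan's C backend test.

```typescript
// Subset of the plan's HIRExpr union: only what this sketch needs.
type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: string; left: HIRExpr; right: HIRExpr };

// Hypothetical MIR: flat three-address instructions (not defined in the plan).
type MIROperand =
  | { kind: 'const'; value: number }
  | { kind: 'local'; name: string };
type MIRInstr =
  | { kind: 'binop'; dest: string; op: string; left: MIROperand; right: MIROperand }
  | { kind: 'ret'; value: MIROperand };

// Lower one expression, appending instructions to `out`.
// Returns the operand that holds the expression's value.
function lowerExpr(e: HIRExpr, out: MIRInstr[], fresh: () => string): MIROperand {
  switch (e.kind) {
    case 'number':
      return { kind: 'const', value: e.value };
    case 'var':
      return { kind: 'local', name: e.name };
    case 'binary': {
      const left = lowerExpr(e.left, out, fresh);
      const right = lowerExpr(e.right, out, fresh);
      const dest = fresh();
      out.push({ kind: 'binop', dest, op: e.op, left, right });
      return { kind: 'local', name: dest };
    }
  }
}

// Lower `return <value>;` into a flat instruction list.
function lowerReturn(value: HIRExpr): MIRInstr[] {
  const out: MIRInstr[] = [];
  let counter = 0;
  const fresh = () => `t${counter++}`;
  const result = lowerExpr(value, out, fresh);
  out.push({ kind: 'ret', value: result });
  return out;
}

function operandToC(o: MIROperand): string {
  return o.kind === 'const' ? String(o.value) : o.name;
}

// Emit one C statement per MIR instruction (assumes number -> int16_t).
function emitC(instrs: MIRInstr[]): string[] {
  return instrs.map((i) =>
    i.kind === 'binop'
      ? `int16_t ${i.dest} = ${operandToC(i.left)} ${i.op} ${operandToC(i.right)};`
      : `return ${operandToC(i.value)};`
  );
}

// Body of `function add(a, b) { return a + b; }`
const body = emitC(
  lowerReturn({
    kind: 'binary',
    op: '+',
    left: { kind: 'var', name: 'a' },
    right: { kind: 'var', name: 'b' },
  })
);
```

For `return a + b;` this yields the two statements `int16_t t0 = a + b;` and `return t0;`. Keeping the lowered form flat is one way to make the later "explicit memory" steps easier to add, since each instruction has a single destination.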
- Complete essential TypeScript features
- Side-by-side comparison with legacy compiler
- Variables and Types (Week 3)
  - let/const
  - number, string, boolean
  - Type annotations
  - Type inference
- Control Flow (Week 3)
  - if/else
  - while loops
  - for loops
  - break/continue
- Functions (Week 4)
  - Function declarations
  - Function expressions
  - Arrow functions
  - Parameters and return types
  - Recursion
- Arrays (Week 5)
  - Array literals
  - Dynamic arrays (array_new)
  - push/pop
  - Index access
  - Length
- Objects/Structs (Week 6)
  - Object literals
  - Property access
  - Methods
  - Nested objects
For each feature:
- Unit test: TS → HIR → MIR snapshots
- Backend test: MIR → C golden output
- Integration test: Compile → Run → Verify output
- Comparison test: Legacy vs new output (should match behavior)
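The comparison test in the list above could be driven by a small harness like the one below. It is a sketch under assumptions: `CompileAndRun`, `Mismatch`, and `compareBehavior` are hypothetical names, and both compilers are passed in as functions so that nothing here depends on the real package APIs.

```typescript
// Hypothetical signature: compile a TypeScript source, run the binary, return its stdout.
type CompileAndRun = (source: string) => Promise<string>;

interface Mismatch {
  source: string;
  legacy: string;
  next: string;
}

// Run every source through both compilers and collect behavioral differences.
// Output is compared after trimming so a trailing newline does not count as a mismatch.
async function compareBehavior(
  sources: string[],
  legacy: CompileAndRun,
  next: CompileAndRun
): Promise<Mismatch[]> {
  const mismatches: Mismatch[] = [];
  for (const source of sources) {
    const [legacyOut, nextOut] = await Promise.all([legacy(source), next(source)]);
    if (legacyOut.trim() !== nextOut.trim()) {
      mismatches.push({ source, legacy: legacyOut, next: nextOut });
    }
  }
  return mismatches;
}
```

Comparing program output rather than generated C keeps the check focused on behavior, which matters here because the two compilers are expected to emit different code.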
- Migrate standard preset
- Support console.log, basic math, strings
// HIR intrinsics (target-independent)
export interface HIRIntrinsics {
  // Console
  console_log: (values: HIRExpr[]) => void;
  // Math
  math_floor: (x: HIRExpr) => HIRExpr;
  math_ceil: (x: HIRExpr) => HIRExpr;
  math_abs: (x: HIRExpr) => HIRExpr;
  math_min: (a: HIRExpr, b: HIRExpr) => HIRExpr;
  math_max: (a: HIRExpr, b: HIRExpr) => HIRExpr;
  // Strings
  string_concat: (a: HIRExpr, b: HIRExpr) => HIRExpr;
  string_length: (s: HIRExpr) => HIRExpr;
}
Create packages/nativejs-runtime-c:
- src/console.c: printf-based console output
- src/array.c: Dynamic array implementation
- src/string.c: String operations
- include/: Headers
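One possible way for the frontend to produce these intrinsics is a lookup from the source-level call (`Math.floor`, `console.log`) to the intrinsic name. The sketch below assumes intrinsics are represented as ordinary HIR `call` nodes whose `func` is the intrinsic name; the plan does not say how they are encoded, and `INTRINSIC_NAMES` and `lowerMethodCall` are hypothetical.

```typescript
// Subset of the plan's HIRExpr union: only what this sketch needs.
type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'call'; func: string; args: HIRExpr[] };

// Assumed lookup from "Object.method" in source to the names listed in HIRIntrinsics.
const INTRINSIC_NAMES = new Map<string, string>([
  ['console.log', 'console_log'],
  ['Math.floor', 'math_floor'],
  ['Math.ceil', 'math_ceil'],
  ['Math.abs', 'math_abs'],
  ['Math.min', 'math_min'],
  ['Math.max', 'math_max'],
]);

// Turn a recognized method call into a target-independent intrinsic call node.
// Anything not in the table is rejected instead of being passed through silently.
function lowerMethodCall(object: string, method: string, args: HIRExpr[]): HIRExpr {
  const func = INTRINSIC_NAMES.get(`${object}.${method}`);
  if (func === undefined) {
    throw new Error(`Unsupported call: ${object}.${method}`);
  }
  return { kind: 'call', func, args };
}

const floorCall = lowerMethodCall('Math', 'floor', [{ kind: 'number', value: 2.5 }]);
```

Failing loudly on unknown calls fits the incremental approach of this plan: unsupported library surface shows up as a compile error rather than as wrong C.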
- Migrate RP2040 preset to new architecture
- Demonstrate target-specific intrinsics
// RP2040 GPIO (target-specific HIR)
export interface RP2040Intrinsics {
  gpio_init: (pin: HIRExpr, mode: HIRExpr) => void;
  gpio_write: (pin: HIRExpr, value: HIRExpr) => void;
  gpio_read: (pin: HIRExpr) => HIRExpr;
  sleep_ms: (ms: HIRExpr) => void;
  millis: () => HIRExpr;
}
- Keep preset structure (headers, plugins, examples)
- Change plugin contract:
  - OLD: Plugin directly emits C via CodeTemplate
  - NEW: Plugin generates HIR intrinsics, backend maps to target C
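The backend half of the new contract might look like a per-target table from intrinsic name to C text. This is a sketch only: `IntrinsicCall`, `IntrinsicEmitter`, `rp2040Emitters`, and `emitIntrinsic` are hypothetical, and the choice of the Pico SDK calls `gpio_put`, `gpio_get`, and `sleep_ms` as targets is an assumption, since the plan does not name the C functions the RP2040 backend emits.

```typescript
// Hypothetical intrinsic call shape; `args` are already rendered as C expressions.
interface IntrinsicCall {
  name: string;
  args: string[];
}

type IntrinsicEmitter = (args: string[]) => string;

// Assumed mapping for an RP2040 backend. Only three of the plan's intrinsics are shown.
const rp2040Emitters = new Map<string, IntrinsicEmitter>([
  ['gpio_write', ([pin, value]) => `gpio_put(${pin}, ${value})`],
  ['gpio_read', ([pin]) => `gpio_get(${pin})`],
  ['sleep_ms', ([ms]) => `sleep_ms(${ms})`],
]);

// The plugin only names the intrinsic; the backend decides what C to emit for it.
function emitIntrinsic(call: IntrinsicCall, table: Map<string, IntrinsicEmitter>): string {
  const emit = table.get(call.name);
  if (emit === undefined) {
    throw new Error(`Backend has no mapping for intrinsic '${call.name}'`);
  }
  return emit(call.args);
}

const ledOn = emitIntrinsic({ name: 'gpio_write', args: ['25', '1'] }, rp2040Emitters);
```

With this split, a second backend for the same target would supply its own table and the plugin would stay unchanged.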
- Add C++ backend
- Leverage RAII and STL for cleaner output
- RAII Memory Management
  - Smart pointers for heap allocation
  - Automatic destructors
  - No manual malloc/free in generated code
- STL Containers
  - std::vector for dynamic arrays
  - std::string for strings
  - std::array for fixed arrays
- References
  - Pass by reference for efficiency
  - Avoid pointer arithmetic where possible
- Templates
  - Generic functions when type-safe
  - Type-safe containers
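The STL and pass-by-reference points above could translate into a type-mapping function in the C++ backend along these lines. This is a sketch with assumptions: `cppType` and `cppParam` are hypothetical helpers, `number` is mapped to `int16_t` only to mirror this plan's C backend test, and fixed-size `std::array` is left out because the `Type` union carries no length.

```typescript
// The Type union as defined in the IR section of this plan.
type Type =
  | { kind: 'void' }
  | { kind: 'number' }
  | { kind: 'string' }
  | { kind: 'bool' }
  | { kind: 'array'; element: Type }
  | { kind: 'struct'; name: string };

// Map an IR type to a C++ type name.
function cppType(t: Type): string {
  switch (t.kind) {
    case 'void':
      return 'void';
    case 'number':
      return 'int16_t'; // assumption, see lead-in
    case 'string':
      return 'std::string';
    case 'bool':
      return 'bool';
    case 'array':
      return `std::vector<${cppType(t.element)}>`;
    case 'struct':
      return t.name;
  }
}

// Render a function parameter: scalars by value, everything else by const reference.
function cppParam(name: string, t: Type): string {
  if (t.kind === 'void') {
    throw new Error(`Parameter '${name}' cannot have type void`);
  }
  const byValue = t.kind === 'number' || t.kind === 'bool';
  return byValue ? `${cppType(t)} ${name}` : `const ${cppType(t)}& ${name}`;
}

const param = cppParam('xs', { kind: 'array', element: { kind: 'number' } });
```

Passing containers by `const` reference is the conservative default; a parameter that the function mutates would need a non-const reference, which this sketch does not model.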
- IR types defined
- Basic TS → HIR → C pipeline working
- Can compile: numbers, variables, binary ops, functions
- One integration test passes
- All primitive types
- Control flow (if/while/for)
- Functions complete
- 80% of existing tests pass
- Arrays with all operations
- Objects/structs
- Standard library (console, math)
- Standard preset migrated
- 95% test parity
- Arduino preset migrated
- RP2040 preset migrated
- New backend contract proven
- 100% feature parity
- C++ backend implemented
- RAII memory management
- STL containers
- RP2040 uses C++ backend
- Arduino keeps C backend
- LLVM backend (optional)
- WASM backend (optional)
- Source maps
- Optimization passes
- Debugger support
// Test HIR generation
describe('Frontend', () => {
  it('generates HIR for simple function', () => {
    const ts = 'function add(a: number, b: number): number { return a + b; }';
    const hir = compileToHIR(ts);
    expect(hir).toMatchSnapshot();
  });
});

// Test C generation
describe('C Backend', () => {
  it('generates correct C for function', () => {
    const mir = parseMIR(/* ... */);
    const c = generateC(mir);
    expect(c).toContain('int16_t add(int16_t a, int16_t b)');
  });
});

// End-to-end
describe('E2E', () => {
  it('compiles and runs', async () => {
    const ts = 'console.log(1 + 2);';
    const output = await compileAndRun(ts);
    expect(output).toBe('3');
  });
});
Mitigation: Keep legacy compiler as nativejs-compiler-legacy package. Use feature flag to switch. Only remove after 100% parity.
Mitigation: Start simple. Don't over-design SSA. HIR/MIR separation is enough for v1. Add SSA later if needed.
Mitigation: Benchmark early. Compare output size and execution speed. Optimization passes come after correctness.
Do:
- Start with the smallest possible feature slice
- Make IR serializable (JSON) for debugging
- Keep the old compiler working during migration
- Profile before optimizing
- Document every IR instruction
- Use TypeScript strict mode
- Write tests for every phase

Don't:
- Delete the old compiler early
- Over-design SSA on day 1
- Try to support all TypeScript features immediately
- Use strings for IR values (use structured objects)
- Make IR too low-level too early
- Skip testing lower phases
- Ignore binary size on embedded targets
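Because the HIR nodes in this plan are plain objects with string `kind` tags, the JSON-serializable IR suggested above needs very little code. The sketch below uses a reduced copy of the HIR types; the `HIRParam` shape and the `serializeHIR` / `deserializeHIR` names are assumptions, and the deserializer does no validation.

```typescript
// Reduced copies of the plan's types. HIRParam's shape is assumed, since the plan
// references it without defining it, and `globals` is omitted for brevity.
type Type = { kind: 'void' } | { kind: 'number' };
type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: string; left: HIRExpr; right: HIRExpr };
type HIRStatement = { kind: 'return'; value?: HIRExpr };
interface HIRParam { name: string; type: Type }
interface HIRFunction { name: string; params: HIRParam[]; returnType: Type; body: HIRStatement[] }
interface HIRProgram { functions: HIRFunction[] }

// Pretty-printed so IR dumps are readable and diff well in snapshots.
function serializeHIR(program: HIRProgram): string {
  return JSON.stringify(program, null, 2);
}

// Trusts the input; a real loader would validate the shape before casting.
function deserializeHIR(text: string): HIRProgram {
  return JSON.parse(text) as HIRProgram;
}

const program: HIRProgram = {
  functions: [
    {
      name: 'add',
      params: [
        { name: 'a', type: { kind: 'number' } },
        { name: 'b', type: { kind: 'number' } },
      ],
      returnType: { kind: 'number' },
      body: [
        {
          kind: 'return',
          value: {
            kind: 'binary',
            op: '+',
            left: { kind: 'var', name: 'a' },
            right: { kind: 'var', name: 'b' },
          },
        },
      ],
    },
  ],
};

const roundTripped = deserializeHIR(serializeHIR(program));
```

This only works while IR nodes stay free of class instances, functions, and cyclic references, which is one practical reason to prefer structured plain objects for IR values.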
Create the packages and implement:
// Input
function add(a: number, b: number): number {
  return a + b;
}
console.log(add(1, 2));
This program should compile and run end-to-end through the new architecture. If it works, the foundation is solid.
- Correctness: All existing tests pass
- Performance: No regression in output size or speed
- Maintainability: Clear phase separation, well-tested
- Extensibility: New backend can be added in <1 week
- Usability: Same or better developer experience
This plan provides a concrete, incremental path from the current architecture to a multi-phase compiler with IR and multiple backends. The key is small steps, continuous testing, and maintaining the legacy compiler until the new one is proven.
Ready to begin implementation.