Native.js Compiler Implementation Plan

Executive Summary

This document defines the concrete implementation path for migrating native.js from a direct TypeScript-to-C compiler to a multi-phase architecture with an intermediate representation (IR) and multiple backends.

Architecture

TypeScript Source
    ↓
Frontend (TypeScript → Typed AST)
    ↓
HIR (High-level IR)
    ↓
MIR (Mid-level IR with explicit memory)
    ↓
Backend (C or C++)
    ↓
Target Runtime
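
As a rough sketch of how these stages might compose, each arrow can be modeled as a plain function, with the backend as the only swappable step. Every type and function name below is a placeholder invented for illustration; the real IR shapes are defined in Phase 1.

```typescript
// Hypothetical placeholder types; the real IR shapes are defined in Phase 1.
interface TypedAst { source: string }
interface Hir { stage: 'hir'; from: TypedAst }
interface Mir { stage: 'mir'; from: Hir }

// A backend is anything that turns MIR into target source text.
interface Backend {
  name: 'c' | 'cpp';
  emit(mir: Mir): string;
}

// Stub stages that only record where their input came from.
const parse = (source: string): TypedAst => ({ source });
const toHir = (ast: TypedAst): Hir => ({ stage: 'hir', from: ast });
const toMir = (hir: Hir): Mir => ({ stage: 'mir', from: hir });

// The frontend and IR stages are shared; only the last step varies per target.
function compile(source: string, backend: Backend): string {
  return backend.emit(toMir(toHir(parse(source))));
}

// Stub backend used only to show the wiring.
const stubC: Backend = {
  name: 'c',
  emit: (mir) => `/* C for: ${mir.from.from.source} */`,
};
```

Keeping the backend behind a small interface like this is one way to make "add a new backend" a local change rather than a change to the pipeline.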

Package Structure

packages/
├── nativejs-compiler          # Existing - stays as entry point
├── nativejs-ir                # NEW: IR types and transformations
├── nativejs-backend-c         # NEW: C code generation
├── nativejs-backend-cpp       # NEW: C++ code generation (future)
├── nativejs-runtime-core      # NEW: Target-independent runtime
├── nativejs-runtime-c         # NEW: C runtime implementation
├── nativejs-runtime-cpp       # NEW: C++ runtime implementation (future)
├── nativejs-preset-standard   # Existing - migrates to new backend contract
├── nativejs-preset-arduino    # Existing - migrates to new backend contract
└── nativejs-preset-rp2040     # Existing - first to use new backend

Phase 1: Foundation (Weeks 1-2)

Goals

  • Define IR type system
  • Create new packages
  • Build minimal end-to-end path

Tasks

Week 1: IR Types and Package Setup

  1. Create packages/nativejs-ir

    • src/types.ts: Core IR types (HIR, MIR)
    • src/hir.ts: High-level IR definitions
    • src/mir.ts: Mid-level IR definitions
    • src/lower.ts: HIR → MIR lowering
  2. Core IR Types (initial):

// HIR: High-level IR (TypeScript-like)
export interface HIRProgram {
  functions: HIRFunction[];
  globals: HIRGlobal[];
}

export interface HIRFunction {
  name: string;
  params: HIRParam[];
  returnType: Type;
  body: HIRStatement[];
}

export type HIRStatement =
  | { kind: 'var_decl'; name: string; type: Type; init?: HIRExpr }
  | { kind: 'assign'; target: HIRTarget; value: HIRExpr }
  | { kind: 'if'; condition: HIRExpr; then: HIRStatement[]; else?: HIRStatement[] }
  | { kind: 'while'; condition: HIRExpr; body: HIRStatement[] }
  | { kind: 'return'; value?: HIRExpr }
  | { kind: 'expr_stmt'; expr: HIRExpr };

export type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'string'; value: string }
  | { kind: 'bool'; value: boolean }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: string; left: HIRExpr; right: HIRExpr }
  | { kind: 'call'; func: string; args: HIRExpr[] }
  | { kind: 'array_new'; elementType: Type; size?: HIRExpr }
  | { kind: 'array_get'; arr: HIRExpr; index: HIRExpr }
  | { kind: 'array_set'; arr: HIRExpr; index: HIRExpr; value: HIRExpr }
  | { kind: 'struct_new'; type: string; fields: Record<string, HIRExpr> }
  | { kind: 'struct_get'; obj: HIRExpr; field: string }
  | { kind: 'struct_set'; obj: HIRExpr; field: string; value: HIRExpr };

// Types
export type Type =
  | { kind: 'void' }
  | { kind: 'number' }
  | { kind: 'string' }
  | { kind: 'bool' }
  | { kind: 'array'; element: Type }
  | { kind: 'struct'; name: string };
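
To make the shapes above concrete, here is how the `add` function used throughout this plan might look as an HIR value. The block repeats a reduced subset of the types so it stands alone. `HIRParam` is not defined in this plan, so the name/type pair used here is an assumption.

```typescript
// Reduced subset of the IR types above, repeated so the example stands alone.
type Type = { kind: 'void' } | { kind: 'number' };

type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: string; left: HIRExpr; right: HIRExpr };

type HIRStatement = { kind: 'return'; value?: HIRExpr };

// Assumption: HIRParam is not defined in the plan; a name/type pair is one plausible shape.
interface HIRParam { name: string; type: Type }

interface HIRFunction {
  name: string;
  params: HIRParam[];
  returnType: Type;
  body: HIRStatement[];
}

const num: Type = { kind: 'number' };

// HIR for: function add(a: number, b: number): number { return a + b; }
const addFn: HIRFunction = {
  name: 'add',
  params: [{ name: 'a', type: num }, { name: 'b', type: num }],
  returnType: num,
  body: [
    {
      kind: 'return',
      value: {
        kind: 'binary',
        op: '+',
        left: { kind: 'var', name: 'a' },
        right: { kind: 'var', name: 'b' },
      },
    },
  ],
};
```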

Week 2: Frontend and First Backend

  1. Create packages/nativejs-ir/src/frontend.ts

    • Convert TS AST → HIR
    • Handle: numbers, strings, bools, variables, binary ops, functions
  2. Implement packages/nativejs-ir/src/lower.ts (file created in Week 1)

    • HIR → MIR lowering
    • Convert high-level constructs to explicit memory operations
  3. Create packages/nativejs-backend-c

    • src/generator.ts: MIR → C code
    • Handle basic types and control flow
  4. Integration test:

    • Input: function add(a: number, b: number): number { return a + b; }
    • Output: Valid C that compiles and runs
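
The backend step for this integration test could start as small as the sketch below. Because MIR is not yet specified in this plan, the input shape here is a reduced, hypothetical stand-in. The `int16_t` mapping for `number` mirrors the expectation in the backend test later in this document and may differ per target.

```typescript
// Reduced, hypothetical input shape: the plan leaves MIR undefined at this point.
type Expr =
  | { kind: 'number'; value: number }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: '+' | '-' | '*'; left: Expr; right: Expr };

interface Fn {
  name: string;
  params: string[]; // all parameters are numbers in this sketch
  result: Expr;     // the body is a single `return <expr>;`
}

// Assumption: `number` maps to int16_t, as in the backend test in this plan.
const C_NUMBER = 'int16_t';

function emitExpr(e: Expr): string {
  switch (e.kind) {
    case 'number':
      return String(e.value);
    case 'var':
      return e.name;
    case 'binary':
      // Parenthesize every binary node so operator precedence never needs tracking.
      return `(${emitExpr(e.left)} ${e.op} ${emitExpr(e.right)})`;
  }
}

function emitFunction(fn: Fn): string {
  const params = fn.params.map((p) => `${C_NUMBER} ${p}`).join(', ');
  return `${C_NUMBER} ${fn.name}(${params}) {\n  return ${emitExpr(fn.result)};\n}`;
}

const addC = emitFunction({
  name: 'add',
  params: ['a', 'b'],
  result: {
    kind: 'binary',
    op: '+',
    left: { kind: 'var', name: 'a' },
    right: { kind: 'var', name: 'b' },
  },
});
```

Always parenthesizing binary expressions produces noisier C, but it removes a whole class of precedence bugs while the backend is young.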

Phase 2: Core Language (Weeks 3-6)

Goals

  • Complete essential TypeScript features
  • Side-by-side comparison with legacy compiler

Features

  1. Variables and Types (Week 3)

    • let/const
    • number, string, boolean
    • Type annotations
    • Type inference
  2. Control Flow (Week 3)

    • if/else
    • while loops
    • for loops
    • break/continue
  3. Functions (Week 4)

    • Function declarations
    • Function expressions
    • Arrow functions
    • Parameters and return types
    • Recursion
  4. Arrays (Week 5)

    • Array literals
    • Dynamic arrays (array_new)
    • push/pop
    • Index access
    • Length
  5. Objects/Structs (Week 6)

    • Object literals
    • Property access
    • Methods
    • Nested objects
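
The initial HIR statement set has `while` but no `for`, so one option for Week 3 is to desugar `for` loops in the frontend instead of adding a new statement kind. The sketch below assumes that approach. `ForLoop` and `desugarFor` are hypothetical names, and the statement type is a reduced, generic stand-in for `HIRStatement`.

```typescript
// Reduced stand-in for HIRStatement, generic over the expression type E.
type Stmt<E> =
  | { kind: 'expr_stmt'; expr: E }
  | { kind: 'while'; condition: E; body: Stmt<E>[] };

// Hypothetical frontend-only node: it never reaches HIR.
interface ForLoop<E> {
  init: Stmt<E>[];
  condition: E;
  update: Stmt<E>[];
  body: Stmt<E>[];
}

// for (init; condition; update) body
//   becomes
// init; while (condition) { body; update }
function desugarFor<E>(loop: ForLoop<E>): Stmt<E>[] {
  return [
    ...loop.init,
    {
      kind: 'while',
      condition: loop.condition,
      body: [...loop.body, ...loop.update],
    },
  ];
}
```

This simple rewrite is only correct for loop bodies without `continue`: a `continue` would jump past the appended update statements. It also lets the `init` variables leak into the enclosing scope. Both points would need handling once break/continue land.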

Testing Strategy

For each feature:

  1. Unit test: TS → HIR → MIR snapshots
  2. Backend test: MIR → C golden output
  3. Integration test: Compile → Run → Verify output
  4. Comparison test: Legacy vs new compiler output (runtime behavior should match, even where the generated C differs)
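
The comparison test in step 4 could be driven by a small harness like the one below. This is a sketch only: the `Runner` signature is hypothetical, and real runners would likely be asynchronous because they invoke a C toolchain.

```typescript
// Hypothetical runner signature: compile a TS snippet, run it, return its stdout.
type Runner = (source: string) => string;

interface Mismatch {
  source: string;
  legacy: string;
  next: string;
}

// Runs every case through both compilers and collects behavioral differences.
function compareBehavior(cases: string[], legacy: Runner, next: Runner): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const source of cases) {
    const legacyOut = legacy(source);
    const nextOut = next(source);
    // Compare program output, not generated C: the two compilers may emit different code.
    if (legacyOut !== nextOut) {
      mismatches.push({ source, legacy: legacyOut, next: nextOut });
    }
  }
  return mismatches;
}
```

Reporting the full list of mismatches, rather than failing on the first one, gives a running parity percentage that can be tracked against the milestones.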

Phase 3: Standard Library and Presets (Weeks 7-10)

Goals

  • Migrate standard preset
  • Support console.log, basic math, strings

Standard Library Design

// HIR intrinsics (target-independent)
export interface HIRIntrinsics {
  // Console
  console_log: (values: HIRExpr[]) => void;
  
  // Math
  math_floor: (x: HIRExpr) => HIRExpr;
  math_ceil: (x: HIRExpr) => HIRExpr;
  math_abs: (x: HIRExpr) => HIRExpr;
  math_min: (a: HIRExpr, b: HIRExpr) => HIRExpr;
  math_max: (a: HIRExpr, b: HIRExpr) => HIRExpr;
  
  // Strings
  string_concat: (a: HIRExpr, b: HIRExpr) => HIRExpr;
  string_length: (s: HIRExpr) => HIRExpr;
}
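
On the backend side, these target-independent intrinsics need a mapping to runtime symbols. The sketch below assumes a simple lookup table. The `njs_` prefixed C names are invented for illustration and do not refer to an existing runtime API, and arguments are passed as already-emitted C text for brevity.

```typescript
// Intrinsic names taken from the HIRIntrinsics interface above.
type IntrinsicName =
  | 'console_log'
  | 'math_floor'
  | 'math_ceil'
  | 'math_abs'
  | 'math_min'
  | 'math_max'
  | 'string_concat'
  | 'string_length';

// Hypothetical C runtime symbols; the `njs_` prefix is an assumption, not an existing API.
const C_RUNTIME: Record<IntrinsicName, string> = {
  console_log: 'njs_console_log',
  math_floor: 'njs_math_floor',
  math_ceil: 'njs_math_ceil',
  math_abs: 'njs_math_abs',
  math_min: 'njs_math_min',
  math_max: 'njs_math_max',
  string_concat: 'njs_string_concat',
  string_length: 'njs_string_length',
};

// `args` holds argument expressions that have already been emitted as C text.
function emitIntrinsicCall(name: IntrinsicName, args: string[]): string {
  return `${C_RUNTIME[name]}(${args.join(', ')})`;
}
```

Typing the table as `Record<IntrinsicName, string>` makes the compiler flag any intrinsic that is added to the union without a runtime mapping.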

Runtime Implementation

Create packages/nativejs-runtime-c:

  • src/console.c: printf-based console output
  • src/array.c: Dynamic array implementation
  • src/string.c: String operations
  • include/: Headers

Phase 4: Target Presets (Weeks 11-14)

Goals

  • Migrate RP2040 preset to new architecture
  • Demonstrate target-specific intrinsics

Target Intrinsics

// RP2040 GPIO (target-specific HIR)
export interface RP2040Intrinsics {
  gpio_init: (pin: HIRExpr, mode: HIRExpr) => void;
  gpio_write: (pin: HIRExpr, value: HIRExpr) => void;
  gpio_read: (pin: HIRExpr) => HIRExpr;
  sleep_ms: (ms: HIRExpr) => void;
  millis: () => HIRExpr;
}

Preset Migration

  1. Keep preset structure (headers, plugins, examples)
  2. Change plugin contract:
    • OLD: Plugin directly emits C via CodeTemplate
    • NEW: Plugin generates HIR intrinsics, backend maps to target C
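
The difference between the two contracts can be sketched as follows: the plugin returns a structured intrinsic node, and a per-target table owned by the backend decides how it is spelled in C. The `Plugin` interface, the node shape, and the source-level method names `digitalWrite` and `delay` are assumptions made for this example. The target tables are illustrative and use common Pico SDK and Arduino function names.

```typescript
type Target = 'rp2040' | 'arduino';

// An intrinsic call as a structured node (hypothetical shape).
// `args` holds already-emitted C text for brevity; real nodes would carry HIRExpr values.
interface IntrinsicCall {
  kind: 'intrinsic';
  name: string;
  args: string[];
}

// NEW contract: a plugin describes what to call, never how it is spelled in C.
interface Plugin {
  lower(method: string, args: string[]): IntrinsicCall | undefined;
}

const gpioPlugin: Plugin = {
  lower(method, args) {
    if (method === 'digitalWrite') return { kind: 'intrinsic', name: 'gpio_write', args };
    if (method === 'delay') return { kind: 'intrinsic', name: 'sleep_ms', args };
    return undefined; // not handled by this plugin
  },
};

// Per-target symbol tables owned by the backend (illustrative, not exhaustive).
const TARGETS: Record<Target, Record<string, string>> = {
  rp2040: { gpio_write: 'gpio_put', sleep_ms: 'sleep_ms' },
  arduino: { gpio_write: 'digitalWrite', sleep_ms: 'delay' },
};

function emitForTarget(call: IntrinsicCall, target: Target): string {
  const symbol = TARGETS[target][call.name];
  if (symbol === undefined) {
    throw new Error(`no mapping for ${call.name} on ${target}`);
  }
  return `${symbol}(${call.args.join(', ')});`;
}
```

With this split, the same plugin output can serve both targets, which is the property the old `CodeTemplate` contract could not offer.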

Phase 5: C++ Backend (Weeks 15-20)

Goals

  • Add C++ backend
  • Leverage RAII and STL for cleaner output

C++ Backend Features

  1. RAII Memory Management

    • Smart pointers for heap allocation
    • Automatic destructors
    • No manual malloc/free in generated code
  2. STL Containers

    • std::vector for dynamic arrays
    • std::string for strings
    • std::array for fixed arrays
  3. References

    • Pass by reference for efficiency
    • Avoid pointer arithmetic where possible
  4. Templates

    • Generic functions when type-safe
    • Type-safe containers
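
One place where these features become visible is the mapping from IR types to C++ types. The sketch below is a possible starting point and not a settled design. Mapping `number` to `double` and structs to `std::shared_ptr` are assumptions, and `cppType` and `cppParamType` are hypothetical helper names.

```typescript
// Same Type union as in Phase 1, repeated so the example stands alone.
type Type =
  | { kind: 'void' }
  | { kind: 'number' }
  | { kind: 'string' }
  | { kind: 'bool' }
  | { kind: 'array'; element: Type }
  | { kind: 'struct'; name: string };

// Maps an IR type to a C++ type for locals, fields, and return values.
function cppType(t: Type): string {
  switch (t.kind) {
    case 'void':
      return 'void';
    case 'number':
      return 'double'; // assumption: embedded targets may prefer a fixed-width integer
    case 'string':
      return 'std::string';
    case 'bool':
      return 'bool';
    case 'array':
      return `std::vector<${cppType(t.element)}>`;
    case 'struct':
      return `std::shared_ptr<${t.name}>`; // assumption: shared ownership for objects
  }
}

// Scalars are passed by value; everything else by const reference.
function cppParamType(t: Type): string {
  if (t.kind === 'void') {
    throw new Error('void is not a valid parameter type');
  }
  const base = cppType(t);
  return t.kind === 'number' || t.kind === 'bool' ? base : `const ${base}&`;
}
```

For example, `cppParamType({ kind: 'array', element: { kind: 'number' } })` yields `const std::vector<double>&`, which covers both the STL container and pass-by-reference points in one signature.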

Implementation Milestones

Week 1-2: MVP

  • IR types defined
  • Basic TS → HIR → MIR → C pipeline working
  • Can compile: numbers, variables, binary ops, functions
  • One integration test passes

Month 1 (Weeks 3-4): Core Language

  • All primitive types
  • Control flow (if/while/for)
  • Functions complete
  • 80% of existing tests pass

Month 2 (Weeks 5-8): Data Structures

  • Arrays with all operations
  • Objects/structs
  • Standard library (console, math)
  • Standard preset migrated
  • 95% test parity

Month 3 (Weeks 9-12): Target Presets

  • Arduino preset migrated
  • RP2040 preset migrated
  • New backend contract proven
  • 100% feature parity

Month 4-5 (Weeks 13-20): C++ Backend

  • C++ backend implemented
  • RAII memory management
  • STL containers
  • RP2040 uses C++ backend
  • Arduino keeps C backend

Month 6+ (Week 21+): Advanced Features

  • LLVM backend (optional)
  • WASM backend (optional)
  • Source maps
  • Optimization passes
  • Debugger support

Testing Strategy

Unit Tests

// Test HIR generation
describe('Frontend', () => {
  it('generates HIR for simple function', () => {
    const ts = 'function add(a: number, b: number): number { return a + b; }';
    const hir = compileToHIR(ts);
    expect(hir).toMatchSnapshot();
  });
});

Backend Tests

// Test C generation
describe('C Backend', () => {
  it('generates correct C for function', () => {
    const mir = parseMIR(/* ... */);
    const c = generateC(mir);
    expect(c).toContain('int16_t add(int16_t a, int16_t b)');
  });
});

Integration Tests

// End-to-end
describe('E2E', () => {
  it('compiles and runs', async () => {
    const ts = 'console.log(1 + 2);';
    const output = await compileAndRun(ts);
    expect(output.trim()).toBe('3'); // trim the trailing newline from console output
  });
});

Risk Mitigation

Risk: Migration takes too long

Mitigation: Keep legacy compiler as nativejs-compiler-legacy package. Use feature flag to switch. Only remove after 100% parity.
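
The feature flag could be as small as the selector sketched below. The flag name `NATIVEJS_COMPILER` and the values `legacy` and `ir` are hypothetical. The environment is passed in as a plain object so the function stays testable.

```typescript
type CompilerImpl = 'legacy' | 'ir';

// Hypothetical flag: an explicit option wins, then the environment, then the default.
function selectCompiler(
  option: string | undefined,
  env: Record<string, string | undefined>,
): CompilerImpl {
  // Default to the legacy compiler until the new one reaches parity.
  const raw = option ?? env['NATIVEJS_COMPILER'] ?? 'legacy';
  if (raw === 'legacy' || raw === 'ir') {
    return raw;
  }
  throw new Error(`unknown compiler "${raw}"`);
}
```

Rejecting unknown values, instead of silently falling back, avoids comparing test runs that were both produced by the legacy compiler by accident.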

Risk: IR design doesn't fit

Mitigation: Start simple. Don't over-design SSA. HIR/MIR separation is enough for v1. Add SSA later if needed.

Risk: Performance regression

Mitigation: Benchmark early. Compare output size and execution speed. Optimization passes come after correctness.

Do's and Don'ts

DO

  • Start with smallest possible feature slice
  • Make IR serializable (JSON) for debugging
  • Keep old compiler working during migration
  • Profile before optimizing
  • Document every IR instruction
  • Use TypeScript strict mode
  • Write tests for every phase

DON'T

  • Delete old compiler early
  • Over-design SSA on day 1
  • Try to support all TypeScript features immediately
  • Use strings for IR values (use structured objects)
  • Make IR too low-level too early
  • Skip testing lower phases
  • Ignore binary size on embedded targets
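
Two of the points above reinforce each other: if IR values are structured objects rather than strings, JSON serialization comes almost for free. A minimal sketch, using a reduced subset of `HIRExpr` and the hypothetical helpers `dumpIR` and `loadIR`:

```typescript
// Reduced subset of HIRExpr, repeated so the example stands alone.
type HIRExpr =
  | { kind: 'number'; value: number }
  | { kind: 'var'; name: string }
  | { kind: 'binary'; op: string; left: HIRExpr; right: HIRExpr };

// Pretty-printed JSON is easy to diff and to paste into bug reports.
function dumpIR(expr: HIRExpr): string {
  return JSON.stringify(expr, null, 2);
}

// No validation here: the cast trusts the input, so this is for debugging only.
function loadIR(text: string): HIRExpr {
  return JSON.parse(text) as HIRExpr;
}

// HIR for the expression: a + 2
const sum: HIRExpr = {
  kind: 'binary',
  op: '+',
  left: { kind: 'var', name: 'a' },
  right: { kind: 'number', value: 2 },
};

const roundTripped = loadIR(dumpIR(sum));
```

The round trip only holds while IR nodes stay plain data: no class instances, `Map`s, or cycles. Note also that JSON has no encoding for `NaN` or `Infinity` (`JSON.stringify` writes them as `null`), so number literals with those values would need special handling.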

First Task: This Week

Create the packages and implement:

// Input
function add(a: number, b: number): number {
  return a + b;
}
console.log(add(1, 2));

End-to-end through new architecture. If this works, the foundation is solid.

Success Criteria

  1. Correctness: All existing tests pass
  2. Performance: No regression in output size or speed
  3. Maintainability: Clear phase separation, well-tested
  4. Extensibility: New backend can be added in <1 week
  5. Usability: Same or better developer experience

Conclusion

This plan provides a concrete, incremental path from the current architecture to a multi-phase compiler with IR and multiple backends. The key is small steps, continuous testing, and maintaining the legacy compiler until the new one is proven.

Ready to begin implementation.