| 1 | +# Glojure Codegen System |
| 2 | + |
| 3 | +This document provides guidance for understanding and working with Glojure's ahead-of-time (AOT) code generation system. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The codegen package transforms Glojure AST nodes into Go source code, enabling ahead-of-time compilation. It is a work-in-progress alternative to the default tree-walking interpreter and may improve performance through static compilation. |
| 8 | + |
| 9 | +## Architecture |
| 10 | + |
| 11 | +### Compilation Pipeline |
| 12 | + |
| 13 | +``` |
| 14 | +Source (.glj) → Reader → S-expressions → Analyzer → AST → Codegen → Go Source → go build → Native Binary |
| 15 | +                                                     ↓ |
| 16 | +                                        Runtime Eval (default path) |
| 17 | +``` |
| 18 | + |
| 19 | +### Key Components |
| 20 | + |
| 21 | +- **Generator** (pkg/codegen/codegen.go:32-46): Main code generation engine |
| 22 | + - Manages variable scopes and recur contexts |
| 23 | + - Handles output buffering and Go code formatting |
| 24 | + |
| 25 | +- **AST Nodes** (pkg/ast/ast.go:17-158): 44 different operation types |
| 26 | + - Each node has an `Op` field determining its type |
| 27 | + - `Sub` field contains op-specific data structures |
| 28 | + |
| 29 | +- **Analyzer** (pkg/compiler/analyze.go): Creates AST from S-expressions |
| 30 | + - Performs macro expansion (pkg/compiler/analyze.go:87-122) |
| 31 | + - Manages lexical environments (pkg/compiler/analyze.go:32-51) |
| 32 | + - Dispatches to specialized analyzers (pkg/compiler/analyze.go:196-408) |
| 33 | + |
| 34 | +## Current Implementation Status |
| 35 | + |
| 36 | +### ✅ Supported Features |
| 37 | + |
| 38 | +| Feature | Implementation | Reference | |
| 39 | +|---------|----------------|-----------| |
| 40 | +| Constants | Numbers, strings, keywords, booleans, nil | codegen.go:383-385 | |
| 41 | +| Local Variables | Let bindings, function parameters | codegen.go:386-389 | |
| 42 | +| Namespace Vars | Var dereference and lookup | codegen.go:410-433 | |
| 43 | +| Functions | Single/multi-arity, variadic | codegen.go:258-331 | |
| 44 | +| Let/Loop | Including loop/recur | codegen.go:555-614 | |
| 45 | +| Recur | Tail recursion within loops | codegen.go:616-658 | |
| 46 | +| If/Else | Conditional expressions | codegen.go:479-503 | |
| 47 | +| Do Blocks | Sequential evaluation | codegen.go:462-477 | |
| 48 | +| Function Calls | Via lang.Apply | codegen.go:435-460 | |
| 49 | +| Collections | Vectors, Maps | codegen.go:215-256 | |
| 50 | + |
| 51 | +### ❌ Not Yet Implemented |
| 52 | + |
| 53 | +- Host interop (., .., new) |
| 54 | +- Try/catch/finally |
| 55 | +- Case expressions |
| 56 | +- Set literals |
| 57 | +- Metadata on functions |
| 58 | +- deftype/defprotocol |
| 59 | +- Lazy sequences |
| 60 | +- Transducers |
| 61 | + |
| 62 | +## Code Generation Process |
| 63 | + |
| 64 | +### 1. Namespace Generation (codegen.go:50-132) |
| 65 | + |
| 66 | +```go |
| 67 | +func (g *Generator) Generate(ns *lang.Namespace) error |
| 68 | +``` |
| 69 | + |
| 70 | +- Iterates through the namespace mappings |
| 71 | +- Generates an `init()` function containing the var definitions |
| 72 | +- Formats the output with `go fmt` |
| 73 | + |
| 74 | +### 2. Var Generation (codegen.go:136-170) |
| 75 | + |
| 76 | +Each var becomes: |
| 77 | +```go |
| 78 | +{ |
| 79 | + varSym := lang.NewSymbol("var-name") |
| 80 | + v := ns.InternWithValue(varSym, value, true) |
| 81 | + // metadata handling... |
| 82 | +} |
| 83 | +``` |
| 84 | + |
| 85 | +### 3. Value Generation (codegen.go:173-213) |
| 86 | + |
| 87 | +Recursively generates Go expressions for Clojure values: |
| 88 | +- Primitives: Direct Go literals |
| 89 | +- Collections: `lang.NewVector(...)`, `lang.NewMap(...)` |
| 90 | +- Functions: `lang.IFnFunc(func(args ...any) any { ... })` |
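The recursion described above can be sketched in a few lines. This is a minimal illustration, not the project's code: `genValue` is a hypothetical helper, plain Go values stand in for Clojure values, and only a handful of types are handled. The emitted text reuses the `lang.NewVector(...)` form from the list above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// genValue returns Go source text for v.
// Hypothetical helper for illustration; the real generator covers more types.
func genValue(v any) (string, error) {
	switch x := v.(type) {
	case nil:
		return "nil", nil
	case bool:
		return strconv.FormatBool(x), nil
	case int:
		return strconv.Itoa(x), nil
	case string:
		return strconv.Quote(x), nil
	case []any:
		// Collections recurse into their elements.
		parts := make([]string, len(x))
		for i, el := range x {
			s, err := genValue(el)
			if err != nil {
				return "", err
			}
			parts[i] = s
		}
		return "lang.NewVector(" + strings.Join(parts, ", ") + ")", nil
	default:
		return "", fmt.Errorf("unsupported value type %T", v)
	}
}

func main() {
	src, err := genValue([]any{1, "two", []any{true, nil}})
	if err != nil {
		panic(err)
	}
	fmt.Println(src) // lang.NewVector(1, "two", lang.NewVector(true, nil))
}
```

Returning an error for unknown types, rather than emitting a placeholder, keeps unsupported values from silently reaching the generated file.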
| 91 | + |
| 92 | +### 4. AST Node Generation (codegen.go:361-408) |
| 93 | + |
| 94 | +Dispatches on `node.Op` to specialized generators: |
| 95 | +- Control flow nodes generate Go control structures |
| 96 | +- Expression nodes generate Go expressions |
| 97 | +- Special forms have custom handling |
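To make the dispatch concrete, here is a small self-contained sketch. `Op`, `Node`, and `gen` are simplified stand-ins invented for this example (the real types live in pkg/ast and pkg/codegen); only a constant and an `if` node are handled. Each generator emits statements and returns the expression that holds the node's result.

```go
package main

import (
	"fmt"
	"strings"
)

// Op and Node are simplified stand-ins for the real AST types.
type Op int

const (
	OpConst Op = iota
	OpIf
)

type Node struct {
	Op               Op
	Value            string // Go literal text, used by OpConst
	Test, Then, Else *Node  // used by OpIf
}

// gen is a stand-in generator: an output buffer plus a name counter.
type gen struct {
	out  strings.Builder
	next int
}

func (g *gen) allocateVar() string {
	name := fmt.Sprintf("v%d", g.next)
	g.next++
	return name
}

// generate dispatches on n.Op, writes any statements to g.out, and
// returns the Go expression (r-value) that holds the node's result.
func (g *gen) generate(n *Node) string {
	switch n.Op {
	case OpConst:
		return n.Value // expression node: no statements needed
	case OpIf:
		test := g.generate(n.Test)
		result := g.allocateVar()
		fmt.Fprintf(&g.out, "var %s any\n", result)
		fmt.Fprintf(&g.out, "if lang.IsTruthy(%s) {\n", test)
		thenExpr := g.generate(n.Then)
		fmt.Fprintf(&g.out, "%s = %s\n", result, thenExpr)
		g.out.WriteString("} else {\n")
		elseExpr := g.generate(n.Else)
		fmt.Fprintf(&g.out, "%s = %s\n", result, elseExpr)
		g.out.WriteString("}\n")
		return result
	default:
		panic(fmt.Sprintf("unsupported op %d", n.Op))
	}
}

func main() {
	g := &gen{}
	node := &Node{
		Op:   OpIf,
		Test: &Node{Op: OpConst, Value: "true"},
		Then: &Node{Op: OpConst, Value: "1"},
		Else: &Node{Op: OpConst, Value: "2"},
	}
	result := g.generate(node)
	fmt.Print(g.out.String())
	fmt.Println("result is in", result)
}
```

The sketch leaves the emitted text unindented, on the assumption that a formatting pass runs over the complete output afterwards.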
| 98 | + |
| 99 | +## Variable Scope Management |
| 100 | + |
| 101 | +### Scope Stack (codegen.go:19-23, 696-741) |
| 102 | + |
| 103 | +```go |
| 104 | +type varScope struct { |
| 105 | + nextNum int // Counter for unique var names |
| 106 | + names map[string]string // Clojure name → Go var name |
| 107 | +} |
| 108 | +``` |
| 109 | + |
| 110 | +- Each let/fn/loop pushes a new scope |
| 111 | +- Variables are allocated as v0, v1, v2... |
| 112 | +- Each scope inherits its counter from its parent |
| 113 | + |
| 114 | +### Example Scoping |
| 115 | + |
| 116 | +```clojure |
| 117 | +(let [x 1] ; x → v0 |
| 118 | + (let [x 2 y 3] ; x → v1 (shadows), y → v2 |
| 119 | + (+ x y))) ; references v1, v2 |
| 120 | +``` |
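The scoping example can be reproduced with a small stack of `varScope` values. This is a sketch under assumptions: `scopeStack`, `push`, `pop`, `bind`, and `lookup` are names made up for illustration, and whether the real generator copies the counter back to the parent on pop is not stated in this document (this sketch does not).

```go
package main

import "fmt"

type varScope struct {
	nextNum int               // Counter for unique var names
	names   map[string]string // Clojure name → Go var name
}

// scopeStack is a hypothetical wrapper around a slice of scopes.
type scopeStack struct {
	scopes []*varScope
}

// push opens a scope whose counter starts where the parent's left off.
func (s *scopeStack) push() {
	next := 0
	if n := len(s.scopes); n > 0 {
		next = s.scopes[n-1].nextNum
	}
	s.scopes = append(s.scopes, &varScope{nextNum: next, names: map[string]string{}})
}

// pop discards the innermost scope.
func (s *scopeStack) pop() {
	s.scopes = s.scopes[:len(s.scopes)-1]
}

// bind allocates a fresh Go name for a Clojure name in the innermost scope.
func (s *scopeStack) bind(name string) string {
	top := s.scopes[len(s.scopes)-1]
	goName := fmt.Sprintf("v%d", top.nextNum)
	top.nextNum++
	top.names[name] = goName
	return goName
}

// lookup searches from the innermost scope outward, so inner bindings shadow outer ones.
func (s *scopeStack) lookup(name string) (string, bool) {
	for i := len(s.scopes) - 1; i >= 0; i-- {
		if goName, ok := s.scopes[i].names[name]; ok {
			return goName, true
		}
	}
	return "", false
}

func main() {
	var s scopeStack
	s.push()
	fmt.Println(s.bind("x")) // v0
	s.push()
	fmt.Println(s.bind("x"), s.bind("y")) // v1 v2
	inner, _ := s.lookup("x")
	s.pop()
	outer, _ := s.lookup("x")
	fmt.Println(inner, outer) // v1 v0
}
```

Because the inner scope starts its counter at the parent's value, a shadowing binding never reuses the Go name of the binding it shadows.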
| 121 | + |
| 122 | +## Loop/Recur Implementation |
| 123 | + |
| 124 | +### Recur Context (codegen.go:25-29) |
| 125 | + |
| 126 | +```go |
| 127 | +type recurContext struct { |
| 128 | + loopID *lang.Symbol // Matches recur to its loop |
| 129 | + bindings []string // Go variable names for rebinding |
| 130 | +} |
| 131 | +``` |
| 132 | + |
| 133 | +### Generated Pattern (codegen.go:589-614, 616-658) |
| 134 | + |
| 135 | +```go |
| 136 | +// (loop [x 0] ... (recur (inc x))) |
| 137 | +var v0 any = 0 |
| 138 | +for { |
| 139 | + // body... |
| 140 | + var recurTemp0 any = v0 + 1 // Evaluate recur args (schematic; real output calls inc via lang.Apply) |
| 141 | + v0 = recurTemp0 // Rebind |
| 142 | + continue // Loop |
| 143 | +} |
| 144 | +``` |
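A runnable version of the same shape may help show why the recur arguments go through temporaries. This is a hand-written sketch, not generator output: `sumTo` is a made-up example, and it uses type assertions and native operators where real generated code would call runtime functions.

```go
package main

import "fmt"

// sumTo mirrors:
//   (loop [i 0 acc 0]
//     (if (< i n) (recur (inc i) (+ acc i)) acc))
func sumTo(n int) any {
	var v0 any = 0 // i
	var v1 any = 0 // acc
	for {
		if v0.(int) < n {
			// Evaluate every recur argument before rebinding, so the
			// second argument still sees the old value of i.
			var recurTemp0 any = v0.(int) + 1
			var recurTemp1 any = v1.(int) + v0.(int)
			v0 = recurTemp0
			v1 = recurTemp1
			continue
		}
		return v1 // the non-recur branch exits the loop with its value
	}
}

func main() {
	fmt.Println(sumTo(5)) // 10
}
```

Assigning `v0` directly before computing the second argument would add the already incremented `i` to `acc`, which is why the rebinding happens only after all temporaries are set.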
| 145 | + |
| 146 | +## Testing Infrastructure |
| 147 | + |
| 148 | +### Test Harness (pkg/codegen/codegen_test.go) |
| 149 | + |
| 150 | +1. **Golden Files** (codegen_test.go:24-71): Compare generated output |
| 151 | + - Input: `testdata/*.glj` |
| 152 | + - Expected: `testdata/*.glj.expected` |
| 153 | + |
| 154 | +2. **Go Vet Validation** (codegen_test.go:207-223): Ensures the generated Go is valid and passes `go vet` |
| 155 | + |
| 156 | +3. **Behavioral Tests** (codegen_test.go:72-172): Run generated code |
| 157 | + - Compiles to temporary binary |
| 158 | + - Executes the `-main` function |
| 159 | + - Verifies output |
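The golden-file step follows a common Go testing pattern, which could look roughly like the sketch below. `checkGolden` and `runDemo` are hypothetical names for this example; the real harness in codegen_test.go may be structured differently, and the `update` parameter stands in for the `-update` test flag.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// checkGolden compares got with the golden file at path.
// When update is true it rewrites the golden file instead of comparing.
func checkGolden(path string, got []byte, update bool) error {
	if update {
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, want) {
		return fmt.Errorf("generated output differs from %s", path)
	}
	return nil
}

// runDemo writes a golden file in a temp dir, then checks a matching
// and a non-matching output against it.
func runDemo() (sameOK, diffCaught bool) {
	dir, err := os.MkdirTemp("", "golden")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.glj.expected")
	out := []byte("package main\n")
	if err := checkGolden(path, out, true); err != nil {
		panic(err)
	}
	sameOK = checkGolden(path, out, false) == nil
	diffCaught = checkGolden(path, []byte("changed\n"), false) != nil
	return sameOK, diffCaught
}

func main() {
	same, caught := runDemo()
	fmt.Println(same, caught) // true true
}
```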
| 160 | + |
| 161 | +### Running Tests |
| 162 | + |
| 163 | +```bash |
| 164 | +# Run all codegen tests |
| 165 | +go test ./pkg/codegen/... |
| 166 | + |
| 167 | +# Update golden files |
| 168 | +go test ./pkg/codegen/... -update |
| 169 | + |
| 170 | +# Verbose output with generated code |
| 171 | +go test ./pkg/codegen/... -v |
| 172 | +``` |
| 173 | + |
| 174 | +## Extending the Codegen |
| 175 | + |
| 176 | +### Adding New AST Node Support |
| 177 | + |
| 178 | +1. Add case in `generateASTNode()` (codegen.go:361-408) |
| 179 | +2. Implement generator function following pattern: |
| 180 | + ```go |
| 181 | + func (g *Generator) generateNewOp(node *ast.Node) string { |
| 182 | + newOpNode := node.Sub.(*ast.NewOpNode) |
| 183 | + // Generate Go code... |
| 184 | + resultVar := g.allocateVar("result") |
| 185 | + g.writef("...") |
| 186 | + return resultVar |
| 187 | + } |
| 188 | + ``` |
| 189 | +3. Add test case in `testdata/` |
| 190 | +4. Run tests with `-update` to create expected output |
| 191 | + |
| 192 | +### Common Patterns |
| 193 | + |
| 194 | +**R-values vs Statements**: Generators return variable names (r-values) and emit statements to `g.w`: |
| 195 | +```go |
| 196 | +testExpr := g.generateASTNode(node.Test) // Get r-value |
| 197 | +g.writef("if lang.IsTruthy(%s) {\n", testExpr) // Use in statement |
| 198 | +``` |
| 199 | + |
| 200 | +**Temporary Variables**: Use `allocateVar()` for unique names: |
| 201 | +```go |
| 202 | +tempVar := g.allocateVar("temp") |
| 203 | +g.writef("%s := complexExpression()\n", tempVar) |
| 204 | +``` |
| 205 | + |
| 206 | +**Scope Management**: Always push/pop for new lexical scopes: |
| 207 | +```go |
| 208 | +g.pushVarScope() |
| 209 | +defer g.popVarScope() |
| 210 | +``` |
| 211 | + |
| 212 | +## Debugging Tips |
| 213 | + |
| 214 | +1. **Examine Generated Code**: Tests output generated code on failure |
| 215 | +2. **Check AST Structure**: Use `fmt.Printf("%#v\n", node)` to inspect |
| 216 | +3. **Trace Execution**: Add logging to generator methods |
| 217 | +4. **Validate Manually**: Copy generated code to test file and run |
| 218 | + |
| 219 | +## Integration Points |
| 220 | + |
| 221 | +### Runtime Compatibility |
| 222 | + |
| 223 | +Generated code uses the same primitives as the runtime: |
| 224 | +- `lang.Apply()` for function calls (pkg/lang/ifn.go:8-25) |
| 225 | +- `lang.IsTruthy()` for conditionals (pkg/lang/truthy.go:3-18) |
| 226 | +- `lang.NewList/Vector/Map()` for collections (pkg/lang/collections.go) |
| 227 | + |
| 228 | +### Namespace System |
| 229 | + |
| 230 | +Generated code integrates with runtime namespaces: |
| 231 | +- `lang.FindOrCreateNamespace()` (pkg/lang/namespace.go:340-350) |
| 232 | +- `ns.InternWithValue()` (pkg/lang/namespace.go:112-125) |
| 233 | +- Vars are accessible from REPL after loading |
| 234 | + |
| 235 | +## Future Directions |
| 236 | + |
| 237 | +1. **Full AST Coverage**: Implement remaining node types |
| 238 | +2. **Optimization**: Dead code elimination, constant folding |
| 239 | +3. **Integration**: Add `glj compile` command for AOT compilation |
| 240 | +4. **Performance**: Benchmark against runtime interpreter |
| 241 | +5. **Debugging**: Source maps for generated code |
| 242 | + |
| 243 | +## Related Files |
| 244 | + |
| 245 | +- **AST Definition**: pkg/ast/ast.go |
| 246 | +- **Analyzer**: pkg/compiler/analyze.go |
| 247 | +- **Runtime Evaluator**: pkg/runtime/evalast.go (comparison reference) |
| 248 | +- **Test Data**: pkg/codegen/testdata/*.glj |
| 249 | +- **Language Primitives**: pkg/lang/*.go |