At first, I thought since I already have a stack VM implemented to run BASIC programs, how hard it is to translate it into web assembly code? Turns out, a lot harder than I expected.
Main challenges are two folds:
- My interpreter VM relies on rust implementations for array allocation and access,
wasmon the other hand is much lower level, requires everything to be expressed withf64.loadandf64.store wasmonly allows reducible control flow, and even a trivial can exhibit irreducible structures (jump inside a subroutine)
It took quite a bit of effort, as I was neither familiar with compilers nor web assembly details. Although after this project, I feel I have gained quite a lot understandings for web assembly, as it used almost all aspect of current wasm features (basic arithmetics, memory, tables, data segment, dynamic dispatch etc.).
At a high level, BASIC concepts are translated into:
- Variable:
wasmglobal variable (mutable), possibly demoted to local variables if possible - Subroutine:
wasmfunctions DEF: currently not supportedPRINT: api provided byjsaswasmfunction imports- list / table:
wasmmemory access, implemented using sixwasmfunction compiled from unsafe rust (signatures inrustsyntax, pointers are compiled toi32values)fn alloc1d(next_ptr: *mut *mut u8, size: i32) -> *mut Array<i32>fn alloc2d(next_ptr: *mut *mut u8, nrow: i32, ncol: i32) -> *mut Array<[i32, i32]>fn load1d(ptr: *mut Array<i32>, index: i32) -> f64fn load2d(ptr: *mut Array<[i32, i32]>, i: i32, j: i32) -> f64fn store1d(ptr: *mut Array<i32>, index: i32, value: f64)fn store2d(ptr: *mut Array<[i32, i32]>, i: i32, j: i32, value: f64)Array<Stride>is an opaque data type for tracking allocation metadata and converting indexes tof64pointers
- Requires array dimensions to be specified via
DIMbefore usage (different from BASIC spec where list defaults to size 10 and tables 10x10) - Using arrays is quite unsafe as of now, there are no bound checks, and using uninitialized array pointers (they are just
i32at runtime) is permitted; running in a browser environment with bad programs will either produce hard to debug logical errors orwasmruntime exceptions DATAstatements are compiled intowasmdatasegments and placed intowasmlinear memory location 0- A global variable is used to track how much data has been read (starts at the end of data section), decrement by 8 (sizeof
f64) for every read. When it reaches zero, awasmunreachabletrap is raised
- A global variable is used to track how much data has been read (starts at the end of data section), decrement by 8 (sizeof
- labels used in
PRINT(eg.PRINT "Hello World") are also added todatasegments - Next memory location after data and labels is used for book keeping by the array allocation machinery (far from a fully fledged memory allocator, as it does not perform
free)
The heavy lifting for wasm codegen is done by binaryen. Only the the frontend (ast) from main basic_rs crate is reused for the wasm compiler (I thought I could reuse the stack VM IR there...).
I have made a new form of IR closely matching semantics of binaryen IR, with these main features:
-
A notion of functions
- A single
mainfunction for the main body of the program - Subroutines are analyzed and extracted into separated functions
- Also models
DEFfunction and has the concept of function pointers for dynamic dispatching (I have not implementedwasmcodegen for these yet)
- A single
-
FORloops are compiled to jump and conditional jumps -
Statements are grouped into basic blocks, and forms a control flow graph (possibly reducible, some really bad ones are rejected here, but I did not spent to much efforts on control flow analysis part, and frankly the code is a bit too messy for myself to fully understand...)
This simple program for a Monte Carlo estimation of PI:
100 REM Monte Carlo PI
110 LET C = 0
120 LET N = 1000
230 FOR I = 1 TO N
240 LET X = RND(X)
250 LET Y = RND(X)
260 LET D = X * X + Y * Y
270 IF D > 1 THEN 290
280 LET C = C + 1
290 NEXT I
300 LET P = 4 * C / N
310 PRINT "PI =", P, C"/"N
320 ENDProduces:
(module
(type $1 (func (param f64)))
(type $2 (func (param i32 i32)))
(type $3 (func))
(type $4 (func (result f64)))
(memory $0 1 1)
(data (i32.const 0) "PI =/")
(import "env" "print" (func $print (param f64)))
(import "env" "printLabel" (func $printLabel (param i32 i32)))
(import "env" "printNewline" (func $printNewline))
(import "env" "printAdvance15" (func $printAdvance15))
(import "env" "rand" (func $rand (result f64)))
(global $data_end (mut i32) (i32.const 0))
(export "data" (memory $0))
(export "main" (func $main))
(func $main (; 5 ;) (; has Stack IR ;) (type $3)
(local $0 f64)
(local $1 f64)
(local $2 f64)
(set_global $data_end
(i32.const 0)
)
(i32.store
(i32.const 8)
(i32.const 16)
)
(set_local $1
(f64.const 1)
)
(block $block$6$break
(loop $shape$3$continue
(br_if $block$6$break
(f64.gt
(f64.sub
(get_local $1)
(f64.const 1e3)
)
(f64.const 0)
)
)
(if
(i32.eqz
(f64.gt
(f64.add
(f64.mul
(tee_local $2
(call $rand)
)
(get_local $2)
)
(f64.mul
(tee_local $2
(call $rand)
)
(get_local $2)
)
)
(f64.const 1)
)
)
(set_local $0
(f64.add
(get_local $0)
(f64.const 1)
)
)
)
(set_local $1
(f64.add
(get_local $1)
(f64.const 1)
)
)
(br $shape$3$continue)
)
)
(call $printLabel
(i32.const 0)
(i32.const 4)
)
(call $printAdvance15)
(call $print
(f64.div
(f64.mul
(f64.const 4)
(get_local $0)
)
(f64.const 1e3)
)
)
(call $printAdvance15)
(call $print
(get_local $0)
)
(call $printLabel
(i32.const 4)
(i32.const 1)
)
(call $print
(f64.const 1e3)
)
(call $printNewline)
)
)For a bigger example with subroutines and arrays, checkout ./exmaple directory.