This document provides detailed information about the three compilation targets supported by the Refal compiler: WebAssembly (WASM), eBPF, and LLVM.
WebAssembly is a binary instruction format designed as a portable compilation target for high-level languages such as C, C++, and Rust. It runs in web browsers and other host environments at near-native speed.
Our Refal to WebAssembly compiler generates WebAssembly Text Format (WAT) files, which can be converted to binary WASM files using standard WebAssembly tools. The implementation focuses on:
- Memory Model: WebAssembly uses a linear memory model, which we adapt to represent Refal's tree-like data structures.
- Pattern Matching: We implement Refal's pattern matching using WebAssembly's control flow constructs.
- Function Calls: Function calls are implemented using WebAssembly's function table.
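To make the memory-model point concrete, here is a minimal sketch of one way a nested Refal expression could be flattened into a linear byte buffer of fixed-size tagged cells. The cell layout, the tag values, and the `encode`/`decode` helpers are illustrative assumptions, not the compiler's actual representation.

```python
import struct

# Hypothetical cell layout (an assumption for illustration only):
# 1-byte tag + 4-byte little-endian payload, 5 bytes per cell.
TAG_SYMBOL = 0   # payload: code point of a character symbol
TAG_NUMBER = 1   # payload: unsigned 32-bit integer
TAG_OPEN = 2     # payload: index of the matching TAG_CLOSE cell
TAG_CLOSE = 3    # payload: index of the matching TAG_OPEN cell
CELL = struct.Struct("<BI")


def encode(expr):
    """Flatten a nested list (1-char strings, ints, sublists) into cells."""
    cells = []

    def walk(items):
        for item in items:
            if isinstance(item, list):
                open_idx = len(cells)
                cells.append([TAG_OPEN, 0])        # payload patched below
                walk(item)
                close_idx = len(cells)
                cells.append([TAG_CLOSE, open_idx])
                cells[open_idx][1] = close_idx     # link brackets both ways
            elif isinstance(item, str):
                cells.append([TAG_SYMBOL, ord(item)])
            else:
                cells.append([TAG_NUMBER, item])

    walk(expr)
    return b"".join(CELL.pack(tag, payload) for tag, payload in cells)


def decode(memory):
    """Rebuild the nested list from the linear buffer."""
    stack = [[]]
    for tag, payload in CELL.iter_unpack(memory):
        if tag == TAG_OPEN:
            stack.append([])
        elif tag == TAG_CLOSE:
            inner = stack.pop()
            stack[-1].append(inner)
        elif tag == TAG_SYMBOL:
            stack[-1].append(chr(payload))
        else:
            stack[-1].append(payload)
    return stack[0]


# The Refal expression  'a' (1 ('b')) 2  written as a nested list:
memory = encode(["a", [1, ["b"]], 2])
```

Storing the index of the partner bracket in each bracket cell lets a matcher skip over a whole parenthesized term in one step, which is one reason a flat layout can still serve tree-shaped data.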
The WebAssembly backend includes a small runtime library that provides:
- Memory management for Refal expressions
- Pattern matching operations
- Implementation of built-in functions
To compile a Refal program to WebAssembly:
refal -i input.ref -t wasm -o output.wat
The resulting WAT file can be converted to binary WASM using tools like wat2wasm:
wat2wasm output.wat -o output.wasm
WebAssembly modules can be executed in various environments:
- Web browsers using the WebAssembly JavaScript API
- Node.js using WebAssembly modules
- Standalone WebAssembly runtimes like Wasmtime or Wasmer
Known limitations:
- Some Refal features may have performance implications in WebAssembly
- The current implementation does not optimize for WebAssembly-specific features
- Browser security restrictions may limit certain operations
eBPF is a technology that allows running sandboxed programs in the Linux kernel. It was originally designed for packet filtering but has evolved into a general-purpose execution environment for various use cases, including networking, security, and observability.
Our Refal to eBPF compiler generates eBPF bytecode that can be loaded into the Linux kernel or used with eBPF virtual machines. The implementation addresses several challenges:
- Limited Instruction Set: eBPF has a restricted instruction set, requiring creative implementation of Refal's pattern matching.
- Memory Constraints: eBPF programs have limited memory access, requiring careful management of Refal's data structures.
- Verification Requirements: eBPF programs must pass a verifier, which imposes additional constraints on the generated code.
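The verifier constraint is easiest to see in the shape of a matching loop. The sketch below, written in Python purely to show the control flow, matches the pattern `e.1 s e.2` (find the first occurrence of a given symbol) with a constant iteration cap and an explicit bounds check before each access. `MAX_SCAN` and `find_symbol_bounded` are hypothetical names, and the real backend emits eBPF bytecode rather than anything resembling this code.

```python
MAX_SCAN = 64  # hypothetical compile-time bound on loop iterations


def find_symbol_bounded(symbols, target):
    """Match the pattern  e.1 target e.2  over a flat symbol sequence.

    Returns (e1, e2) with the shortest possible e.1, or None when no
    match is found within MAX_SCAN steps. The loop runs at most
    MAX_SCAN times regardless of the input, which is the kind of
    bound a verifier can check.
    """
    for i in range(MAX_SCAN):        # constant upper bound
        if i >= len(symbols):        # bounds check before every access
            return None
        if symbols[i] == target:
            return symbols[:i], symbols[i + 1:]
    return None                      # input exceeds the bound: give up


result = find_symbol_bounded("abcabc", "c")
```

The trade-off is visible in the last line of the function: an input longer than the bound is rejected even if it contains a match, which is one way a valid Refal program can lose behavior when adapted to eBPF.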
The eBPF backend includes a minimal runtime that provides:
- Memory management within eBPF constraints
- Pattern matching operations adapted to eBPF's instruction set
- Implementation of a subset of Refal's built-in functions
To compile a Refal program to eBPF:
refal -i input.ref -t ebpf -o output.ebpf
The resulting eBPF bytecode can be loaded into the kernel using tools like bpftool:
bpftool prog load output.ebpf /sys/fs/bpf/refal_program
eBPF programs can be executed in various contexts:
- Attached to kernel hooks (e.g., network interfaces, syscalls)
- Loaded into user-space eBPF virtual machines
- Used with frameworks like BCC or libbpf
Known limitations:
- eBPF has strict limitations on program size and complexity
- Not all Refal features may be efficiently implementable in eBPF
- The verifier may reject some valid programs due to its conservative analysis
LLVM is a collection of modular and reusable compiler and toolchain technologies. The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs.
Our Refal to LLVM compiler generates LLVM Intermediate Representation (IR), which can be compiled to native code for various architectures. The implementation leverages LLVM's capabilities:
- Pattern Matching: We implement Refal's pattern matching using LLVM's control flow constructs and optimization passes.
- Memory Management: We use LLVM's memory operations for efficient representation of Refal's data structures.
- Optimization: We leverage LLVM's optimization passes to improve the performance of the generated code.
The LLVM backend includes a comprehensive runtime library that provides:
- Memory management with garbage collection
- Efficient pattern matching operations
- Implementation of all Refal built-in functions
To compile a Refal program to LLVM IR:
refal -i input.ref -t llvm -o output.ll
The resulting LLVM IR can be compiled to native code using llc and a C compiler:
llc output.ll -o output.s
gcc output.s -o output
LLVM-generated code can be executed as native binaries on the target platform.
The LLVM backend supports multiple optimization levels:
- None: No optimizations, useful for debugging
- Default: Standard optimizations for good performance
- Aggressive: Maximum optimizations for best performance
To specify an optimization level:
refal -i input.ref -t llvm -l aggressive -o output.ll
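The three LLVM steps above can be chained from a small driver script. The following sketch only assembles the command lines, using the flags shown in this document; the helper name `llvm_build_commands` and the convention of deriving the `.ll` and `.s` file names from the output name are assumptions made for the example.

```python
import shlex


def llvm_build_commands(source, output, level=None):
    """Return the refal, llc, and gcc invocations as argument lists."""
    ir_file = output + ".ll"
    asm_file = output + ".s"
    refal_cmd = ["refal", "-i", source, "-t", "llvm"]
    if level is not None:
        refal_cmd += ["-l", level]   # optimization level, e.g. "aggressive"
    refal_cmd += ["-o", ir_file]
    return [
        refal_cmd,
        ["llc", ir_file, "-o", asm_file],
        ["gcc", asm_file, "-o", output],
    ]


for cmd in llvm_build_commands("input.ref", "output", level="aggressive"):
    print(shlex.join(cmd))
# refal -i input.ref -t llvm -l aggressive -o output.ll
# llc output.ll -o output.s
# gcc output.s -o output
```

To actually run the build, each list can be passed to `subprocess.run(cmd, check=True)`, assuming `refal`, `llc`, and `gcc` are on the `PATH`.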
Known limitations:
- Some Refal features may not map efficiently to certain target architectures
- Debugging information may be limited compared to native Refal implementations
| Feature | WebAssembly | eBPF | LLVM |
|---|---|---|---|
| Performance | Good | Limited | Excellent |
| Portability | Excellent | Linux-only | Good |
| Sandboxing | Yes | Yes | No |
| Memory Management | Manual | Restricted | Flexible |
| Optimization | Limited | Minimal | Extensive |
| Debugging | Limited | Difficult | Good |
| Use Cases | Web, edge computing | Kernel, networking | General-purpose |
For LLVM targets, you can cross-compile to different architectures:
refal -i input.ref -t llvm -o output.ll --target=aarch64-linux-gnu
You can configure runtime behavior for specific targets:
refal -i input.ref -t wasm -o output.wat --wasm-memory-pages=10
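When picking a value for `--wasm-memory-pages`, it helps to remember that a WebAssembly page is 64 KiB. The short calculation below converts between pages and bytes; whether the flag sets the initial or the maximum memory size is not stated here, so treat the result as a sizing aid only. The helper names are made up for this example.

```python
WASM_PAGE_BYTES = 64 * 1024  # page size fixed by the WebAssembly specification


def pages_to_bytes(pages):
    """Size in bytes of a linear memory with the given page count."""
    return pages * WASM_PAGE_BYTES


def pages_needed(byte_count):
    """Smallest number of pages that holds byte_count bytes."""
    return -(-byte_count // WASM_PAGE_BYTES)  # ceiling division


print(pages_to_bytes(10))        # 655360
print(pages_needed(1_000_000))   # 16
```

So the example value of 10 pages corresponds to 640 KiB of linear memory.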
Generated code can interoperate with other languages:
- WebAssembly can be called from JavaScript
- LLVM-generated code can be linked with C/C++ libraries
- eBPF programs can interact with kernel data structures
Troubleshooting:
- Compilation Errors: Check your Refal syntax and ensure all functions are defined
- Runtime Errors: Ensure pattern matching is exhaustive
- Performance Issues: Try different optimization levels or compilation targets
- WebAssembly: Check browser console for errors
- eBPF: Use `bpftool prog tracelog` for debugging
- LLVM: Examine generated IR for unexpected constructs