Compilation Targets Guide

This document provides detailed information about the three compilation targets supported by the Refal compiler: WebAssembly (WASM), eBPF, and LLVM.

WebAssembly (WASM)

Overview

WebAssembly is a binary instruction format for a stack-based virtual machine, designed as a portable compilation target for high-level languages such as C, C++, and Rust. It runs in web browsers and in standalone runtimes, with near-native performance.

Implementation Details

Our Refal to WebAssembly compiler generates WebAssembly Text Format (WAT) files, which can be converted to binary WASM files using standard WebAssembly tools. The implementation focuses on:

  1. Memory Model: WebAssembly uses a linear memory model, which we adapt to represent Refal's tree-like data structures.
  2. Pattern Matching: We implement Refal's pattern matching using WebAssembly's control flow constructs.
  3. Function Calls: Function calls are implemented using WebAssembly's function table.
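The memory-model point can be made concrete with a small sketch. The layout below is hypothetical and chosen only for illustration (the cell tags, the two-word cell format, and the function names are assumptions, not the compiler's documented representation). It flattens a nested Refal expression into fixed-size cells and links each bracket to its partner, so the contents of a bracketed term can be skipped in one step inside linear memory.

```python
import struct

# Hypothetical cell tags; the compiler's real layout is not documented here.
TAG_SYMBOL, TAG_OPEN, TAG_CLOSE = 0, 1, 2

def flatten(expr, cells=None):
    """Append [tag, payload] cells for a nested-list expression."""
    if cells is None:
        cells = []
    for term in expr:
        if isinstance(term, list):
            open_idx = len(cells)
            cells.append([TAG_OPEN, 0])          # payload patched below
            flatten(term, cells)
            close_idx = len(cells)
            cells.append([TAG_CLOSE, open_idx])  # close bracket points back
            cells[open_idx][1] = close_idx       # open bracket points forward
        else:
            cells.append([TAG_SYMBOL, term])
    return cells

def to_linear_memory(cells):
    """Pack cells as little-endian pairs of unsigned 32-bit integers."""
    return b"".join(struct.pack("<II", tag, payload) for tag, payload in cells)

# The expression  1 (2 (3)) 4  with integer symbols.
cells = flatten([1, [2, [3]], 4])
memory = to_linear_memory(cells)
```

Storing the partner index in each bracket cell trades a little space for constant-time skipping of a parenthesized term, which is the operation pattern matching performs most often on structured data.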

Runtime Support

The WebAssembly backend includes a small runtime library that provides:

  • Memory management for Refal expressions
  • Pattern matching operations
  • Implementation of built-in functions

Usage

To compile a Refal program to WebAssembly:

refal -i input.ref -t wasm -o output.wat

The resulting WAT file can be converted to binary WASM using tools like wat2wasm:

wat2wasm output.wat -o output.wasm

Execution

WebAssembly modules can be executed in various environments:

  • Web browsers using the WebAssembly JavaScript API
  • Node.js using WebAssembly modules
  • Standalone WebAssembly runtimes like Wasmtime or Wasmer

Limitations

  • Some Refal features may have performance implications in WebAssembly
  • The current implementation does not optimize for WebAssembly-specific features
  • Browser security restrictions may limit certain operations

eBPF (extended Berkeley Packet Filter)

Overview

eBPF is a technology that allows running sandboxed programs in the Linux kernel. It was originally designed for packet filtering but has evolved into a general-purpose execution environment for various use cases, including networking, security, and observability.

Implementation Details

Our Refal to eBPF compiler generates eBPF bytecode that can be loaded into the Linux kernel or used with eBPF virtual machines. The implementation addresses several challenges:

  1. Limited Instruction Set: eBPF has a restricted instruction set, requiring creative implementation of Refal's pattern matching.
  2. Memory Constraints: eBPF programs have limited memory access, requiring careful management of Refal's data structures.
  3. Verification Requirements: eBPF programs must pass a verifier, which imposes additional constraints on the generated code.
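One consequence of the verifier requirement is that every loop needs a bound the verifier can establish. The sketch below illustrates that shape of code in Python rather than in eBPF itself. `MAX_TERMS` and `match_prefix` are hypothetical names introduced for illustration, and the bound of 64 is arbitrary; this is not the backend's actual matching routine.

```python
MAX_TERMS = 64  # hypothetical compile-time bound, chosen for illustration

def match_prefix(pattern, subject):
    """Return True if subject starts with pattern, using a fixed trip count."""
    if len(pattern) > MAX_TERMS or len(subject) > MAX_TERMS:
        return False  # reject inputs the bounded loop cannot cover
    matched = True
    for i in range(MAX_TERMS):  # the trip count is a constant
        if i >= len(pattern):
            break               # whole pattern consumed: success
        if i >= len(subject) or subject[i] != pattern[i]:
            matched = False     # subject too short or terms differ
            break
    return matched
```

Inputs longer than the bound are rejected up front instead of being matched partially, which keeps the result well defined at the cost of refusing large expressions.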

Runtime Support

The eBPF backend includes a minimal runtime that provides:

  • Memory management within eBPF constraints
  • Pattern matching operations adapted to eBPF's instruction set
  • Implementation of a subset of Refal's built-in functions

Usage

To compile a Refal program to eBPF:

refal -i input.ref -t ebpf -o output.ebpf

The resulting eBPF bytecode can be loaded into the kernel using tools like bpftool:

bpftool prog load output.ebpf /sys/fs/bpf/refal_program

Execution

eBPF programs can be executed in various contexts:

  • Attached to kernel hooks (e.g., network interfaces, syscalls)
  • Loaded into user-space eBPF virtual machines
  • Used with frameworks like BCC or libbpf

Limitations

  • eBPF has strict limitations on program size and complexity
  • Not all Refal features may be efficiently implementable in eBPF
  • The verifier may reject some valid programs due to its conservative analysis

LLVM

Overview

LLVM is a collection of modular and reusable compiler and toolchain technologies. The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs.

Implementation Details

Our Refal to LLVM compiler generates LLVM Intermediate Representation (IR), which can be compiled to native code for various architectures. The implementation leverages LLVM's capabilities:

  1. Pattern Matching: We implement Refal's pattern matching using LLVM's control flow constructs and optimization passes.
  2. Memory Management: We use LLVM's memory operations for efficient representation of Refal's data structures.
  3. Optimization: We leverage LLVM's optimization passes to improve the performance of the generated code.
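As an illustration of mapping pattern matching onto LLVM control flow, the sketch below generates LLVM IR text that dispatches on a symbol code with a `switch` instruction, one case per sentence. The function name `emit_symbol_dispatch`, the label names, and the use of `i32` symbol codes are assumptions made for this example; the IR the Refal compiler actually emits may look quite different.

```python
def emit_symbol_dispatch(name, symbols):
    """Return LLVM IR text mapping a symbol code to a sentence index.

    `symbols` must not contain duplicates, because a switch instruction
    may not list the same case value twice.
    """
    lines = [
        f"define i32 @{name}(i32 %sym) {{",
        "entry:",
        "  switch i32 %sym, label %fail [",
    ]
    for index, code in enumerate(symbols):
        lines.append(f"    i32 {code}, label %sentence{index}")
    lines.append("  ]")
    for index in range(len(symbols)):
        lines.append(f"sentence{index}:")
        lines.append(f"  ret i32 {index}")   # index of the matching sentence
    lines += ["fail:", "  ret i32 -1", "}"]  # -1 signals that nothing matched
    return "\n".join(lines) + "\n"

# Dispatch on the character codes of 'A' and 'B'.
ir = emit_symbol_dispatch("match_symbol", [ord("A"), ord("B")])
print(ir)
```

Expressing the dispatch as a single `switch` leaves LLVM free to lower it to a jump table or a comparison chain, depending on the target and optimization level.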

Runtime Support

The LLVM backend includes a comprehensive runtime library that provides:

  • Memory management with garbage collection
  • Efficient pattern matching operations
  • Implementation of all Refal built-in functions

Usage

To compile a Refal program to LLVM IR:

refal -i input.ref -t llvm -o output.ll

The resulting LLVM IR can be compiled to native code using llc and a C compiler:

llc output.ll -o output.s
gcc output.s -o output

Execution

LLVM-generated code can be executed as native binaries on the target platform.

Optimization Levels

The LLVM backend supports multiple optimization levels:

  • None: No optimizations, useful for debugging
  • Default: Standard optimizations for good performance
  • Aggressive: Maximum optimizations for best performance

To specify an optimization level:

refal -i input.ref -t llvm -l aggressive -o output.ll
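The three-step native build (the compiler, then `llc`, then `gcc`) can be scripted. The helper below is a minimal sketch that only assembles the command lines shown in this guide. The function names are hypothetical, output file names are derived from the source name as a convenience, and `aggressive` is the only level spelling confirmed by the example above.

```python
import subprocess
from pathlib import Path

def llvm_build_commands(source, level=None):
    """Return the argv lists for compiling a Refal source to a native binary."""
    src = Path(source)
    ir = src.with_suffix(".ll")    # LLVM IR produced by the Refal compiler
    asm = src.with_suffix(".s")    # assembly produced by llc
    exe = src.with_suffix("")      # final executable, named after the source
    compile_cmd = ["refal", "-i", str(src), "-t", "llvm"]
    if level is not None:
        compile_cmd += ["-l", level]
    compile_cmd += ["-o", str(ir)]
    return [
        compile_cmd,
        ["llc", str(ir), "-o", str(asm)],
        ["gcc", str(asm), "-o", str(exe)],
    ]

def build(source, level=None):
    """Run each step in order, stopping at the first failure."""
    for cmd in llvm_build_commands(source, level):
        subprocess.run(cmd, check=True)

commands = llvm_build_commands("input.ref", "aggressive")
```

Calling `build("input.ref", "aggressive")` would execute the steps. It requires `refal`, `llc`, and `gcc` on the `PATH`.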

Limitations

  • Some Refal features may not map efficiently to certain target architectures
  • Debugging information may be limited compared to native Refal implementations

Comparison of Targets

| Feature | WebAssembly | eBPF | LLVM |
|---|---|---|---|
| Performance | Good | Limited | Excellent |
| Portability | Excellent | Linux-only | Good |
| Sandboxing | Yes | Yes | No |
| Memory Management | Manual | Restricted | Flexible |
| Optimization | Limited | Minimal | Extensive |
| Debugging | Limited | Difficult | Good |
| Use Cases | Web, edge computing | Kernel, networking | General-purpose |

Advanced Usage

Cross-Compilation

For LLVM targets, you can cross-compile to different architectures:

refal -i input.ref -t llvm -o output.ll --target=aarch64-linux-gnu

Runtime Configuration

You can configure runtime behavior for specific targets:

refal -i input.ref -t wasm -o output.wat --wasm-memory-pages=10
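WebAssembly linear memory is sized in pages of 64 KiB, so a page count translates directly into bytes. The helpers below sketch that arithmetic; the function names are illustrative, and whether `--wasm-memory-pages` sets the initial or the maximum memory size is not stated in this guide.

```python
WASM_PAGE_SIZE = 64 * 1024  # bytes per page, fixed by the WebAssembly specification

def pages_to_bytes(pages):
    """Size in bytes of a linear memory with the given number of pages."""
    return pages * WASM_PAGE_SIZE

def pages_needed(num_bytes):
    """Smallest page count that holds num_bytes (ceiling division)."""
    return -(-num_bytes // WASM_PAGE_SIZE)

ten_pages = pages_to_bytes(10)  # 655360 bytes, i.e. 640 KiB
```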

Interoperability

Generated code can interoperate with other languages:

  • WebAssembly can be called from JavaScript
  • LLVM-generated code can be linked with C/C++ libraries
  • eBPF programs can interact with kernel data structures

Troubleshooting

Common Issues

  • Compilation Errors: Check your Refal syntax and ensure all functions are defined
  • Runtime Errors: Ensure pattern matching is exhaustive, since a call whose argument matches none of the function's sentences fails at run time
  • Performance Issues: Try different optimization levels or compilation targets
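The advice on exhaustive pattern matching can be illustrated with a small model of how a Refal function tries its sentences in order. This is a sketch in Python, not the runtime's actual code. `apply_sentences` and `RecognitionImpossible` are hypothetical names, the latter borrowed from the "recognition impossible" error that Refal implementations commonly report.

```python
class RecognitionImpossible(Exception):
    """Raised when no sentence of a function matches its argument."""

def apply_sentences(sentences, arg):
    """Try each (matches, rewrite) sentence in order; the first match wins."""
    for matches, rewrite in sentences:
        if matches(arg):
            return rewrite(arg)
    raise RecognitionImpossible(repr(arg))

# A function with two sentences that covers only the symbols A and B.
partial = [
    (lambda a: a == ["A"], lambda a: ["B"]),
    (lambda a: a == ["B"], lambda a: ["A"]),
]

# Adding a catch-all last sentence (like a final `e.X` pattern) makes it exhaustive.
total = partial + [(lambda a: True, lambda a: a)]
```

Calling `apply_sentences(partial, ["C"])` raises the exception, while `apply_sentences(total, ["C"])` returns the argument unchanged.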

Target-Specific Issues

  • WebAssembly: Check browser console for errors
  • eBPF: Use bpftool prog tracelog, which streams the kernel trace pipe (including bpf_trace_printk output), for debugging
  • LLVM: Examine generated IR for unexpected constructs

References