Haven is designed from the outset for interoperability with C and other low-level languages.
In Haven, mutability is opt-in, not opt-out. Variables that you expect to modify must be annotated as mutable.
All functions are assumed pure unless explicitly annotated as impure. In practice, purity tracks observable side
effects such as impure calls and mutating or loading through references, cells, or boxes. Pure functions may still
perform ordinary local computation and local reassignment.
In Haven, identifiers:
- Must start with either an _ or a letter
- Must end with a digit, letter, or _
- Must only contain digits, letters, _, or -
Hyphens (-) may be used only within an identifier:
-istrue // invalid, cannot start with hyphen
istrue- // invalid, cannot end with hyphen
is-true // valid
Note that the - operator for arithmetic requires spaces around it when used with two identifiers:
abc-def // identifier abc-def
abc - def // subtract the value of def from abc
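The identifier rules above can be condensed into one regular expression. The following is a minimal sketch in Python, not Haven's actual lexer. The name is_valid_identifier is hypothetical, and the sketch assumes that "letter" means an ASCII letter, which the rules above do not specify.

```python
import re

# Assumption: "letter" means ASCII A-Z / a-z; the reference does not say
# whether non-ASCII letters are accepted.
# First character: letter or underscore.
# Optional tail: any run of letters, digits, "_" or "-", ending in a
# character that is not a hyphen.
IDENTIFIER = re.compile(r"[A-Za-z_](?:[A-Za-z0-9_-]*[A-Za-z0-9_])?")


def is_valid_identifier(name):
    """Check a candidate name against the three identifier rules."""
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ("is-true", "-istrue", "istrue-", "abc-def", "_tmp1"):
    print(candidate, is_valid_identifier(candidate))
```

Because a hyphen is a legal identifier character, a lexer built this way reads abc-def as one identifier, which is why the subtraction form needs surrounding spaces.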
To type a variable as a signed integer N bits wide, use iN:
- i32 defines a 32-bit signed integer
- i8 defines an 8-bit signed integer
For an unsigned integer, use a u prefix instead of i.
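To make the effect of the bit width concrete, the sketch below computes the value range for a given type name. It is written in Python for illustration only. The helper name integer_range is hypothetical, and the signed range assumes a two's-complement representation, which this reference does not state explicitly.

```python
def integer_range(type_name):
    """Return (minimum, maximum) for a type name such as "i32" or "u8"."""
    kind, bits = type_name[0], int(type_name[1:])
    if kind == "i":
        # Assumption: signed integers use two's complement.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if kind == "u":
        return 0, 2 ** bits - 1
    raise ValueError("not an integer type: " + type_name)


print(integer_range("i8"))
print(integer_range("u8"))
print(integer_range("i32"))
```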
Use the type float for floating-point numbers.
Haven offers an fvecN type defining a vector of N floating-point numbers.
Vectors can be used with binary expressions and optimize to parallel arithmetic where available on the target machine.
For example, the following function returns a new vector with the result of element-wise addition of the two input vectors.
pub fn vector_add(fvec3 a, fvec3 b) -> fvec3 {
a + b
}
Tip
When integrating Haven with C, fvecN is the equivalent of the (non-standard) typedef float floatN __attribute__((vector_size(sizeof(float) * N))).
Functions may accept vectors of any concrete dimension using fvec?.
These are not runtime-sized vectors. Instead, the compiler specializes the function at each call site using the concrete argument types from that call.
fn vadd(fvec? a, fvec? b) {
@assert a.dim == b.dim, "vector dimensions must match";
a + b
}
Inside such a function:
- a.dim is a compile-time property of the specialized vector type
- omitted return types are inferred after specialization
- @assert conditions must reduce to compile-time constants after specialization
If a compile-time assertion fails, the compiler reports both the original condition and the specialized one:
semantic: error: vector dimensions must match
compile-time assertion failed: a.dim == b.dim
specialized as: 3 == 2
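The specialization step can be pictured with a toy model. The Python sketch below is not how the compiler is implemented. It only mimics the observable behavior described above for vadd: each call site supplies concrete dimensions, the assertion is evaluated with those dimensions substituted, and the return type is inferred afterwards. The names specialize_vadd and CompileTimeAssertionError are hypothetical, and the inferred fvecN return type is an assumption based on a + b preserving the operand type.

```python
class CompileTimeAssertionError(Exception):
    """Stands in for a compiler diagnostic; not a real Haven construct."""


def specialize_vadd(a_dim, b_dim):
    """Model specializing vadd(fvec? a, fvec? b) for one call site.

    Returns the name of the inferred return type, or raises with a message
    shaped like the diagnostic shown above.
    """
    if a_dim != b_dim:
        raise CompileTimeAssertionError(
            "vector dimensions must match\n"
            "compile-time assertion failed: a.dim == b.dim\n"
            "specialized as: {} == {}".format(a_dim, b_dim)
        )
    # Assumption: a + b on two fvecN values yields an fvecN.
    return "fvec{}".format(a_dim)


print(specialize_vadd(3, 3))
```

Calling specialize_vadd(3, 2) raises instead of returning, mirroring the fact that the mismatch is caught at compile time rather than at runtime.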
Haven offers matMxN matrix types of floating point numbers.
Like vectors, matrices support specialization holes in function signatures through mat?.
fn mat-width(mat? m) {
m.cols
}
fn get-mat-row(mat? m, u32 row) {
m[row]
}
Inside specialized matrix functions:
- m.rows and m.cols are compile-time properties
- indexing a concrete matMxN yields an fvecN
- specialized functions are cloned before LLVM lowering, so hole types do not reach IR
The str type carries string data. It is essentially a const char * under the hood.
You may define your own aliases for types:
type int = i32;
These aliases are fully erased during compilation and are not accessible at runtime.
Defining a structured type looks similar to defining a type alias:
type Point = struct {
i32 x;
i32 y;
i32 z;
};
Structures may contain pointers to their own type:
type Node = struct {
i32 value;
Node *next;
};
Initializing a structure in a variable declaration requires an explicit type annotation:
let Node node = { 1234, nil };
In contexts where the type is known (e.g. a function return), the type will be inferred automatically.
Single-element structs do not require a trailing comma:
let Thing thing = { 1234 };
You may define an enum type using two forms.
The first form simply defines a set of names:
type Number = enum {
One,
Two
};
The second form allows for creating union types with bindings:
type Numeric = enum {
Int(i32),
Float(float)
};
Use of enums in expressions requires both the enum name and the field name to be provided:
match x {
Numeric::Int(_) => 0,
Numeric::Float(_) => 1
}
There is limited support for templating enum types in Haven:
type Result = enum <T> {
Ok(T),
Error
};
fn thing() -> Result::<i32> {
Result::<i32>::Ok(5)
}
Arrays can be defined by adding a dimension to a type.
i32[2] arr = {
0,
1
};
A single-element array requires a trailing ,:
i32[1] arr = {
0,
};
Caution
Boxed types are very much under construction. Their definition may yet change, and they tend to have rough edges that lead to bugs at runtime in their current form.
Boxing wraps a value in a heap-allocated structure. The underlying value
can be retrieved with the unbox keyword.
fn example() -> i32 {
let mut val = box 5; // i4^
let result = unbox val; // i4
val = nil; // box is freed
result
}
box also supports type-directed construction:
type Buffer = struct {
i32 len;
i32 cap;
};
extend Buffer with {
construct(i32 cap) {
self->len = 0;
self->cap = cap;
}
}
fn example() -> i32 {
let mut boxed = box Buffer(64);
let value = unbox boxed;
boxed = nil;
value.cap
}
box T allocates storage for T, recursively default-initializes its members, and then runs a zero-argument
constructor if T defines one. box T(args...) performs the same recursive default initialization, then calls
construct with the supplied arguments. This means member pointers are nil and inline subobjects are already in a
known state before construct runs.
Box types are written much like pointers, but using a caret (^) instead
of an asterisk (*):
fn example(i32^ boxed) -> i32;
To directly mutate the value of a box, use the := mutation
operator:
let val = box 5;
val := 6;
let result = unbox val; // 6
Note that val does not need to be mutable in this case. let mut permits reassignment of val but does not
control the mutability of the stored value. In the example above, := would be very similar to *val = 6 in C.
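The difference between rebinding a name and mutating the boxed value can be modelled outside Haven. The Python sketch below is an illustration only: the Box class and the mutate and unbox helpers are hypothetical stand-ins for box, := and unbox, and same_box exists purely as bookkeeping to show that the binding never changes.

```python
class Box:
    """Stand-in for a Haven box: a heap cell holding one value."""

    def __init__(self, value):
        self.value = value


def mutate(box, value):
    """Model of `box := value`: overwrite the stored value in place."""
    box.value = value


def unbox(box):
    """Model of `unbox box`: read the stored value."""
    return box.value


val = Box(5)         # let val = box 5;
same_box = val       # bookkeeping only: remember which box val names
mutate(val, 6)       # val := 6;
result = unbox(val)  # let result = unbox val;

print(result)
print(val is same_box)
```

The name val still refers to the same box after the mutation, which is why let without mut is sufficient in the Haven example.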
Import declarations may only appear at the file scope. An import loads the contents of the imported file, allowing definitions from that file to be used locally.
import "vec.hv";
A C import declaration parses a C header file and retains its declarations for the purpose of C interoperability.
C imports currently require passing --bootstrap to the compiler, because they are implemented primarily for the
compiler bootstrap phases. They may become more readily available once a few ergonomics issues are worked out.
cimport "stdio.h";
When introducing dependencies on external libraries, you may opt to use the --Xl -lm style of command line
flag to pass the correct libraries to the linker.
Alternatively, Haven offers the foreign declaration to simplify this end-to-end:
foreign "m" {
fn fsqrtf(float x) -> float;
}
foreign "c" {
fn printf(str format, *) -> i32;
}
A module with these foreign declarations will automatically add -lm -lc to the command line. The function
declarations will also be automatically marked pub and impure, simplifying the declarations for import.
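The effect of a foreign block can be summarized as a small transformation. The Python sketch below is a model of the behavior described above, not compiler code. The function name lower_foreign_block and the dictionary shape used for declarations are hypothetical.

```python
def lower_foreign_block(library, function_names):
    """Model what a `foreign "<library>" { ... }` block contributes.

    Returns the linker flag for the library and the list of declarations,
    each implicitly marked pub and impure.
    """
    flag = "-l" + library
    declarations = [
        {"name": name, "pub": True, "impure": True}
        for name in function_names
    ]
    return flag, declarations


flags = []
for library, names in (("m", ["fsqrtf"]), ("c", ["printf"])):
    flag, declarations = lower_foreign_block(library, names)
    flags.append(flag)

print(" ".join(flags))
```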
Type declarations (type X = ...) may only appear at the file scope.
Type extensions attach behavior to an existing type without changing its layout.
extend Buffer with {
construct(i32 cap) {
self->len = 0;
self->cap = cap;
}
destruct {
self->len = 0;
}
}
In the current model, extend is behavioral only:
- construct(...) { ... } defines an optional constructor hook.
- destruct { ... } defines an optional destructor hook.
- self is provided implicitly inside both hooks and is pointer-like, so fields are accessed with self->field.
Constructors run automatically after recursive default-initialization. Destructors run automatically when the final
boxed reference is released. extend does not add fields or change ABI layout.
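The lifecycle described above (default initialization, then construct, then destruct once the final reference is released) can be traced with a toy model. The Python sketch below is an illustration under stated assumptions: it models ownership with an explicit reference count, uses 0 as the default value for integer fields, and the retain and release methods are hypothetical names rather than anything Haven exposes.

```python
class BoxedBuffer:
    """Toy model of `box Buffer(cap)` that logs each lifecycle step."""

    def __init__(self, cap, log):
        self.log = log
        self.refs = 1
        # Step 1: default initialization happens before any hook runs.
        # Assumption: integer fields default to 0.
        self.len = 0
        self.cap = 0
        log.append("default-init")
        # Step 2: the construct hook runs with the supplied arguments.
        self.construct(cap)

    def construct(self, cap):
        self.len = 0
        self.cap = cap
        self.log.append("construct")

    def destruct(self):
        self.len = 0
        self.log.append("destruct")

    def retain(self):
        """Hypothetical: take another reference to the same box."""
        self.refs += 1
        return self

    def release(self):
        """Hypothetical: drop one reference; destruct on the last one."""
        self.refs -= 1
        if self.refs == 0:
            self.destruct()


log = []
boxed = BoxedBuffer(64, log)
other = boxed.retain()
boxed.release()   # one reference remains, so no destructor yet
other.release()   # final reference released, destructor runs
print(log)
```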
File scope variables are split into two categories:
- data, for constant, immutable data used by the program without modification, and
- state, for mutable program state that may be initialized either from a constant expression or from startup code
data i32 x = 1234; // constant, local
pub data i32 y = 5678; // constant, with global linkage (visible outside the translation unit)
state i32 x = 1234; // mutable, local
pub state i32 y = 5678; // mutable, global linkage
For pub data and state, an initializer may be omitted to create a reference to be resolved by the linker.
Non-constant global initializers are lowered to program startup initialization before user code runs.
Inside function definitions, variable declarations take a different form:
let [mut] [<ty>] <ident> = <init-expr>;
A type need not be specified. If unspecified, the type of the variable will be inferred from the initialization expression. Specifying mut will allow reassignment of the variable.
Variables at function scope must be initialized.
Functions can be forward-declared without a body.
[pub] [impure] fn <ident>(<arg-list>) -> <ret-ty>;
[pub] [impure] fn <ident>(<arg-list>) -> <ret-ty> { <body> }
Specifying pub on declarations that have no definitions will create an external reference to the function.
Specifying impure on declarations will mark the function as impure, which means it is allowed to read and write memory.
An argument list can be ended with * to indicate that the function accepts a variable number of arguments:
pub fn printf(str format, *) -> i32;
Warning
Pure functions cannot call impure functions.
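The rule can be expressed as a simple check over a call graph. The Python sketch below is a model, not the compiler's implementation. The function purity_violations and the example function names (log_line, report, add) are hypothetical.

```python
def purity_violations(impure, calls):
    """Find calls that break the purity rule.

    impure: set of function names annotated as impure.
    calls:  mapping from each function name to the functions it calls.
    Returns sorted (caller, callee) pairs where a pure caller invokes an
    impure callee.
    """
    violations = []
    for caller, callees in calls.items():
        if caller in impure:
            continue  # impure functions may call anything
        for callee in callees:
            if callee in impure:
                violations.append((caller, callee))
    return sorted(violations)


impure_functions = {"log_line"}
call_graph = {
    "log_line": [],
    "add": [],
    "report": ["add", "log_line"],  # pure by default, calls an impure function
}
print(purity_violations(impure_functions, call_graph))
```

In this model, report would need the impure annotation for the program to be accepted. A direct check is enough because every pure function along a call chain is checked individually.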
The following example shows usage of both a declaration and a defined function:
pub fn printf(str fmt, *) -> i32;
pub fn main() -> i32 {
printf("Hello, world!\n");
0
}
Blocks contain statements and expressions. Every function definition has at least one block. Defining a block creates a new scope: variables defined before a block begins are visible, but variables defined inside the block are not visible outside the block.
If the final statement in a block is an expression, the result of that expression is used as the result value of the block. In functions, this result value becomes the return value of the function.
Blocks are themselves expressions, and can appear anywhere that an expression is expected:
let x = {
5 + 5
};
Note that the addition in this example is not terminated with a semicolon. Terminating with a semicolon would convert the block's result to be void, thereby making it an invalid initializer.
Any expression is also a valid statement. The last expression in a block must not be terminated with a semicolon.
An empty statement is also called a "void" statement. It has no effect and is omitted in code generation.
The let statement defines new variables in the current scope:
let test = 5;
let mut mutable = 6;
let i32 typed = 7;
The iter statement iterates over a range.
iter 0:10 i {
printf("%d\n", i);
};
Ranges are inclusive: the range above visits every value from 0 through 10, including both endpoints.
A constant step can be provided:
iter 10:0:-1 i {};
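The sequence of values an iter loop visits can be modelled as a generator. The Python sketch below is an illustration under assumptions this reference does not spell out: the step defaults to 1 when omitted, iteration stops once the next value would pass the end bound, and a zero step is rejected. The name iter_range is hypothetical.

```python
def iter_range(start, end, step=1):
    """Yield the values an `iter start:end:step` loop would visit.

    Both endpoints are included, matching the inclusive ranges above.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    value = start
    while (step > 0 and value <= end) or (step < 0 and value >= end):
        yield value
        value += step


print(list(iter_range(0, 10)))
print(list(iter_range(10, 0, -1)))
```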
The while statement loops as long as a condition is true-ish:
while 1 {
// ...
};
The until statement is sugar for while !cond:
until done {
// ...
};
The := operator mutates the value referenced by a pointer, cell, or box:
ptr := 5;
The equivalent syntax in C would be *ptr = 5.
The ret statement sets the return value for the function and immediately returns to its caller.
ret <value>;
The defer statement defers the execution of an expression to run right before the current function returns.
On function exit, deferred expressions run before the compiler's ownership cleanup for that scope.
In this example, the string "hello from defer" is printed after the string "Hello, world!". defer can be used anywhere within a function and can be very useful for memory and error management.
pub fn printf(str fmt, *) -> i32;
pub fn main() -> i32 {
defer printf("hello from defer\n");
printf("Hello, world!\n");
as i32 0
}
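The ordering in the defer example can be reproduced in other languages. Python has no defer statement, so the sketch below uses contextlib.ExitStack from the standard library as a stand-in: a callback registered on the stack runs when the enclosing block exits, including on return. It models a single deferred call only, since the relative order of multiple deferred expressions is not described here.

```python
from contextlib import ExitStack


def main(output):
    """Model of the Haven main above; output collects printed lines."""
    with ExitStack() as deferred:
        # defer printf("hello from defer\n");
        deferred.callback(output.append, "hello from defer")
        # printf("Hello, world!\n");
        output.append("Hello, world!")
        return 0


lines = []
status = main(lines)
print(lines)
```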
A constant value can be used anywhere that an expression is expected:
let integer = 5;
let number = 5.0;
let text = "hello";
let vec = Vec<1.0, 2.0, 3.0>;
let s obj = { 1, 2, 3 };
let foo = Numbers::One;
See Blocks for more about blocks.
<expr> <op> <expr>
| Operator | Purpose | Precedence |
|---|---|---|
| \|\| | Logical OR | 5 |
| && | Logical AND | 10 |
| \| | Bitwise OR | 15 |
| ^ | Bitwise XOR | 20 |
| & | Bitwise AND | 25 |
| == != | Boolean Equal, Boolean Not Equal | 30 |
| < <= > >= | Boolean Inequalities | 35 |
| << >> | Bitwise Shifts | 40 |
| + - | Addition, Subtraction | 45 |
| * / % | Multiplication, Division, Modulo | 50 |
Note that parentheses ( ) may be used to control order of operations.
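To show how the precedence numbers drive parsing, here is a small precedence-climbing evaluator for integer expressions. It is a Python sketch, not Haven's parser. It assumes that a higher number binds more tightly, that all binary operators are left-associative, that comparisons and logical operators produce 0 or 1, and that / is integer division; none of the last three points are stated in the table. It evaluates both operands eagerly, so it does not model short-circuiting.

```python
import re

PRECEDENCE = {
    "||": 5, "&&": 10, "|": 15, "^": 20, "&": 25,
    "==": 30, "!=": 30, "<": 35, "<=": 35, ">": 35, ">=": 35,
    "<<": 40, ">>": 40, "+": 45, "-": 45, "*": 50, "/": 50, "%": 50,
}

APPLY = {
    "||": lambda a, b: int(bool(a) or bool(b)),
    "&&": lambda a, b: int(bool(a) and bool(b)),
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
    "&": lambda a, b: a & b,
    "==": lambda a, b: int(a == b),
    "!=": lambda a, b: int(a != b),
    "<": lambda a, b: int(a < b),
    "<=": lambda a, b: int(a <= b),
    ">": lambda a, b: int(a > b),
    ">=": lambda a, b: int(a >= b),
    "<<": lambda a, b: a << b,
    ">>": lambda a, b: a >> b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,  # assumption: integer division
    "%": lambda a, b: a % b,
}

# Multi-character operators are listed before single characters so that
# "<<" is not read as two "<" tokens.
TOKEN = re.compile(r"\s*(\d+|\|\||&&|==|!=|<=|>=|<<|>>|[-+*/%&|^<>()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected input at offset {}".format(pos))
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an integer expression using the precedence table."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = binary(0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("missing closing parenthesis")
            pos += 1
            return value
        return int(token)

    def binary(min_precedence):
        nonlocal pos
        left = primary()
        while (pos < len(tokens)
               and PRECEDENCE.get(tokens[pos], -1) >= min_precedence):
            operator = tokens[pos]
            pos += 1
            # Parsing the right side one level tighter gives left-associativity.
            right = binary(PRECEDENCE[operator] + 1)
            left = APPLY[operator](left, right)
        return left

    return binary(0)


print(evaluate("1 + 2 * 3"))
print(evaluate("(1 + 2) * 3"))
```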
Logical operators (|| and &&) short-circuit their operation:
- if the left side is false-ish, and the operator is &&, the right side will not be evaluated
- if the left side is true-ish, and the operator is ||, the right side will not be evaluated
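Short-circuiting is easiest to observe when the operands have visible effects. The Python sketch below assumes the conventional rule: && skips its right side when the left side is false-ish, and || skips its right side when the left side is true-ish. Operands are passed as zero-argument functions so that evaluation can be traced. The names logical_and, logical_or, and traced are hypothetical, and returning 0 or 1 is an assumption about the result type.

```python
def logical_and(left, right):
    """Model of `left && right`; operands are zero-argument callables."""
    if not left():
        return 0  # right() is never called
    return int(bool(right()))


def logical_or(left, right):
    """Model of `left || right`; operands are zero-argument callables."""
    if left():
        return 1  # right() is never called
    return int(bool(right()))


evaluated = []


def traced(name, value):
    """Build an operand that records when it is evaluated."""
    def operand():
        evaluated.append(name)
        return value
    return operand


print(logical_and(traced("left", 0), traced("right", 1)), evaluated)
```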
Any variable in scope may be used in an expression. Its value at the time of expression evaluation will be used.
let x = struct_var.x;
let x = array_var[5];
Vector elements can be accessed using the letters xyzw or rgba, or a digit.
let x = vec.x; // 1st element
let a = vec.a; // 4th element
let v = vec.5; // 5th element
Functions may be called using parentheses:
let result = my_function(1, 2, 3);
Use the as syntax to cast between types:
let x = as<i32>(5);
let x = !0; // 1
let y = 3 ^ 1; // 2
let z = ~0; // (all bits set to one)
if may be used as an expression to select between two values. Both the then and else expressions must resolve to the same type. An else is required in this context.
let sign = if x >= 0 { 0 } else { 1 };
Note
The braces are not required for an if expression. It is legal to use any expression, including those not wrapped in a block, in the then or else blocks of an if expression.
When used as a statement, if does not require its blocks to have identical types, and an else is not required.
if x >= 0 {
// do things
} else {
// do other things
};
To create a pointer to an existing variable or object, use the ref keyword:
let node tail = { 1, nil };
let node head = { 0, ref tail };
nil may be used in lieu of a reference to indicate NULL.
To read the contents of a pointer created using ref, use load:
{
let node = load head.next;
node.value
}
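The relationship between ref, nil and load can be mirrored with ordinary object references. The Python sketch below is an analogy only: the Node dataclass stands in for a Haven struct with a value and a next pointer, None plays the role of nil, and following node.next plays the role of load. The helper list_values is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None  # None plays the role of nil


def list_values(head):
    """Walk the chain from head, collecting each value."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next  # analogous to `load node.next`
    return values


tail = Node(1, None)  # let node tail = { 1, nil };
head = Node(0, tail)  # let node head = { 0, ref tail };

print(head.next.value)
print(list_values(head))
```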
match provides the main pattern matching syntax for Haven.
This variant simply evaluates comparisons between the condition and the arms of the match, returning the expression that matches.
let v = match 5 {
5 => 0,
4 => { 2 + 2 }, // any expression is valid
_ => 1
};
let v = match number(2) {
Numbers::Two => 0,
_ => 1
};
It is an error to not provide a binding if the enum value includes a binding. The _ binding value allows you to explicitly opt-out of binding.
let v = match numeric(0) {
Numeric::Int(x) => x, // x is defined for the duration of the expression
Numeric::Float(_) => 0, // you may opt out of binding
_ => 10
};
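For readers more familiar with class-based languages, a match with bindings behaves like a type test followed by field extraction. The Python sketch below models the Numeric example with two small classes. The class names Int and Float and the function match_numeric are stand-ins chosen for this illustration, not generated by Haven.

```python
from dataclasses import dataclass


@dataclass
class Int:
    value: int


@dataclass
class Float:
    value: float


def match_numeric(numeric):
    """Model of the match expression over Numeric shown above."""
    if isinstance(numeric, Int):
        x = numeric.value  # Numeric::Int(x) => x
        return x
    if isinstance(numeric, Float):
        return 0           # Numeric::Float(_) => 0, payload ignored
    return 10              # _ => 10


print(match_numeric(Int(7)))
print(match_numeric(Float(2.5)))
```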
The size builtin offers the size of types and expressions as a constant. It always resolves to a compile-time constant, and the compiler emits an error diagnostic if the size cannot be determined.
To use the size of a type as a constant, use the size<T> syntax variant:
let sz = size<i32>; // 4
To use the size of an expression's result as a constant, use the size(...) syntax variant:
let i32 x = 1234;
let sz = size(x);
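When checking sizes against C, the ctypes module in the Python standard library reports what a C compiler on the host would use. The sketch below mirrors size<i32> and a three-field struct shaped like the Point type defined earlier. It assumes Haven lays out such a struct the same way C does, which this reference suggests through its C interoperability goals but does not state outright.

```python
import ctypes


class Point(ctypes.Structure):
    """C-layout stand-in for a struct with three i32 fields."""
    _fields_ = [
        ("x", ctypes.c_int32),
        ("y", ctypes.c_int32),
        ("z", ctypes.c_int32),
    ]


i32_size = ctypes.sizeof(ctypes.c_int32)
point_size = ctypes.sizeof(Point)

print(i32_size)    # 4, matching size<i32>
print(point_size)  # 12 under C layout rules
```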
Haven compiles to LLVM IR, allowing use of a wide range of LLVM intrinsics.
To declare a function that maps to an intrinsic, use the intrinsic keyword in the declaration after the
return type annotation. Following the keyword, add the name of the intrinsic as a string. A comma-separated
list of parameter types follows to help LLVM identify the correct intrinsic variant to use.
For example, the following declarations bind the LLVM powi and sqrt intrinsics for the program:
pub fn __builtin_ipow(float x, i32 power) -> float intrinsic "llvm.powi" float, i32;
pub fn __builtin_sqrtf(float x) -> float intrinsic "llvm.sqrt" float;