206 changes: 206 additions & 0 deletions languages/tolk/features/asm-functions.mdx
@@ -0,0 +1,206 @@
---
title: "Assembler functions"
---

Functions in Tolk may be defined using assembler code. It's a low-level feature that requires understanding of stack layout, [Fift](/languages/fift/overview), and [TVM](/tvm/overview).

## Standard functions are `asm` wrappers

Many functions from the [standard library](/languages/tolk/features/standard-library) are translated directly to Fift assembler.

For example, TVM has a `HASHCU` instruction: "calculate hash of a cell". It pops a cell from the stack and pushes an integer in the range 0 to 2<sup>256</sup> − 1. Therefore, the method `cell.hash` is defined this way:

```tolk
@pure
fun cell.hash(self): uint256
asm "HASHCU"
```

The type system guarantees that when this method is invoked, a TVM `CELL` will be the topmost element (`self`).

## Custom functions are declared in the same way

```tolk
@pure
fun incThenNegate(v: int): int
asm "INC" "NEGATE"
```

A call to `incThenNegate(10)` is translated into those two instructions.

A good practice is to specify `@pure` if the body does not modify TVM state or throw exceptions.

The return type for `asm` functions is mandatory. For regular functions, it's auto-inferred from `return` statements.
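
As a minimal sketch of how such a function is used, the hypothetical caller `demo` below (not part of the original text) invokes `incThenNegate` and checks the result. The error code `100` is an arbitrary assumption.

```tolk
@pure
fun incThenNegate(v: int): int
    asm "INC" "NEGATE"

// `demo` is a hypothetical caller, for illustration only
fun demo() {
    // stack: 10 -> INC -> 11 -> NEGATE -> -11
    val r = incThenNegate(10);
    assert (r == -11) throw 100; // 100 is an arbitrary error code
}
```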

## Multi-line asm

To embed a multi-line command, use triple quotes:

```tolk
fun hashStateInit(code: cell, data: cell): uint256 asm """
DUP2
HASHCU
...
ONE HASHEXT_SHA256
"""
```

It is treated as a single string and inserted as-is into the Fift output. In particular, it may contain `//` comments, which are valid Fift comments.

## Stack order for multiple slots

When calling a function, arguments are pushed in the declared order. The last parameter becomes the topmost stack element.

If an instruction results in several slots, the resulting type should be a tensor or a struct.

For example, the function `abs2` below calculates `abs()` for two values at once: `abs2(-5, -10)` = `(5, 10)`. The comments show the stack layout after each instruction, with the top of the stack on the right.

```tolk
fun abs2(v1: int, v2: int): (int, int)
asm // v1 v2
"ABS" // v1 v2_abs
"SWAP" // v2_abs v1
"ABS" // v2_abs v1_abs
"SWAP" // v1_abs v2_abs
```
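
A call site might look like the sketch below. The caller `demo` is hypothetical; the two returned slots are destructured in the same order as the declared tensor `(int, int)`.

```tolk
fun abs2(v1: int, v2: int): (int, int)
    asm "ABS" "SWAP" "ABS" "SWAP"

// `demo` is a hypothetical caller, for illustration only
fun demo() {
    // after the call the stack holds: v1_abs v2_abs (v2_abs on top)
    val (a, b) = abs2(-5, -10);
    // a == 5, b == 10
}
```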

## Rearranging arguments on the stack

Sometimes a function accepts parameters in an order different from what a TVM instruction expects. For example, `GETSTORAGEFEE` expects the order "cells bits seconds workchain". For a clearer API, however, the workchain should be passed first. Stack positions can be reordered via the `asm(...)` syntax:

```tolk
fun calculateStorageFee(workchain: int8, seconds: int, bits: int, cells: int): coins
asm(cells bits seconds workchain) "GETSTORAGEFEE"
```
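
The same mechanism can be sketched with a simpler instruction. The function `subtractFrom` below is a hypothetical example, not part of the standard library. `SUB` pops `x y` (with `y` on top) and pushes `x - y`, so the parameters are listed in `asm(...)` in the stack order that `SUB` expects.

```tolk
// hypothetical helper: parameters are declared as (subtrahend, minuend),
// but `SUB` needs the stack to be: minuend subtrahend
@pure
fun subtractFrom(subtrahend: int, minuend: int): int
    asm(minuend subtrahend) "SUB"

// `demo` is a hypothetical caller, for illustration only
fun demo() {
    // stack before SUB: 10 3, so the result is 10 - 3
    val r = subtractFrom(3, 10);
    // r == 7
}
```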

The same applies to return values. If multiple slots are returned and must be reordered to match the declared type, use the `asm(-> ...)` syntax:

```tolk
fun asmLoadCoins(s: slice): (slice, int)
asm(-> 1 0) "LDVARUINT16"
```
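
A usage sketch, assuming a hypothetical caller `demo` that receives a slice starting with a coins value. The destructuring order follows the declared return type `(slice, int)`, not the order in which `LDVARUINT16` pushes its results.

```tolk
fun asmLoadCoins(s: slice): (slice, int)
    asm(-> 1 0) "LDVARUINT16"

// `demo` is a hypothetical caller, for illustration only
fun demo(s: slice) {
    // LDVARUINT16 pushes (int, restOfSlice);
    // `-> 1 0` reorders the result to (restOfSlice, int)
    val (rest, amount) = asmLoadCoins(s);
}
```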

Both the input and output sides may be combined: `asm(... -> ...)`. Reordering is mostly used with `mutate` variables.

## `mutate` and `self` in assembler functions

The `mutate` keyword (see [mutability](/languages/tolk/syntax/mutability)) works by implicitly returning new values via the stack — both for regular and `asm` functions.

For better understanding, let's look at regular functions first. The compiler does all transformations automatically:

```tolk
// transformed to: "returns (int, void)"
fun increment(mutate x: int): void {
    x += 1;
    // a hidden "return x" is inserted
}

fun demo() {
    var x = 5;
    // transformed to: (newX, _) = increment(x); x = newX
    increment(mutate x);
}
```

How to implement `increment()` via asm?

```tolk
fun increment(mutate x: int): void
asm "INC"
```

The function still returns `void`: from the type system's perspective, it does not return a value. However, `INC` leaves a number on the stack, which plays the role of the hidden "return x" from the regular variant.

Similarly, it works for `mutate self`. An `asm` function should place `newSelf` onto the stack before the actual result:

```tolk
// "TPUSH" pops (tuple) and pushes (newTuple);
// so, newSelf = newTuple, and return `void` (syn. "unit")
fun tuple.push<X>(mutate self, value: X): void
asm "TPUSH"

// "LDU" pops (slice) and pushes (int, newSlice);
// with `asm(-> 1 0)`, we make it (newSlice, int);
// so, newSelf = newSlice, and return `int`
fun slice.loadMessageFlags(mutate self): int
asm(-> 1 0) "4 LDU"
```

To return `self` for chaining, just specify a return type:

```tolk
// "STU" pops (int, builder) and pushes (newBuilder);
// with `asm(op self)`, we put the arguments in the correct order;
// so, newSelf = newBuilder, and return `void`;
// but to make it chainable, `self` instead of `void`
fun builder.storeMessageOp(mutate self, op: int): self
asm(op self) "32 STU"
```
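
Because the declared return type is `self`, calls can be chained with other builder methods. The sketch below assumes the standard library functions `beginCell`, `storeUint`, and `endCell`. The function `buildBody` and the opcode value are hypothetical.

```tolk
fun builder.storeMessageOp(mutate self, op: int): self
    asm(op self) "32 STU"

// `buildBody` is a hypothetical helper, for illustration only
fun buildBody(): cell {
    return beginCell()
        .storeMessageOp(0x12345678) // arbitrary opcode, 32 bits
        .storeUint(0, 64)           // value 0 stored in 64 bits
        .endCell();
}
```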

## `asm` is compatible with structures

Methods for structures may also be declared as assembler functions, given knowledge of the stack layout: fields are placed sequentially. For instance, a struct with one field is identical to that field on the stack.

```tolk
struct MyCell {
    private c: cell
}

@pure
fun MyCell.hash(self): uint256
asm "HASHCU"
```

Similarly, a structure may be used instead of a tensor as the return type. This is widely used in `map<K, V>` methods over TVM dictionaries:

```tolk
struct MapLookupResult<TValue> {
    private readonly rawSlice: slice?
    isFound: bool
}

@pure
fun map<K, V>.get(self, key: K): MapLookupResult<V>
builtin
// it produces `DICTGET` and similar, which push
// (slice -1) or (null 0) — the shape of MapLookupResult
```

## Generics in `asm` should be single-slot

Take `tuple.push` as an example. The `TPUSH` instruction pops `(tuple, someVal)` and pushes `(newTuple)`. It should work with any `T`: int, int8, slice, etc.

```tolk
fun tuple.push<T>(mutate self, value: T): void
asm "TPUSH"
```

A reasonable question: how should `t.push(somePoint)` work? The stack would be misaligned, because `Point { x, y }` is not a single slot. The answer: this would not compile.

```ansi
dev.tolk:6:5: error: can not call `tuple.push<T>` with T=Point, because it occupies 2 stack slots in TVM, not 1

// in function `main`
6 | t.push(somePoint);
| ^^^^^^
```

Only regular and built-in generics may be instantiated with variadic (multi-slot) type arguments; `asm` generics cannot.
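
A sketch of what does compile: pushing single-slot values, including the individual fields of a multi-slot struct. It assumes the standard library function `createEmptyTuple` and the `tuple.push` method shown above. The struct `Point` and the function `demo` are hypothetical.

```tolk
// hypothetical two-field struct: occupies 2 stack slots
struct Point {
    x: int
    y: int
}

// `demo` is a hypothetical caller, for illustration only
fun demo(p: Point) {
    var t = createEmptyTuple();
    t.push(1);   // T=int, a single slot
    t.push(p.x); // each field of `Point` is a single slot,
    t.push(p.y); // so the fields can be pushed one by one
}
```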

## Do not use `asm` for micro-optimizations

Introduce assembler functions only for rarely used TVM instructions that are not covered by stdlib. For example, when manually parsing Merkle proofs or calculating extended hashes.

However, attempting to micro-optimize with `asm` instead of writing straightforward code is not desired. The compiler is smart enough to generate optimal bytecode from consistent logic. For instance, it automatically inlines simple functions, so create one-liner methods without any worries about gas:

[HIGH] Marketing-style language about compiler behavior

The sentence “The compiler is smart enough to generate optimal bytecode from consistent logic.” uses subjective, marketing-style phrasing (“smart enough”, “optimal bytecode”) rather than a precise, factual description. The style guide explicitly discourages promotional language and vague superlatives in technical documentation, as they can be misleading about guarantees and obscure the concrete behaviors being documented (see https://github.com/ton-org/docs/blob/main/contribute/style-guide-extended.mdx?plain=1#L35-L40). In this context, the text should focus on the specific optimizations (such as inlining and store merging) that the compiler performs.

Suggested change
However, attempting to micro-optimize with `asm` instead of writing straightforward code is not desired. The compiler is smart enough to generate optimal bytecode from consistent logic. For instance, it automatically inlines simple functions, so create one-liner methods without any worries about gas:
However, attempting to micro-optimize with `asm` instead of writing straightforward code is not desired. The compiler generates efficient bytecode from consistent logic and automatically inlines simple functions, so one-liner methods remain gas-efficient:


```tolk
fun builder.storeFlags(mutate self, flags: int): self {
    return self.storeUint(flags, 32);
}
```

The function above is better than a "manually optimized" `asm` version using `32 STU`, because:

- it is inlined automatically
- for constant `flags`, it's merged with subsequent stores into `STSLICECONST`

See [compiler optimizations](/languages/tolk/features/compiler-optimizations).
7 changes: 7 additions & 0 deletions languages/tolk/features/compiler-optimizations.mdx
@@ -0,0 +1,7 @@
---
title: "Compiler optimizations"
---

<Stub
issue="1128"
/>
8 changes: 8 additions & 0 deletions languages/tolk/features/standard-library.mdx
@@ -0,0 +1,8 @@
---
title: "Standard library of Tolk"
sidebarTitle: "Standard library"
---

<Stub
issue="1128"
/>
7 changes: 7 additions & 0 deletions languages/tolk/syntax/mutability.mdx
@@ -0,0 +1,7 @@
---
title: "Mutability"
---

<Stub
issue="1128"
/>