Define bounded lists#640

Open
Felix-El wants to merge 4 commits into
WebAssembly:mainfrom
Felix-El:main

Conversation

@Felix-El


Add bounded lists (list<T, ..N>)

Closes #385.

@cpetig cpetig left a comment


mostly indentation doubts

Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated

@lukewagner lukewagner left a comment

Member

Thanks! This is looking generally good, a few comments:

Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated
Comment thread design/mvp/canonical-abi/definitions.py Outdated
def flatten_list(elem_type, maybe_length, maybe_variable, opts):
  if maybe_length is not None:
    if maybe_variable:
      return flatten_type(varint_type(maybe_length), opts) + flatten_type(elem_type, opts) * maybe_length

Member

This is pre-existing, but a question: for flattening bounded- or fixed-length lists: should we have some low (e.g., 4) per-value maximum flattening length, beyond which the list gets passed via pointer? I'm mostly thinking of the case where there is some list that has a large bound, and it gets used as a function parameter, and it "blows" the MAX_FLAT budget, causing all the parameters to go into the heap when instead you probably wanted the bounded-/fixed-length list to go in the heap.

@Felix-El Felix-El Apr 22, 2026

Author

From my perspective (as someone who wants to reduce or avoid allocations), this would make the data type harder to reason about. Fixed and bounded lists should be guaranteed to be "static".
TBH I'd rather say that some data types (like these special lists) should enforce* the use of a param or return area for passing, because lists are indexed dynamically, unlike tuples. Indirect addressing works only on memory locations, so compilers would otherwise have to compensate for such lists being split across registers, which is pretty terrible.

(*enforce: such lists should definitely be placed into the param/return area, while other types could, if possible, continue to be passed as args. But yeah, this obviously complicates the flattening logic.)

Member

Ah, so then could we say that fixed- and bounded-length lists never get flattened, and are always passed as pointer (and maybe length if fixed)?


Yes this is what I mean (except: length if bounded, that could still be in a register).
I will explore how that logic extension could look (in Python) in the next commit.

Author

Just pushed a proposal for hybrid lifting/lowering where register and memory passing can coexist.

Author

Flat-calling-convention treatment of fixed-length and bounded lists

Background

The canonical ABI flattens component-level values into Wasm core function
parameters. A MAX_FLAT cap (currently 16) limits how many flat slots a call may use before everything is
redirected through a param-area (a caller-allocated memory block whose
pointer is passed in r0).

The current implementation flattens fixed-length lists element-by-element
(list<T, N> → N flat slots) and bounded lists with a leading length register
(list<T, ..N> → 1 + N flat slots). This is wasteful in several ways.

A key property of fixed and bounded lists is that they can be passed without
additional allocations beyond the param area itself: because the maximum byte
count is statically known from the type, the caller can reserve space inline
rather than performing a separate heap allocation.
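A minimal sketch of why the reserved size is static. The helper names are hypothetical, and the rule for picking the length-prefix width (smallest of 1, 2 or 4 bytes that can hold the bound) is an assumption rather than something taken from definitions.py:

```python
# Hypothetical helpers; the prefix-width thresholds are an assumption.

def length_prefix_size(max_len: int) -> int:
    """Bytes needed for an unsigned length prefix that can hold max_len."""
    if max_len <= 0xFF:
        return 1
    if max_len <= 0xFFFF:
        return 2
    return 4

def align_to(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def fixed_list_byte_size(elem_size: int, length: int) -> int:
    """list<T, N>: N elements stored back to back."""
    return elem_size * length

def bounded_list_layout(elem_size: int, elem_align: int, max_len: int):
    """list<T, ..N>: return (offset of first element, total reserved bytes)."""
    prefix = length_prefix_size(max_len)
    elems_offset = align_to(prefix, elem_align)   # padding after the prefix
    alignment = max(prefix, elem_align)
    total = align_to(elems_offset + elem_size * max_len, alignment)
    return elems_offset, total

print(fixed_list_byte_size(4, 16))     # list<s32, 16>   -> 64
print(bounded_list_layout(4, 4, 10))   # list<u32, ..10> -> (4, 44)
```

Under these assumptions a bounded list of u32 with a one-byte prefix places its first element at offset 4, i.e. after 3 bytes of padding.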

Why the current strategy is problematic

Semantic mismatch

List element access is indexed by a runtime value. Even if all elements
arrive in registers, the callee must immediately spill them to memory before it
can iterate or index, because registers are not addressable. Tuples do not
have this problem — each field is a compile-time-known register slot.

Passing list elements in registers therefore adds round-trip spill cost with no
benefit.

Budget poisoning

When the flat slots for a list exceed MAX_FLAT, the ABI falls back to the
classic all-params-to-param-area strategy: all arguments, including simple
scalars that would fit comfortably in registers, are bundled into the param
area and only a single pointer is passed. The list causes unrelated arguments
to pay an indirection penalty.

Example: f(list<s32, 16>, s32) needs 17 slots (16 elements + 1 scalar). With
MAX_FLAT=16 the scalar ends up in memory alongside the list — even though
there was a free register waiting for it.
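The slot arithmetic of this example can be sketched as follows. The tuple encoding of parameters and the function names are hypothetical, and every element type is assumed to flatten to a single slot:

```python
MAX_FLAT = 16  # cap quoted in the Background section

def current_flat_slots(param):
    """Flat slots under the current strategy.

    param is a hypothetical tuple: ("scalar",), ("fixed", n) or ("bounded", n).
    """
    kind = param[0]
    if kind == "scalar":
        return 1
    if kind == "fixed":
        return param[1]          # one slot per element
    if kind == "bounded":
        return 1 + param[1]      # length slot plus one slot per element
    raise ValueError(f"unknown parameter kind: {kind}")

def current_convention(params):
    """Return (where the arguments travel, number of flat slots used)."""
    total = sum(current_flat_slots(p) for p in params)
    if total > MAX_FLAT:
        return "param-area", 1   # everything spills; only a pointer is passed
    return "registers", total

# f(list<s32, 16>, s32): 17 slots, so the scalar is dragged into memory too.
print(current_convention([("fixed", 16), ("scalar",)]))  # ('param-area', 1)
```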

Proposal

Allow list elements to occupy the param area without consuming flat-register slots.

  • Fixed list: 0 flat slots; list elements are stored in the param area
    (if no param area is required by other arguments, one is created and the
    elements occupy it starting at offset 0).
  • Bounded list: 0 flat slots; a varint length prefix followed by the list
    elements are stored in the param area (if no param area is required by other
    arguments, one is created, with the varint prefix at offset 0 followed by the
    elements).
  • After processing all typed arguments, if a param area is needed, one
    additional flat slot (param area address) is appended to the flat list.
  • If the total flat slots now exceed MAX_FLAT, fall back to the classic
    all-params-to-param-area strategy for backward compatibility.
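The four rules above can be sketched as a slot-counting function. This is only an illustration with hypothetical names and a hypothetical tuple encoding of parameters; it assumes every non-list argument is a scalar that takes one flat slot, and it is not the code pushed to the PR:

```python
MAX_FLAT = 16  # cap quoted in the Background section

def hybrid_convention(params):
    """Return (mode, flat slot count) under the proposed hybrid strategy.

    params holds hypothetical tuples: ("scalar",), ("fixed", n), ("bounded", n).
    """
    for p in params:
        if p[0] not in ("scalar", "fixed", "bounded"):
            raise ValueError(f"unknown parameter kind: {p[0]}")
    in_registers = [p for p in params if p[0] == "scalar"]
    in_param_area = [p for p in params if p[0] in ("fixed", "bounded")]
    # Lists take 0 flat slots; one extra slot carries the param area address.
    flat_slots = len(in_registers) + (1 if in_param_area else 0)
    if flat_slots > MAX_FLAT:
        return "classic", 1      # fall back: everything via the param area
    return ("hybrid" if in_param_area else "registers"), flat_slots

print(hybrid_convention([("fixed", 16), ("scalar",)]))            # ('hybrid', 2)
print(hybrid_convention([("fixed", 4)] + [("scalar",)] * 16))     # ('classic', 1)
```

The second call shows the cost case: sixteen scalars alone would fit, but the extra address slot pushes the total to 17 and triggers the fallback.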

Benefits: scalar arguments that fit within MAX_FLAT stay in registers even
when a list is present. Both fixed and bounded lists use the same in-memory
layout (elements for fixed, varint | elements for bounded) whether they
appear as top-level flat arguments or as elements of an outer fixed/bounded
list — the representation composes uniformly.
Cost: the extra param area address slot can tip the budget when many scalar arguments
accompany a list, at which point the classic fallback fires. Potentially we could exempt the param area pointer from the MAX_FLAT budget, as the result area pointer is.

Author

This is probably worth a separate discussion, but I wanted to show a more natural way to deal with in-place lists.


I shy away from the added complexity of mixed flattened and in-memory arguments. Up to now either all arguments are in registers or in memory.

Also, I just realized that for guest-imported functions a list<> doesn't have to be heap allocated, as it is passed by reference; the same holds for guest-exported results (a custom cabi_post can replace malloc/free with arenas). Only the two other cases require cabi_realloc (which could still be different from malloc): guest import results and guest export arguments. So this representation is only applicable to guest export arguments (the host calling into the guest).


Also, I don't want to derail this discussion with yet another wild idea, but I have been thinking about using a SIMD/vector encoding of fixed-length lists into registers, e.g. passing a [u8; 8] in a u64.

I think with SIMD and shift instructions this is efficient enough to access/process, and it helps reduce excessive register grabbing in the flattened case. My guess is that [u8; 8] to [u8; 64] would be a common enough structure element to make this worthwhile.
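A rough sketch of the packing idea for the [u8; 8] case, using plain shifts rather than SIMD. The function names are hypothetical and the little-endian element order is an assumption; nothing here is part of the PR:

```python
def pack_u8x8(elems):
    """Pack eight u8 values into one 64-bit integer, element 0 in the low byte."""
    if len(elems) != 8:
        raise ValueError("expected exactly 8 elements")
    # bytes() rejects values outside 0..255 with a ValueError.
    return int.from_bytes(bytes(elems), "little")

def get_elem(packed, index):
    """Read element `index` back out with a shift and a mask."""
    if not 0 <= index < 8:
        raise IndexError(index)
    return (packed >> (8 * index)) & 0xFF

packed = pack_u8x8([1, 2, 3, 4, 5, 6, 7, 8])
print(hex(packed))          # 0x807060504030201
print(get_elem(packed, 2))  # 3
```

Dynamic indexing stays possible because the shift amount can be a runtime value, at the price of a shift and a mask per access.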

Comment thread design/mvp/Binary.md
`none` case of an optional immediate.)
* 🔧 for fixed-sized lists the length of the list must be larger than 0 to pass
validation.
* 🔧 for fixed-sized lists (`0x67`) the length of the list must be larger than

Member

Pre-existing, but it looks like the grammar already covers this twice: once by using <u32> (unsigned) and once with the (if maxlen > 0). We could also remove the (if maxlen > 0). But should we specify a maximum for maxlen?

Author

For u32, zero is valid and the maxlen > 0 check forbids zero. What would be a legitimate upper bound... i32::MAX?

Member

There's already (just recently added) a MAX_LIST_BYTE_LENGTH = 2^28 - 1. That's just bytes, but even still, it seems like a reasonable upper bound.
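One way such a bound could be expressed, as a sketch only. It takes the constant to be 2^28 - 1 (an assumption here), the function names are hypothetical, and whether the limit should apply to the element count or to the byte size is left open in this thread, so both variants are shown:

```python
MAX_LIST_BYTE_LENGTH = 2**28 - 1  # assumed value of the constant

def maxlen_in_range(maxlen: int) -> bool:
    """Variant 1: reuse the constant directly as a bound on the element count."""
    return 0 < maxlen <= MAX_LIST_BYTE_LENGTH

def max_bytes_in_range(maxlen: int, elem_size: int) -> bool:
    """Variant 2: bound the largest possible byte size of the elements."""
    return 0 < maxlen and maxlen * elem_size <= MAX_LIST_BYTE_LENGTH

print(maxlen_in_range(0))                 # False: zero is rejected
print(max_bytes_in_range(2**27, 4))       # False: 2^29 bytes exceeds the limit
```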

Author

OK, I defined that bound in Binary.md, but where would we check this in definitions.py?

Felix-El and others added 4 commits June 12, 2026 12:45
Cherry-picked the essence of cpetig's commit d2874eb
from https://github.com/cpetig/component-model/tree/bounded-lists, adapted to
the current codebase (ptr_type/opts threading, updated class names).

Bounded strings are intentionally excluded.

Co-authored-by: Christof Petig <christof.petig@arcor.de>
- Add trap_if(actual_len > maybe_length) to lift_flat_list, mirroring
  the existing trap in load_list's heap path
- Add over-length trap tests for both flat and heap lifting
- Add alignment test for bounded list of U32 (verifies 3-byte padding
  after U8 length prefix)
- fix memory bounds checking
- improve integration with existing list load/store code
- fix indentation
- avoid default argument
- more readable load/store recipe
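
The over-length trap from the commit notes above can be illustrated roughly like this. It is a simplified stand-in, not the lift_flat_list from definitions.py: the Trap class, the helper names and the flat layout (one length value followed by max_len element slots, each element one flat value) are assumptions:

```python
class Trap(Exception):
    """Stand-in for a Wasm trap."""

def trap_if(condition: bool) -> None:
    if condition:
        raise Trap()

def lift_flat_bounded_list(flat_values, max_len):
    """Lift list<T, ..max_len> from [actual_len, e0, ..., e(max_len-1)]."""
    actual_len = flat_values[0]
    trap_if(actual_len > max_len)             # reject over-length lists
    return list(flat_values[1:1 + actual_len])  # unused slots are ignored

print(lift_flat_bounded_list([2, 10, 20, 0, 0], 4))  # [10, 20]
```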


Development

Successfully merging this pull request may close these issues.

Bounded lists and strings

4 participants