[Runtime] Migrate from high-level iree_runtime API to low-level VM/HAL API #160

Open
sjain-stanford wants to merge 22 commits into main from sambhav/runtime_low_level_api_refactor

sjain-stanford (Member) commented Feb 14, 2026

Summary

Fixes #82.
Fixes #15.

Replace the high-level iree_runtime_* wrapper layer with direct iree_vm_* and iree_hal_* APIs for precise lifetime control and device management.

Key type mapping

| Removed (high-level) | Replaced with (low-level) |
|---|---|
| iree_runtime_instance_t | iree_vm_instance_t + global driver registry |
| iree_runtime_session_t | iree_vm_context_t with explicit HAL/bytecode module registration |
| iree_runtime_call_t | iree_vm_list_t + iree_vm_invoke |

Major changes

  • VM/HAL API migration (runtime.h, backend.h, handle.h, graph.h): Replace all iree_runtime_* types and calls with their low-level equivalents across instance creation, device creation, context setup, bytecode loading, and function invocation (addresses #82, "[Core] Move away from highlevel iree_runtime_* API").
  • HAL driver registration (runtime.h): Extract driver registration into registerHalDriversOnce() using std::call_once, since the global driver registry persists across VM instance lifetimes and does not support re-registration.
  • Function caching (runtime.h, graph.h): Cache resolved iree_vm_function_t in Graph::vmFunction_ during createVmContext() to avoid repeated lookups on each execute() call (addresses #15, "[Perf] Memoize runtime call initialization and reuse calls between Graph::execute invocations").
  • Input list capacity pre-computation (runtime.h, graph.h): Pre-compute vmInputListCapacity_ during createVmContext() to avoid recomputing on every execute() call.
  • Remove output list (runtime.h): Pass nullptr for outputs in iree_vm_invoke, since the compiled functions return void and write their results in place to buffer views passed as inputs.
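
The function-caching item above can be illustrated without IREE. This is only a minimal sketch: FakeFunction, FakeContext, lookupFunction, CachingGraphSketch, and the name "module.main" are hypothetical stand-ins, not part of this PR or of the IREE API. The point is that the lookup happens once in createVmContext() and execute() reuses the stored handle.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a resolved VM function handle.
struct FakeFunction {
  int ordinal = 0;
};

// Hypothetical stand-in for a VM context. It counts lookups so the effect of
// caching is observable.
struct FakeContext {
  int lookupCount = 0;
  FakeFunction lookupFunction(const std::string &name) {
    ++lookupCount;
    return FakeFunction{static_cast<int>(name.size())};
  }
};

class CachingGraphSketch {
public:
  // Resolve the entry point once, at context-creation time.
  void createVmContext() {
    vmFunction_ = context_.lookupFunction("module.main");
  }

  // Reuse the cached handle; no lookup happens on this path.
  int execute() const {
    assert(vmFunction_.has_value() && "createVmContext() must run first");
    return vmFunction_->ordinal;
  }

  int lookupCount() const { return context_.lookupCount; }

private:
  FakeContext context_;
  std::optional<FakeFunction> vmFunction_;
};
```

Calling execute() any number of times after a single createVmContext() leaves lookupCount() at 1 in this sketch.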

Bug fixes

  • RAII cleanup for VM input list (runtime.h): Wrap iree_vm_list_t in IreeVmListUniquePtrType to prevent resource leaks on early-return error paths in execute().
  • Early VM context ownership (runtime.h): Assign vmContext_ immediately after iree_vm_context_create so failures in HAL module creation or bytecode loading don't leak the raw context.
  • Release HAL driver after AMDGPU device creation (runtime.h): createAMDGPUDevice() was not releasing the driver after device creation, leaking it on both success and failure paths. The device internally retains its own reference, so the caller's reference must be released separately (mirrors the pattern in createCPUDevice()).
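
The first two fixes above share one idea: hand the raw handle to a std::unique_ptr with a custom deleter immediately after creation, so every early return releases it. A minimal sketch follows, assuming a hypothetical C-style handle (FakeList, fakeListCreate, fakeListRelease) in place of the real iree_vm_list_t and its release call.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a C-style handle such as a VM list.
struct FakeList {
  int id;
};

// Counts live handles so leaks are observable in this sketch.
inline int &liveLists() {
  static int count = 0;
  return count;
}

inline FakeList *fakeListCreate() {
  ++liveLists();
  return new FakeList{0};
}

inline void fakeListRelease(FakeList *list) {
  --liveLists();
  delete list;
}

// Custom deleter that forwards to the C-style release function.
struct FakeListDeleter {
  void operator()(FakeList *list) const { fakeListRelease(list); }
};
using FakeListUniquePtr = std::unique_ptr<FakeList, FakeListDeleter>;

// Returns false on the simulated error path. Ownership is taken immediately
// after creation, so the early return cannot leak the handle.
inline bool executeSketch(bool failMidway) {
  FakeListUniquePtr list(fakeListCreate());
  assert(list != nullptr);
  if (failMidway)
    return false; // deleter runs as `list` goes out of scope
  return true;    // deleter runs here as well
}
```

After either executeSketch(true) or executeSketch(false), liveLists() is back to 0.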

Minor changes

  • New RAII types (backend.h): Add IreeVmContextDeleter, IreeVmListDeleter and corresponding unique_ptr aliases.
  • Include updates (backend.h, buffer.h, handle.h, runtime.h, test files): Replace <iree/runtime/api.h> with granular <iree/hal/api.h>, <iree/vm/api.h>, <iree/modules/hal/module.h>, etc.
  • Redundant initializer cleanup (tensor_attributes.h, graph.h): Remove = std::nullopt on std::optional members that default-construct to empty.
  • Comment marker normalization (tensor_attributes.h, graph_import.h, fusilli_plugin.cpp, test_fusilli_plugin_api.cpp, runtime.h): Normalize "// C++ 20" to "// C++20" across the codebase.
  • Comment and documentation updates (handle.h, graph.h, runtime.h, test_graph.cpp): Update comments to reflect VM/HAL terminology.
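
For the redundant-initializer cleanup above, a small sketch shows why dropping = std::nullopt does not change behavior. AttrsSketch and its member names are hypothetical and are not taken from tensor_attributes.h or graph.h.

```cpp
#include <cassert>
#include <optional>

// Hypothetical attribute struct; the member names are illustrative only.
struct AttrsSketch {
  std::optional<int> withInitializer = std::nullopt; // redundant initializer
  std::optional<int> withoutInitializer;             // same empty state
};

// Both members start out empty, with or without the explicit initializer.
inline bool bothStartEmpty() {
  AttrsSketch attrs;
  assert(!attrs.withInitializer.has_value());
  return !attrs.withInitializer.has_value() &&
         !attrs.withoutInitializer.has_value();
}
```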

No changes to CMakeLists.txt, samples, benchmarks, or public API.

🤖 Generated with Claude Code

sjain-stanford and others added 2 commits February 14, 2026 20:46
…L API (#82)

Replace the iree_runtime_* wrapper layer (instance, session, call) with
direct iree_vm_* and iree_hal_* APIs for precise lifetime control and
device management:

- iree_runtime_instance_t -> iree_vm_instance_t + global driver registry
- iree_runtime_session_t -> iree_vm_context_t with explicit HAL/bytecode
  module registration
- iree_runtime_call_t -> iree_vm_list_t + iree_vm_invoke

Cache the resolved iree_vm_function_t in Graph during context creation
to avoid repeated lookups on each execute() call.

Use std::call_once for HAL driver registration since the global driver
registry persists across VM instance lifetimes and does not support
re-registration.
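
A minimal sketch of the std::call_once pattern described here, assuming a hypothetical FakeRegistry that treats a second registration as an error. The real registerHalDriversOnce() would call into IREE's driver registration instead; only the function name is taken from the PR description.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a process-wide registry that rejects duplicate
// registration, as the global HAL driver registry is described as doing.
struct FakeRegistry {
  int registrations = 0;
  bool registerAll() {
    if (registrations > 0)
      return false; // re-registration is treated as an error
    ++registrations;
    return true;
  }
};

inline FakeRegistry &globalRegistry() {
  static FakeRegistry registry;
  return registry;
}

// Safe to call from every instance constructor and from multiple threads:
// the lambda body runs at most once per process.
inline void registerHalDriversOnce() {
  static std::once_flag flag;
  std::call_once(flag, [] {
    bool ok = globalRegistry().registerAll();
    assert(ok && "first registration should succeed");
    (void)ok; // avoid an unused-variable warning when NDEBUG is defined
  });
}
```

Because the once_flag has static storage, it outlives any individual VM instance, which matches the lifetime of a process-wide registry.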

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
# Conflicts:
#	include/fusilli/backend/runtime.h
#	include/fusilli/graph/graph.h
sjain-stanford and others added 19 commits February 17, 2026 03:01
iree_vm_invoke accepts NULL outputs when the function has void return.
Fusilli's compiled functions write results in-place to buffer views
passed as inputs, so no output marshaling is needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Replace unused iree/base/status.h with iree/base/allocator.h and
iree/base/config.h to directly provide iree_allocator_system and
iree_device_size_t.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Already transitively included via iree/modules/hal/module.h.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Moves the input count computation from execute() to createVmContext()
since the capacity is deterministic after compilation. Also fixes a
missing +1 for the workspace buffer slot.
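
As a rough sketch of this change, the capacity can be computed once and stored. The slot layout below (one slot per output buffer view, one per input buffer view, plus one workspace slot) is an assumption for illustration; the commit message only confirms the extra workspace slot. GraphShapeSketch, computeVmInputListCapacity, and CapacityGraphSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumed layout (not confirmed by the PR text): one slot per output buffer
// view, one per input buffer view, plus one for the workspace buffer.
struct GraphShapeSketch {
  std::size_t numOutputs;
  std::size_t numInputs;
};

inline std::size_t computeVmInputListCapacity(const GraphShapeSketch &shape) {
  constexpr std::size_t kWorkspaceSlots = 1; // the "+1" from the commit message
  return shape.numOutputs + shape.numInputs + kWorkspaceSlots;
}

class CapacityGraphSketch {
public:
  explicit CapacityGraphSketch(GraphShapeSketch shape) : shape_(shape) {}

  // Compute once, when the context is created.
  void createVmContext() {
    vmInputListCapacity_ = computeVmInputListCapacity(shape_);
    contextReady_ = true;
  }

  // Reuse the stored value on every call instead of recomputing it.
  std::size_t execute() const {
    assert(contextReady_ && "createVmContext() must run first");
    return vmInputListCapacity_;
  }

private:
  GraphShapeSketch shape_;
  std::size_t vmInputListCapacity_ = 0;
  bool contextReady_ = false;
};
```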

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Adds vm prefix to disambiguate Graph's IREE VM members from
other context_/function_ members in the codebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Unifies the ref ownership pattern across all VM input list pushes
to explicitly retain then move, matching the workspace buffer style.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Wrap the iree_vm_list_t in a unique_ptr with a custom deleter to
prevent resource leaks on early-return error paths. Follows the
existing IreeVmContextDeleter / IreeHalDeviceDeleter pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Move vmContext_ assignment immediately after iree_vm_context_create
so the unique_ptr owns cleanup from the start. Previously, failures
in HAL module creation or bytecode loading would leak the raw context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
std::optional default-constructs to empty, making the explicit
= std::nullopt unnecessary on member variable declarations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
createAMDGPUDevice() was not releasing the driver after
iree_hal_driver_create_device_by_id, leaking it on both success
and failure paths. The device internally retains its own reference
to the driver, so the caller's reference must be released separately.
Mirrors the pattern already used in createCPUDevice().
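
The ownership rule in this fix can be sketched with plain reference counts. FakeDriver, FakeDevice, and the fake* helpers are hypothetical stand-ins for the HAL driver and device objects; the sketch tracks counts only and never frees the driver, so the counts stay inspectable.

```cpp
#include <cassert>

// Hypothetical ref-counted stand-in for a HAL driver.
struct FakeDriver {
  int refCount = 1; // the creator holds the first reference
};

inline void fakeDriverRetain(FakeDriver *driver) { ++driver->refCount; }

inline void fakeDriverRelease(FakeDriver *driver) {
  assert(driver->refCount > 0 && "release without a matching retain");
  --driver->refCount;
}

// Hypothetical stand-in for a HAL device that keeps its driver alive.
struct FakeDevice {
  FakeDriver *driver;
};

// Device creation retains the driver for the device's own use.
inline FakeDevice *fakeCreateDevice(FakeDriver *driver, bool fail) {
  if (fail)
    return nullptr;
  fakeDriverRetain(driver);
  return new FakeDevice{driver};
}

inline void fakeDeviceDestroy(FakeDevice *device) {
  fakeDriverRelease(device->driver);
  delete device;
}

// Mirrors the fixed flow: the caller's driver reference is released on both
// the success path and the failure path.
inline FakeDevice *createDeviceSketch(FakeDriver *driver, bool fail) {
  FakeDevice *device = fakeCreateDevice(driver, fail);
  fakeDriverRelease(driver); // caller's reference, always released
  return device;
}
```

On success the driver's count drops to 1 (held by the device) and reaches 0 when the device is destroyed; on failure it reaches 0 immediately.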

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>
sjain-stanford changed the title from "WIP: Migrate from high-level iree_runtime API to low-level VM/HAL API" to "[Runtime] Migrate from high-level iree_runtime API to low-level VM/HAL API" on Feb 19, 2026
sjain-stanford marked this pull request as ready for review February 19, 2026 04:59
Successfully merging this pull request may close these issues:

  • [Core] Move away from highlevel iree_runtime_* API
  • [Perf] Memoize runtime call initialization and reuse calls between Graph::execute invocations