Add: L3/L2 host-device mapped region design#861
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the design document for HostDeviceMappedRegion, a low-level parent-to-child/NPU communication primitive in Simpler. The review feedback suggests improving the public ABI structure HostDeviceMappedRegionInfo by making implicit padding fields explicit to prevent potential security vulnerabilities and FFI errors, as well as updating the corresponding static assertions to validate these offsets.
| typedef struct { | ||
| uint64_t host_data_ptr; | ||
| uint64_t device_data_ptr; | ||
| uint64_t data_bytes; | ||
| uint64_t host_signal_ptr; | ||
| uint64_t device_signal_ptr; | ||
| uint32_t signal_count; | ||
| uint64_t total_bytes; | ||
| uint32_t flags; | ||
| } HostDeviceMappedRegionInfo; |
There was a problem hiding this comment.
The HostDeviceMappedRegionInfo structure contains implicit padding between signal_count (offset 40) and total_bytes (offset 48), as well as trailing padding after flags (offset 56) to align the structure to an 8-byte boundary (64 bytes total). Implicit padding in public ABI structures can lead to security vulnerabilities (such as uninitialized memory/information leakage) when copied or serialized directly across process boundaries. It can also make cross-language bindings (e.g., Python ctypes or Rust FFI) more error-prone. Adding explicit reserved fields ensures that padding is visible, documented, and can be explicitly zero-initialized.
| typedef struct { | |
| uint64_t host_data_ptr; | |
| uint64_t device_data_ptr; | |
| uint64_t data_bytes; | |
| uint64_t host_signal_ptr; | |
| uint64_t device_signal_ptr; | |
| uint32_t signal_count; | |
| uint64_t total_bytes; | |
| uint32_t flags; | |
| } HostDeviceMappedRegionInfo; | |
| typedef struct { | |
| uint64_t host_data_ptr; | |
| uint64_t device_data_ptr; | |
| uint64_t data_bytes; | |
| uint64_t host_signal_ptr; | |
| uint64_t device_signal_ptr; | |
| uint32_t signal_count; | |
| uint32_t reserved0; | |
| uint64_t total_bytes; | |
| uint32_t flags; | |
| uint32_t reserved1; | |
| } HostDeviceMappedRegionInfo; |
| static_assert(offsetof(HostDeviceMappedRegionInfo, host_data_ptr) == 0); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, device_data_ptr) == 8); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, data_bytes) == 16); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, host_signal_ptr) == 24); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, device_signal_ptr) == 32); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, signal_count) == 40); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, total_bytes) == 48); | ||
| static_assert(offsetof(HostDeviceMappedRegionInfo, flags) == 56); | ||
| static_assert(sizeof(HostDeviceMappedRegionInfo) == 64); |
There was a problem hiding this comment.
Update the static assertions for HostDeviceMappedRegionInfo to explicitly validate the offsets of the newly added reserved0 and reserved1 fields. This ensures the structure layout remains exactly 64 bytes with the expected alignment and no implicit padding.
| static_assert(offsetof(HostDeviceMappedRegionInfo, host_data_ptr) == 0); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, device_data_ptr) == 8); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, data_bytes) == 16); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, host_signal_ptr) == 24); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, device_signal_ptr) == 32); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, signal_count) == 40); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, total_bytes) == 48); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, flags) == 56); | |
| static_assert(sizeof(HostDeviceMappedRegionInfo) == 64); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, host_data_ptr) == 0); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, device_data_ptr) == 8); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, data_bytes) == 16); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, host_signal_ptr) == 24); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, device_signal_ptr) == 32); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, signal_count) == 40); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, reserved0) == 44); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, total_bytes) == 48); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, flags) == 56); | |
| static_assert(offsetof(HostDeviceMappedRegionInfo, reserved1) == 60); | |
| static_assert(sizeof(HostDeviceMappedRegionInfo) == 64); |
Summary
HostDeviceMappedRegionprimitive.Backend Notes
halHostRegisterso the owner process can initialize and access the region through a host mapping while kernels receive device-visible pointers.-ENOTSUPstubs rather than missing ABI symbols.