Add GPU CTE and GPU Infinite Memory (UVM demand paging) docs #9
Merged
lukemartinlogan merged 5 commits into main on Mar 9, 2026
Conversation
Document how CUDA/ROCm kernels interact with the Chimaera runtime, including task definition, client API, host-side setup, configuration parameters, and troubleshooting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move base-modules section after the new GPU clients page to maintain logical sidebar ordering. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update all queue field names to current API (cpu2gpu_queue, gpu2cu_queue, gpu2gpu_queue) throughout examples and narrative
- Update ClientConnectTask handshake table to new field names
- Add gpu_heap_backend (GpuMalloc, 9000+gpu_id) and gpu2gpu device-memory backend (3000+gpu_id) to server backend table and memory layout diagrams
- Replace single-ArenaAllocator description with dual-allocator architecture: ArenaAllocator (HSHM_DEFAULT_ALLOC_GPU_T, primary bump-pointer) + BuddyAllocator (CHI_GPU_HEAP_T, serialization heap with individual free)
- Document CHIMAERA_GPU_ORCHESTRATOR_INIT(gpu_info, num_blocks) macro for multi-block client kernels; CHI_CLIENT_GPU_INIT alias
- Add GetClientGpuInfo(gpu_id) to host-side setup; fills all IpcManagerGpuInfo fields automatically for same-process kernel launches
- Add Performance section: ~200 µs BuddyAllocator vs ~400 µs device malloc, corrected latency measurement explanation, arena-reset semantics
- Update server phase-1 init sequence for new backends and queue layout
- Update GPU memory backend size guidance for primary arena vs heap backends

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
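The dual-allocator split above (a bump-pointer arena for fast transient allocation plus a buddy heap that supports individual free) follows a common design. As an illustrative sketch only, not the Chimaera ArenaAllocator implementation, a bump-pointer arena looks like this; note that it can only reclaim memory wholesale via Reset(), which is exactly why a second heap is needed for serialization buffers that must be freed individually:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bump-pointer arena (hypothetical; not the Chimaera API).
// Alloc() is a pointer increment; there is no per-allocation free,
// only a wholesale Reset() that reclaims everything at once.
class BumpArena {
 public:
  explicit BumpArena(size_t capacity) : buf_(capacity), off_(0) {}

  void* Alloc(size_t n, size_t align = 8) {
    size_t p = (off_ + align - 1) & ~(align - 1);  // align offset up
    if (p + n > buf_.size()) return nullptr;       // arena exhausted
    off_ = p + n;
    return buf_.data() + p;
  }

  void Reset() { off_ = 0; }          // frees all allocations at once
  size_t Used() const { return off_; }

 private:
  std::vector<uint8_t> buf_;
  size_t off_;
};
```

The trade-off mirrored here is speed versus flexibility: the arena's allocation path is branch-light and lock-free per thread, while anything that outlives an arena reset must come from a heap with real free support.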
gpu-cte.md: documents GPU kernel integration with CTE: CPU fork-client setup, three-backend memory layout, IpcManagerGpuInfo, kernel-side AsyncPutBlob/GetBlob, routing table, and stop-flag polling pattern.

gpu-inf-mem.md: documents the GpuShmMmap UVM backend: why UVM is chosen over pinned memory (no HostNativeAtomicSupported required), shm_init API, memory layout, kernel passing, IPC manager registration, per-block allocator construction, destruction rules, and comparison table against MallocBackend / PosixShmMmap / GpuMalloc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace placeholder UVM backend doc with accurate documentation of the GpuVirtualMemoryManager class in context-transfer-engine/uvm. Covers:

- GpuVmmConfig fields and defaults
- init/destroy lifecycle
- touchPage / touchRange / touchPageAsync demand page-in
- evictPage / evictPageAsync page-out to host RAM or CTE blob store
- prefetch_window auto-prefetch
- state queries (isMapped, isEvictedToHost, getMappedPageCount, ...)
- separate transfer/compute stream model
- CTE blob store backing option
- full end-to-end example
- CMake integration and hardware requirements

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
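As a rough conceptual sketch in plain C++ (hypothetical names borrowed from the doc's touchPage/evictPage vocabulary; not the real GpuVirtualMemoryManager API), demand page-in with a prefetch window and eviction to a host-side store can be modeled as a page table whose entries move between mapped and evicted states:

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Toy demand-paging model (illustrative only). Pages start unmapped;
// touching a page maps it plus a trailing prefetch window; evicting
// a page unmaps it and records a host-side copy, which a later touch
// brings back.
class ToyVmm {
 public:
  ToyVmm(size_t num_pages, size_t prefetch_window)
      : mapped_(num_pages, false), prefetch_window_(prefetch_window) {}

  void touchPage(size_t p) {
    // Map the touched page and up to prefetch_window_ pages after it.
    for (size_t i = p; i < mapped_.size() && i <= p + prefetch_window_; ++i) {
      mapped_[i] = true;
      evicted_.erase(i);  // a touch restores an evicted page
    }
  }

  void evictPage(size_t p) {
    if (mapped_[p]) {
      mapped_[p] = false;
      evicted_.insert({p, 0});  // stand-in for the host-RAM copy
    }
  }

  bool isMapped(size_t p) const { return mapped_[p]; }
  bool isEvictedToHost(size_t p) const { return evicted_.count(p) != 0; }
  size_t getMappedPageCount() const {
    size_t n = 0;
    for (bool m : mapped_) n += (m ? 1 : 0);
    return n;
  }

 private:
  std::vector<bool> mapped_;
  std::unordered_map<size_t, int> evicted_;  // page -> host copy
  size_t prefetch_window_;
};
```

The real class additionally splits transfers onto a dedicated stream and can spill evicted pages to the CTE blob store instead of host RAM; this sketch only captures the state machine.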
Summary

- gpu-cte.md: documents GPU kernel integration with CTE, including IpcManagerGpuInfo, kernel-side AsyncPutBlob/AsyncGetBlob, the routing table (ToLocalCpu/Local/LocalGpuBcast), and the stop-flag polling pattern.
- gpu-inf-mem.md: documents the wrp_cte_uvm demand-paging module (GpuVirtualMemoryManager): GpuVmmConfig, touchPage/touchRange/evictPage, prefetch window, CTE blob store backing, separate transfer/compute streams, state queries, full example, CMake integration, and hardware requirements.

Test plan

- gpu-inf-mem.md accurately reflects context-transfer-engine/uvm/include/wrp_cte/uvm/gpu_vmm.h

🤖 Generated with Claude Code
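The stop-flag polling pattern mentioned in the summary (a long-running worker loops until the host flips a shared flag) can be sketched in host-side C++; this is an illustrative analog only, since the documented GPU version would instead poll a flag placed in host-visible mapped memory from inside a persistent kernel:

```cpp
#include <atomic>
#include <thread>

// Host-side analog of stop-flag polling (illustrative; function name
// is hypothetical). A worker thread spins doing "work" until the
// controller raises the stop flag, then exits cleanly.
long RunUntilStopped(long min_iters) {
  std::atomic<bool> stop{false};
  std::atomic<long> iterations{0};

  // Worker: poll the stop flag between units of work.
  std::thread worker([&] {
    while (!stop.load(std::memory_order_acquire)) {
      iterations.fetch_add(1, std::memory_order_relaxed);  // "work"
    }
  });

  // Controller: wait for some progress, then signal shutdown.
  while (iterations.load(std::memory_order_relaxed) < min_iters) {
  }
  stop.store(true, std::memory_order_release);
  worker.join();
  return iterations.load();
}
```

The acquire/release pairing matters in the GPU case too: the kernel must observe the flag write made by the host, which is why the flag lives in memory visible to both sides (UVM or pinned host memory) rather than in plain device memory.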