fix(gpu): filter CPU-emulated adapters and dedupe per physical GPU #62

Open
adamtpang wants to merge 1 commit into Quantus-Network:main from adamtpang:fix/gpu-adapter-filter

@adamtpang

Summary

On Windows, wgpu enumerates each physical GPU once per backend (Vulkan + DX12) and additionally exposes a CPU-emulated software fallback (Microsoft Basic Render Driver). The current GpuEngine::init builds a worker context for every enumerated entry, which on a typical laptop (NVIDIA dGPU + AMD iGPU) yields 5 contexts and OOMs at chunk allocation. Repro and full benchmark logs are in #61.

This PR filters and deduplicates adapters before context construction:

  1. Drop adapters whose device_type == DeviceType::Cpu — software fallbacks are never useful for PoW mining.
  2. Dedupe by (vendor, device) physical-adapter key, keeping the entry with the highest-priority backend (Vulkan/Metal first, then DX12, then GL, then BrowserWebGpu).
  3. Sort the surviving list discrete → integrated → virtual so enumeration order is stable across runs.

Skipped and dropped adapters are logged at info level so users can see what's happening on a multi-GPU box.
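
For reviewers who want the three steps in one place, here is a minimal, dependency-free sketch of the filter, dedupe, and sort logic. It is not the code in this PR: the enums and the AdapterInfo struct are local stand-ins that mirror the wgpu names, the helper device_type_rank and the (vendor, device) tie-break in the final sort are assumptions added for illustration, and logging is omitted.

```rust
use std::collections::HashMap;

// Local stand-ins mirroring wgpu::DeviceType, wgpu::Backend and a subset of
// wgpu::AdapterInfo, so the sketch compiles without the wgpu crate.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DeviceType { Other, IntegratedGpu, DiscreteGpu, VirtualGpu, Cpu }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend { Vulkan, Metal, Dx12, Gl, BrowserWebGpu }

#[derive(Clone, Debug)]
pub struct AdapterInfo {
    pub name: String,
    pub vendor: u32,
    pub device: u32,
    pub device_type: DeviceType,
    pub backend: Backend,
}

/// Lower value means a more preferred backend.
/// The wildcard arm covers BrowserWebGpu and any variant added later.
pub fn backend_priority(backend: Backend) -> u8 {
    match backend {
        Backend::Vulkan | Backend::Metal => 0,
        Backend::Dx12 => 1,
        Backend::Gl => 2,
        _ => 3,
    }
}

/// Hypothetical helper: sort rank for discrete, then integrated, then virtual.
pub fn device_type_rank(device_type: DeviceType) -> u8 {
    match device_type {
        DeviceType::DiscreteGpu => 0,
        DeviceType::IntegratedGpu => 1,
        DeviceType::VirtualGpu => 2,
        _ => 3,
    }
}

pub fn filter_and_dedupe_adapters(adapters: Vec<AdapterInfo>) -> Vec<AdapterInfo> {
    let mut best: HashMap<(u32, u32), AdapterInfo> = HashMap::new();
    for adapter in adapters {
        // Step 1: drop CPU-emulated software fallbacks.
        if adapter.device_type == DeviceType::Cpu {
            continue;
        }
        // Step 2: keep one entry per (vendor, device), preferring the
        // backend with the lowest priority value.
        let key = (adapter.vendor, adapter.device);
        let replace = match best.get(&key) {
            Some(current) => {
                backend_priority(adapter.backend) < backend_priority(current.backend)
            }
            None => true,
        };
        if replace {
            best.insert(key, adapter);
        }
    }
    // Step 3: sort by device type. The (vendor, device) tie-break is an
    // assumption that keeps the order deterministic despite HashMap iteration.
    let mut out: Vec<AdapterInfo> = best.into_values().collect();
    out.sort_by_key(|a| (device_type_rank(a.device_type), a.vendor, a.device));
    out
}
```

Fed the five entries described in the Summary (two GPUs seen through both Vulkan and DX12, plus the Microsoft Basic Render Driver), this sketch keeps two adapters, both on Vulkan, with the discrete GPU first. One caveat about the key: two identical cards share the same (vendor, device) pair and would collapse into one entry under this scheme.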

What's not in this PR

I deliberately left out a --gpu-adapter <substring> selector flag for explicit per-machine targeting. That's a useful follow-up but adds CLI surface and crosses crate boundaries; this PR aims to fix the OOM with zero new flags so it can land surgically.

Tested by reading; flagging for build verification

I do not have a Rust toolchain on this machine and have not run cargo build or cargo test locally. The change is confined to a single file (crates/engine-gpu/src/lib.rs), uses no new dependencies, and adds no public API surface: the signatures of GpuEngine::try_new and device_count() are unchanged for callers, although device_count() now counts unique physical adapters. Both match expressions end in a _ => fallback arm, so they stay forward-compatible if wgpu::Backend or wgpu::DeviceType gain new variants.

I'd appreciate a maintainer running cargo check -p engine-gpu and the existing GPU benches to confirm — and I'm happy to iterate on naming/log levels/style.

How to repro the original OOM

On Windows 10/11 with one integrated and one discrete GPU:

quantus-miner.exe benchmark

Default auto-detect picks all 5 wgpu entries, spawns workers on each, and panics with wgpu error: Out of Memory during chunk allocation. After this PR, the same command should select 2 contexts (one per physical GPU) on the same machine.

Test plan

  • cargo check -p engine-gpu
  • cargo clippy -p engine-gpu
  • cargo bench -p engine-gpu on a multi-GPU Windows machine; expect 1 context per physical GPU rather than per backend, and no OOM
  • Linux single-Vulkan-GPU sanity: confirm device_count() is unchanged (1)

Closes #61

On Windows wgpu enumerates each physical GPU once per backend (Vulkan
plus DX12) and additionally exposes a CPU-emulated software fallback
("Microsoft Basic Render Driver"). The current init path spawns a
worker context for every enumerated entry, which on a typical laptop
(NVIDIA dGPU + AMD iGPU) yields 5 contexts:

  Device 0: AMD Radeon Graphics    | Vulkan | IntegratedGpu
  Device 1: NVIDIA RTX 3070        | Vulkan | DiscreteGpu
  Device 2: AMD Radeon Graphics    | Dx12   | IntegratedGpu
  Device 3: NVIDIA RTX 3070        | Dx12   | DiscreteGpu
  Device 4: Microsoft Basic Render | Dx12   | Cpu

Spawning a chunk on every entry drives VRAM contention and OOMs the
process during benchmark/serve startup.

This commit adds two surgical filters before context construction:

1. Drop adapters whose `device_type == DeviceType::Cpu` — software
   fallbacks are never useful for PoW mining.
2. Dedupe per `(vendor, device)` physical-adapter key, keeping the
   highest-priority backend (Vulkan or Metal first, then DX12).

The filtered list is then sorted discrete > integrated > virtual so
device enumeration is stable across runs. Skipped/dropped adapters
are logged at info level for diagnosability.

No public API change; `GpuEngine::try_new` still takes only a batch
size, and `device_count()` now reflects unique physical adapters.

Closes Quantus-Network#61
adamtpang pushed a commit to adamtpang/quantus.com that referenced this pull request on May 16, 2026
PR Quantus-Network#62 already filters Cpu adapters, dedupes each physical GPU across
backends, and sorts discrete-first. It still *returns* the integrated
GPU, so the worker pool round-robins onto it on hybrid laptops.

This commit drops the parallel structure from the previous commit and
restructures the change as a minimal follow-up that layers on Quantus-Network#62:

  * prefer_discrete_adapters(): runs after Quantus-Network#62's filter_and_dedupe_adapters
    and drops integrated/virtual/other adapters whenever any discrete GPU
    is present (falls back to keeping all if there is no discrete GPU, so
    integrated-only machines still mine).
  * discrete_preference_indices(): pure policy split out for unit testing
    with synthetic AdapterInfo; 4 tests cover hybrid-laptop, dual-discrete,
    integrated-only, and discrete+integrated+virtual.
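
The discrete-preference policy described above can be sketched as a pure function over device types. This is an illustrative reconstruction, not the code from the referenced commit: the name discrete_preference_indices comes from the commit message, but its signature (a slice of device types in, surviving indices out) and the local DeviceType stand-in for wgpu::DeviceType are assumptions.

```rust
// Local stand-in mirroring wgpu::DeviceType so the sketch has no dependencies.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DeviceType { Other, IntegratedGpu, DiscreteGpu, VirtualGpu, Cpu }

/// Returns the indices of adapters to keep.
/// Assumed signature: the real function may take AdapterInfo values instead.
pub fn discrete_preference_indices(types: &[DeviceType]) -> Vec<usize> {
    let discrete: Vec<usize> = types
        .iter()
        .enumerate()
        .filter(|(_, t)| **t == DeviceType::DiscreteGpu)
        .map(|(i, _)| i)
        .collect();

    if discrete.is_empty() {
        // No discrete GPU: keep everything so integrated-only machines still mine.
        (0..types.len()).collect()
    } else {
        // At least one discrete GPU: drop integrated, virtual, and other adapters.
        discrete
    }
}
```

Keeping the policy free of wgpu handles is what makes it testable with synthetic inputs: a hybrid laptop (discrete plus integrated) keeps only the discrete index, a dual-discrete machine keeps both, and an integrated-only machine keeps its single adapter.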

backend_priority/filter_and_dedupe_adapters are reconstructed here as a
clearly-marked STAND-IN block so the branch compiles and tests run off
main; that block is deleted when rebasing onto the merged Quantus-Network#62.

The --gpu-devices N cap (breaking try_new signature change) is
intentionally NOT included here; it is a separate follow-up so Quantus-Network#62 stays
a tight single-purpose fix. try_new keeps its original signature, so
miner-service / benches / example are reverted to their main state.

https://claude.ai/code/session_01M8TsvfAHfST4D8zTDevK4x

Development

Successfully merging this pull request may close these issues.

benchmark/serve OOM under wgpu when auto-detect picks all adapters on multi-GPU Windows

1 participant