benchmark/serve OOM under wgpu when auto-detect picks all adapters on multi-GPU Windows

# `benchmark` and `serve` OOM under wgpu when auto-detect picks all adapters on multi-GPU Windows

## Environment

- **Miner version**: `miner-cli 2.1.2+0a07bb2d`
- **OS**: Windows 11 Home, 10.0.26200, x64
- **Hardware**:
  - AMD Radeon Graphics (integrated, shares system RAM)
  - NVIDIA GeForce RTX 3070 Laptop GPU (discrete)
- **CPU**: 16 cores

## Symptom

Default `benchmark` (no flags) auto-detects **5 GPU "devices"** because wgpu enumerates each physical GPU once per backend (Vulkan + DX12) plus a `Microsoft Basic Render Driver` software fallback:

```
[engine_gpu] GPU engine initialized with 5 devices
[engine_gpu] 📊 GPU Device 0: AMD Radeon(TM) Graphics    | Backend: Vulkan | IntegratedGpu
[engine_gpu] 📊 GPU Device 1: NVIDIA GeForce RTX 3070 LP | Backend: Vulkan | DiscreteGpu
[engine_gpu] 📊 GPU Device 2: AMD Radeon(TM) Graphics    | Backend: Dx12   | IntegratedGpu
[engine_gpu] 📊 GPU Device 3: NVIDIA GeForce RTX 3070 LP | Backend: Dx12   | DiscreteGpu
[engine_gpu] 📊 GPU Device 4: Microsoft Basic Render Driver | Backend: Dx12 | Cpu
[miner_service] Auto-detected 5 GPU device(s). Using all available GPUs.
```

The miner spawns a worker on each, then panics during chunk allocation:

```
thread '<unnamed>' (24268) panicked at wgpu-27.0.1\src\backend\wgpu_core.rs:1570:18:
wgpu error: Out of Memory
```

The panic kills the whole process. `serve` fails the same way.

## Root cause (likely)

Three problems compound:

1. **Backend duplication**: each physical GPU appears twice (Vulkan + DX12), so workers on both backends compete for the same VRAM.
2. **Software adapter included**: the `Microsoft Basic Render Driver` (CPU-emulated) is enumerated as a GPU and given the same chunk size as a real GPU.
3. **Integrated GPU sharing system RAM**: the AMD iGPU draws from the same 16 GB system pool as everything else (browser, IDE, etc.), so the default chunk size (100M hashes per the `serve --chunk-size` help text) can exhaust it quickly.

## Suggested fixes

1. **Filter out `DeviceType::Cpu` adapters by default.** A CPU-emulated software fallback is never useful for PoW mining and confuses auto-detect.
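2. **Deduplicate by physical adapter**, keeping one backend per device. Vulkan is usually the right pick for NVIDIA on Linux and Windows; on macOS, Metal is effectively the only option. A dedupe keyed on `(vendor_id, device_id)` would cover this machine, though it would also merge two identical cards in a multi-GPU rig, so the key may need an extra discriminator.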
3. **Lower default chunk size when integrated GPU is selected**, or skip integrated GPUs by default and let users opt in via flag.
4. **Document `MINER_GPU_DEVICES` semantics**: currently it's a count, not an explicit selector. On a heterogeneous machine, "use 1 GPU" picks the first enumerated adapter, which is usually the integrated one (Device 0 in the log above). A flag like `--gpu-adapter <name-substring>` or `--gpu-device-index <i>` would let users target the discrete card directly.
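
To make fixes 1 and 2 concrete, here is a minimal sketch of the filtering policy I have in mind. It does not touch the miner's real code: `AdapterDesc`, `select_adapters`, `backend_rank`, and `adapters_from_log` are hypothetical names, the types are local stand-ins for the fields wgpu reports per adapter, and the device IDs are placeholders. The Vulkan-over-DX12 preference is an assumed policy, not something the maintainers have confirmed.

```rust
use std::collections::HashMap;

// Local stand-ins for the per-adapter fields wgpu reports (hypothetical types,
// not wgpu's own). Other backends would be added to the enum and the ranking.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Backend {
    Vulkan,
    Dx12,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DeviceType {
    IntegratedGpu,
    DiscreteGpu,
    Cpu,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct AdapterDesc {
    name: String,
    vendor: u32,
    device: u32,
    device_type: DeviceType,
    backend: Backend,
}

/// Lower rank means preferred. Assumed policy: Vulkan before DX12.
fn backend_rank(backend: Backend) -> u8 {
    match backend {
        Backend::Vulkan => 0,
        Backend::Dx12 => 1,
    }
}

/// Drops software (`Cpu`) adapters, then keeps one entry per (vendor, device)
/// pair, choosing the preferred backend. First-seen order is preserved.
fn select_adapters(all: &[AdapterDesc]) -> Vec<AdapterDesc> {
    let mut out: Vec<AdapterDesc> = Vec::new();
    let mut seen: HashMap<(u32, u32), usize> = HashMap::new();
    for adapter in all.iter().filter(|a| a.device_type != DeviceType::Cpu) {
        let key = (adapter.vendor, adapter.device);
        match seen.get(&key).copied() {
            // Same physical device already kept: swap in the better backend.
            Some(i) => {
                if backend_rank(adapter.backend) < backend_rank(out[i].backend) {
                    out[i] = adapter.clone();
                }
            }
            None => {
                seen.insert(key, out.len());
                out.push(adapter.clone());
            }
        }
    }
    out
}

/// The five adapters from the log above. Vendor IDs are the PCI vendor IDs for
/// AMD, NVIDIA, and Microsoft; device IDs are placeholders, not real values.
fn adapters_from_log() -> Vec<AdapterDesc> {
    let mk = |name: &str, vendor: u32, device: u32, device_type: DeviceType, backend: Backend| {
        AdapterDesc { name: name.to_string(), vendor, device, device_type, backend }
    };
    vec![
        mk("AMD Radeon(TM) Graphics", 0x1002, 0x0001, DeviceType::IntegratedGpu, Backend::Vulkan),
        mk("NVIDIA GeForce RTX 3070 LP", 0x10DE, 0x0002, DeviceType::DiscreteGpu, Backend::Vulkan),
        mk("AMD Radeon(TM) Graphics", 0x1002, 0x0001, DeviceType::IntegratedGpu, Backend::Dx12),
        mk("NVIDIA GeForce RTX 3070 LP", 0x10DE, 0x0002, DeviceType::DiscreteGpu, Backend::Dx12),
        mk("Microsoft Basic Render Driver", 0x1414, 0x0003, DeviceType::Cpu, Backend::Dx12),
    ]
}
```

Fed the five adapters from the log, `select_adapters(&adapters_from_log())` returns two entries: the AMD iGPU and the RTX 3070, both on Vulkan. Because the key is only `(vendor, device)`, two identical cards in one machine would collapse into one entry, so a real implementation would need an additional discriminator.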

## Workarounds attempted

- `WGPU_BACKEND=vulkan ... --gpu-devices 1`: produced no output at all; the process appeared to hang or fail silently before printing any progress line. It did not OOM, but it did not work either.
- `--cpu-workers 16 --gpu-devices 0` (CPU-only): works fine, sustaining **360 KH/s** on 16 cores, which confirms the OOM is specific to the GPU path.

## Why this matters

The OOM affects a very common Windows laptop hardware profile (one integrated plus one discrete GPU). Right now a user with a strong NVIDIA card cannot run the GPU benchmark without manual tuning, and silent failures like the Vulkan-only attempt above leave them with no signal at all.

## Happy to PR

If a maintainer can confirm the desired filtering policy (drop CPU-emulated, prefer one backend per physical device), I can put up a PR adding the filter + a `--gpu-adapter` selector and tests.
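
As a starting point for discussion, the selector could behave roughly like the sketch below. Everything here is an assumption: `pick_by_name` and `pick_default` are hypothetical helpers, `AdapterDesc` is a local stand-in type, and "prefer the discrete GPU when no selector is given" is one possible default, not an agreed policy.

```rust
// Local stand-in types for illustration (hypothetical, not wgpu's own).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DeviceType {
    IntegratedGpu,
    DiscreteGpu,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct AdapterDesc {
    name: String,
    device_type: DeviceType,
}

/// Proposed `--gpu-adapter <name-substring>` semantics: return the first
/// adapter whose name contains the substring, ignoring case.
fn pick_by_name<'a>(adapters: &'a [AdapterDesc], needle: &str) -> Option<&'a AdapterDesc> {
    let needle = needle.to_lowercase();
    adapters
        .iter()
        .find(|a| a.name.to_lowercase().contains(needle.as_str()))
}

/// Possible default when no selector is given: prefer a discrete GPU,
/// otherwise fall back to the first enumerated adapter.
fn pick_default(adapters: &[AdapterDesc]) -> Option<&AdapterDesc> {
    adapters
        .iter()
        .find(|a| a.device_type == DeviceType::DiscreteGpu)
        .or_else(|| adapters.first())
}
```

With the two physical adapters from this machine, `pick_by_name(&adapters, "nvidia")` would resolve to the RTX 3070, and `pick_default` would choose it over the integrated AMD GPU even though the AMD adapter is enumerated first.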

