Confirmed on quantus-miner v3.0.1 in an NVIDIA RTX 50-series (Blackwell) environment on Ubuntu Linux.

Environment
- quantus-miner: v3.0.1
- quantus-node: v0.6.3
- quantus-cli: v1.3.3
- GPU generation: NVIDIA RTX 50-series (Blackwell)
- Driver: NVIDIA 590.48.01
- Backend: Vulkan
- OS: Ubuntu 24.04 LTS
- Node state: fully synced, peers=12, isSyncing=false

Setup
- External miner connected to the node via QUIC on 127.0.0.1:9833
- Node started with:
  - --validator
  - --chain planck
  - --miner-listen-port 9833
  - --prometheus-port 9616
  - --rewards-inner-hash <set>
- Miner started with:
  - serve
  - --node-addr 127.0.0.1:9833
  - --gpu-devices 2
  - --cpu-workers 0
  - --metrics-port 9900

Node side looks healthy
- chain_height increases normally
- isSyncing=false
- peers=12
- block_time_ema around 7-10 seconds
- last_block_duration around 2-5 seconds
- Over 30 seconds, chain_height advanced by 2 blocks while I watched.
This does not appear to be a sync or connectivity issue.

What I observed on the miner side
- The miner connects successfully to the node over QUIC.
- Mining jobs are received continuously (hundreds of jobs over tens of minutes).
- Jobs are dispatched to GPU workers for every detected GPU.
- All expected GPUs are detected.
- The miner process is visible as "quantus-miner" in nvidia-smi pmon on every GPU.

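However, effective GPU mining performance is extremely low: the built-in benchmark (below) reports around 5-6 MH/s per GPU, against the roughly 500-1000 MH/s range given in the official guide.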

Prometheus metrics on the miner are minimal. Only these appear:
- miner_active_jobs 1
- miner_cpu_workers 0
- miner_effective_cpus 12
- miner_gpu_devices (number of GPUs)
- miner_workers (number of workers)

Metrics that I would expect but do NOT see:
- miner_gpu_hash_rate
- miner_hash_rate
- miner_hashes_total
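
A minimal sketch of how this check could be automated on the operator side. It takes the text of the miner's metrics endpoint (already fetched, for example from the --metrics-port 9900 listener; the exact URL path is an assumption) and reports which of the expected metric names are absent. The helper names are hypothetical and not part of quantus-miner.

```python
# Metric names taken from this issue: the ones I expected but did not see.
EXPECTED_METRICS = ("miner_gpu_hash_rate", "miner_hash_rate", "miner_hashes_total")


def exported_metric_names(metrics_text):
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "name value" or "name{labels} value".
        token = line.split()[0]
        names.add(token.split("{", 1)[0])
    return names


def missing_metrics(metrics_text, expected=EXPECTED_METRICS):
    """Return the expected metric names that are not exported."""
    present = exported_metric_names(metrics_text)
    return [name for name in expected if name not in present]


# Sample built from the metrics listed above (values as reported in this issue).
sample = """\
miner_active_jobs 1
miner_cpu_workers 0
miner_effective_cpus 12
"""
print(missing_metrics(sample))
```

With the metrics shown in this report, all three expected names come back as missing.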

GPU utilization observed via nvidia-smi
- GPU compute utilization is consistently around 0%.
- Power draw stays very low (single-digit to low double-digit watts).
- Core clock stays in the low hundreds of MHz.
- Memory usage is small.
- GPUs never spin up to a real mining load, even though jobs are being received very frequently.
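
For anyone who wants to reproduce this observation, here is a rough sketch that parses the CSV output of `nvidia-smi --query-gpu=index,utilization.gpu,power.draw,clocks.sm,memory.used --format=csv,noheader,nounits` and flags GPUs that look idle. The thresholds (5% utilization, 30 W) are my own assumptions, not values from the miner or the guide, and the sample rows are illustrative rather than captured output.

```python
import csv
import io

# Order of columns requested from nvidia-smi in the command above.
QUERY_FIELDS = ("index", "utilization.gpu", "power.draw", "clocks.sm", "memory.used")


def parse_gpu_rows(csv_text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for raw in reader:
        if len(raw) != len(QUERY_FIELDS):
            continue  # ignore blank or malformed lines
        try:
            rows.append({
                "index": int(raw[0]),
                "util_pct": float(raw[1]),
                "power_w": float(raw[2]),
                "sm_mhz": float(raw[3]),
                "mem_mib": float(raw[4]),
            })
        except ValueError:
            continue  # e.g. a field reported as "[N/A]"
    return rows


def looks_idle(row, max_util_pct=5.0, max_power_w=30.0):
    """Assumed heuristic: low utilization and low power draw mean no real load."""
    return row["util_pct"] <= max_util_pct and row["power_w"] <= max_power_w


# Illustrative rows only: GPU 0 idle, GPU 1 under load.
sample = "0, 0, 11.50, 210, 420\n1, 98, 310.00, 2700, 9000\n"
idle = [r["index"] for r in parse_gpu_rows(sample) if looks_idle(r)]
print(idle)
```

In my case every GPU would fall into the idle bucket while jobs are being received.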

Relevant miner log excerpts
- QUIC connection established to 127.0.0.1:9833
- Connected to node
- Waiting for mining jobs
- Received job: id=...
- Received job: id=...
- Received job: id=...
- Job dispatched to GPU workers
- Worker thread assigned to GPU device 0
- Worker thread assigned to GPU device 1
- GPU dispatch config:
  - 3276 workgroups × 256 threads

So the miner is clearly talking to the node and receiving jobs. The problem appears to be on the GPU dispatch side.

Benchmark result
I also ran the built-in GPU benchmark separately:
./quantus-miner benchmark --gpu-devices 2 --duration 30

Detected adapters included the NVIDIA RTX 50-series GPUs and a CPU fallback adapter (llvmpipe). Benchmark output consistently shows:

- Max hardware workgroups: 65535
- Optimal workgroups: 3276
- Per-GPU throughput: around 5-6 MH/s
- Total throughput: around 12-13 MH/s

This is far below the GPU mining range described in the official guide (roughly 500-1000 MH/s).

Why this looks like Issue #52
Issue #52 explains that RTX 50-series cards may fall through to the generic fallback dispatch path because there is no explicit "rtx 50" match in the vendor-specific dispatch logic.

The reported values
- Max hardware workgroups: 65535
- Optimal workgroups: 3276

exactly match the generic fallback branch:
- 65535 / 20 = 3276 (integer division; the exact quotient is 3276.75)

So this RTX 50-series setup appears to be using the generic fallback dispatch path rather than an RTX 40-class dispatch. The fallback's smaller dispatch size would cap effective performance well below the hardware capability.
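
The arithmetic can be checked directly. This is only a sketch of the calculation as I understand it from Issue #52 (hardware maximum divided by 20, truncated); I have not confirmed that the miner computes it in exactly this form.

```python
# Values reported by the benchmark and the miner log in this issue.
MAX_HW_WORKGROUPS = 65535
THREADS_PER_WORKGROUP = 256

# Assumption: the generic fallback divides the hardware maximum by 20
# and truncates, as described in Issue #52.
FALLBACK_DIVISOR = 20

fallback_workgroups = MAX_HW_WORKGROUPS // FALLBACK_DIVISOR
threads_per_dispatch = fallback_workgroups * THREADS_PER_WORKGROUP

print(fallback_workgroups)   # 3276
print(threads_per_dispatch)  # 838656
```

The result matches both the "Optimal workgroups: 3276" benchmark line and the "3276 workgroups × 256 threads" dispatch config in the miner log.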

This would also be consistent with the other symptoms:
- jobs are received at a very high rate,
- GPU utilization never really climbs,
- miner_gpu_hash_rate / miner_hashes_total do not appear at all.

In each case the likely cause is the same: the miner is doing small dispatches and exhausting each range quickly.

Impact
- The miner looks "connected and mining" from logs, but actual GPU throughput is dramatically lower than expected.
- There is no Prometheus warning or metric that clearly signals this state.
- From the operator side, this is very hard to diagnose without running the built-in benchmark.

Suggested fix
Please add an explicit "rtx 50" branch before the existing "rtx 40" branch, or replace the current substring matching with a more future-proof GPU generation parser.

For example, logic equivalent to:
- if the adapter name indicates RTX 50-series, use the same dispatch class as RTX 40
- otherwise keep the existing behavior
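
To illustrate the "generation parser" option, here is a minimal sketch in Python (not the miner's own language or code). The function names and dispatch class labels are hypothetical, and it assumes adapter names of the form "NVIDIA GeForce RTX 5090". Whether generations newer than RTX 40 should reuse the RTX 40 dispatch class is also an assumption that would need benchmarking.

```python
import re
from typing import Optional

# Hypothetical labels; the real dispatch classes in quantus-miner may differ.
RTX_40_CLASS = "rtx40_class"
GENERIC_FALLBACK = "generic_fallback"

# "GeForce RTX" followed by a 4-digit model number, e.g. "GeForce RTX 5090".
# Requiring "GeForce" avoids misreading workstation names such as
# "Quadro RTX 5000" (an older architecture) as a 50-series card.
_GEFORCE_RTX = re.compile(r"geforce\s+rtx\s*(\d{2})\d{2}", re.IGNORECASE)


def rtx_generation(adapter_name: str) -> Optional[int]:
    """Return the GeForce RTX generation (40, 50, ...) or None if not matched."""
    match = _GEFORCE_RTX.search(adapter_name)
    return int(match.group(1)) if match else None


def dispatch_class(adapter_name: str) -> str:
    """Treat RTX 40 and newer as RTX 40 class; everything else falls back."""
    generation = rtx_generation(adapter_name)
    if generation is not None and generation >= 40:
        return RTX_40_CLASS
    # Placeholder for "keep the existing behavior" on other adapters.
    return GENERIC_FALLBACK


print(dispatch_class("NVIDIA GeForce RTX 5090"))
print(dispatch_class("llvmpipe"))
```

Compared with adding one more substring check, parsing the generation number means a future series would not silently land on the generic path, although it would still need its own tuning.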

Additional improvements that would help operators
- Emit a warning when an unknown GPU generation falls back to the generic dispatch path.
- Export proper Prometheus metrics for actual GPU hashrate and total hashes during external mining, not only during benchmark.
- Optionally log when dispatched jobs are completed far faster than expected, as a hint that dispatch is under-sized.

If it helps, I can provide:
- more detailed miner debug logs,
- more detailed benchmark output,
- full node Prometheus metrics,
- nvidia-smi query/pmon output,
- node system_health output.

Happy to help test a candidate fix on RTX 50-series hardware once it is available.


Confirmed on quantus-miner v3.0.1 in an NVIDIA RTX 50-series (Blackwell) environment on Ubuntu Linux. #55
