blockdevice: add SysBlockDeviceRotational for targeted rotational reads#824

Merged
SuperQ merged 1 commit into prometheus:master from ruthwikkakumani:blockdevice-add-rotational-method on Jun 18, 2026

@ruthwikkakumani ruthwikkakumani commented Jun 17, 2026

What

Add FS.SysBlockDeviceRotational(device string) (uint64, error) — a targeted
single-file reader for /sys/block/<device>/queue/rotational.

Why

Required by node_exporter's diskstats collector to fix a 10× scrape-latency
regression on systems with many block devices
(prometheus/node_exporter#3282).

The previous implementation called SysBlockDeviceQueueStats() which reads
~30 sysfs files per device to populate a full BlockQueueStats struct,
consuming only the Rotational field. On systems with 1,000+ devices this
caused 30,000+ unnecessary file reads per scrape.

SysBlockDeviceRotational reduces this to exactly 1 read per device.
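
The read counts above can be checked with a few lines of arithmetic. This is only an illustrative sketch: `readsPerScrape` is a hypothetical helper, and 30 is the approximate per-device file count quoted in the description, not a measured value.

```go
package main

import "fmt"

// readsPerScrape returns how many sysfs files are read in one scrape when
// every device costs filesPerDevice reads. Hypothetical helper for illustration.
func readsPerScrape(devices, filesPerDevice int) int {
	return devices * filesPerDevice
}

func main() {
	const devices = 1000

	before := readsPerScrape(devices, 30) // approximate cost of the full queue stats
	after := readsPerScrape(devices, 1)   // only queue/rotational

	fmt.Printf("before=%d after=%d saved=%d\n", before, after, before-after)
	// prints: before=30000 after=1000 saved=29000
}
```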

How

Follows the same single-file read pattern as SysBlockDeviceSize() using
util.ReadUintFromFile. Returns 1 for rotational (HDD), 0 for
non-rotational (SSD/NVMe).
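
For readers unfamiliar with the single-file read pattern, here is a minimal standalone sketch of the idea. It is not the procfs implementation: it uses only the Go standard library instead of `util.ReadUintFromFile`, and the names `rotationalPath`, `parseRotational`, and `readRotational` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// rotationalPath builds <sysRoot>/block/<device>/queue/rotational.
func rotationalPath(sysRoot, device string) string {
	return filepath.Join(sysRoot, "block", device, "queue", "rotational")
}

// parseRotational converts the file content (for example "1\n") to a uint64.
func parseRotational(raw string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(raw), 10, 64)
}

// readRotational performs exactly one file read for the given device and
// returns an error if the file is missing or does not hold an unsigned integer.
func readRotational(sysRoot, device string) (uint64, error) {
	data, err := os.ReadFile(rotationalPath(sysRoot, device))
	if err != nil {
		return 0, err
	}
	return parseRotational(string(data))
}

func main() {
	// Build a throwaway sysfs-like tree so the sketch runs on any machine.
	root, err := os.MkdirTemp("", "sysfs")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "block", "sda", "queue")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(dir, "rotational"), []byte("1\n"), 0o644); err != nil {
		panic(err)
	}

	v, err := readRotational(root, "sda")
	fmt.Println("sda:", v, err)

	// A device without a queue/rotational file yields an error.
	_, err = readRotational(root, "nvme0n1")
	fmt.Println("nvme0n1 error:", err != nil)
}
```

Passing the sysfs root as a parameter keeps the reader testable against a fixture directory instead of the live `/sys`.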

A unit test (TestSysBlockDeviceRotational) is included, using the existing
sda fixture (rotational=1) and verifying error behaviour for devices without
a queue/rotational file.

Related

@ruthwikkakumani ruthwikkakumani force-pushed the blockdevice-add-rotational-method branch from 13c41b6 to a9b3d58 Compare June 17, 2026 17:43
ruthwikkakumani added a commit to ruthwikkakumani/node_exporter that referenced this pull request Jun 17, 2026
Replace fs.SysBlockDeviceRotational() (which required a replace directive
pointing to a personal procfs fork) with an inline sysFilePath()-based
read of /sys/block/<dev>/queue/rotational.

This makes the PR self-contained against the official prometheus/procfs
v0.20.1 and unblocks CI. The 1-file-per-device I/O reduction (vs the
previous ~30-file SysBlockDeviceQueueStats call) is preserved, fixing
the 10× scrape regression on systems with many block devices (prometheus#3282).

Once prometheus/procfs#824 lands, a follow-up can switch to
blockdevice.FS.SysBlockDeviceRotational() per SuperQ's suggestion.

Signed-off-by: Ruthwik Kakumani <ruthwikkakumani08@gmail.com>
ruthwikkakumani added a commit to ruthwikkakumani/node_exporter that referenced this pull request Jun 17, 2026
…ometheus#3282)

Fix a 10× scrape-latency regression introduced in prometheus#3022 on systems with
many block devices.

Previously, Update() called SysBlockDeviceQueueStats(dev) for every
block device per scrape, reading ~30 sysfs files per device. On systems
with 1,000+ devices this resulted in 30,000+ unnecessary file reads per
scrape, causing timeouts.

Replace the expensive struct call with a targeted rotationalLabel()
helper that delegates to blockdevice.FS.SysBlockDeviceRotational(dev)
(added in prometheus/procfs#824). This reads exactly 1 sysfs file per
device:
  /sys/block/<device>/queue/rotational

Returns "1" for rotational (HDD) and "0" for non-rotational (SSD/NVMe),
failing closed to "0" on any error — matching the zero-initialised-struct
behaviour of the old code.
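
The fail-closed behaviour described above could look roughly like the following sketch. The `rotationalReader` interface and the in-memory `fakeFS` are hypothetical stand-ins so the example runs without procfs; the real helper's signature in node_exporter may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// rotationalReader is a hypothetical interface covering the one method the
// helper needs; blockdevice.FS is assumed to provide a method of this shape.
type rotationalReader interface {
	SysBlockDeviceRotational(device string) (uint64, error)
}

// fakeFS is an in-memory reader used only for this illustration.
type fakeFS map[string]uint64

func (f fakeFS) SysBlockDeviceRotational(device string) (uint64, error) {
	v, ok := f[device]
	if !ok {
		return 0, errors.New("no queue/rotational file for " + device)
	}
	return v, nil
}

// rotationalLabel returns the label value for a device, falling back to "0"
// on any read error so a missing file never fails the scrape.
func rotationalLabel(r rotationalReader, device string) string {
	v, err := r.SysBlockDeviceRotational(device)
	if err != nil {
		return "0"
	}
	return strconv.FormatUint(v, 10)
}

func main() {
	fs := fakeFS{"sda": 1, "nvme0n1": 0}
	for _, dev := range []string{"sda", "nvme0n1", "loop0"} {
		fmt.Printf("%s rotational=%s\n", dev, rotationalLabel(fs, dev))
	}
}
```

Note that with this design a missing file and a genuinely non-rotational device produce the same label, which is the trade-off the commit message accepts.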

Also add:
- TestRotationalLabel: dynamic t.TempDir()-based tests via blockdevice.FS,
  covering HDD, SSD, and missing-file cases.
- BenchmarkDiskstatsUpdate: measures the full Update() call to catch
  future per-device sysfs I/O regressions before merge.
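
To give a feel for what a per-device I/O benchmark measures, the sketch below compares reading 30 small files against reading 1, using `testing.Benchmark` from a plain program. It is not `BenchmarkDiskstatsUpdate`: `makeFiles` and `readFiles` are hypothetical helpers, and ordinary temporary files are assumed as a stand-in for sysfs attributes, so absolute timings will not match a real system.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"testing"
)

// makeFiles creates n small files in a temporary directory and returns their
// paths together with a cleanup function that removes the directory.
func makeFiles(n int) ([]string, func(), error) {
	dir, err := os.MkdirTemp("", "queue")
	if err != nil {
		return nil, nil, err
	}
	cleanup := func() { os.RemoveAll(dir) }

	paths := make([]string, 0, n)
	for i := 0; i < n; i++ {
		p := filepath.Join(dir, "attr"+strconv.Itoa(i))
		if err := os.WriteFile(p, []byte("1\n"), 0o644); err != nil {
			cleanup()
			return nil, nil, err
		}
		paths = append(paths, p)
	}
	return paths, cleanup, nil
}

// readFiles reads every path once and returns how many reads succeeded.
func readFiles(paths []string) (int, error) {
	for i, p := range paths {
		if _, err := os.ReadFile(p); err != nil {
			return i, err
		}
	}
	return len(paths), nil
}

func main() {
	paths, cleanup, err := makeFiles(30)
	if err != nil {
		panic(err)
	}
	defer cleanup()

	// bench times one simulated device lookup that reads the given files.
	bench := func(files []string) testing.BenchmarkResult {
		return testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				if _, err := readFiles(files); err != nil {
					b.Fatal(err)
				}
			}
		})
	}

	fmt.Println("30 files per device:", bench(paths))
	fmt.Println(" 1 file per device: ", bench(paths[:1]))
}
```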

Fixes prometheus#3282

Signed-off-by: Ruthwik Kakumani <ruthwikkakumani08@gmail.com>
Add FS.SysBlockDeviceRotational(device string) (uint64, error), a
lightweight counterpart to SysBlockDeviceQueueStats that reads only
/sys/block/<device>/queue/rotational.

This is needed by node_exporter's diskstats collector to fix a 10×
scrape-latency regression on systems with many block devices
(prometheus/node_exporter#3282). The previous implementation called
SysBlockDeviceQueueStats which reads ~30 sysfs files per device.
SysBlockDeviceRotational reduces that to exactly 1 read per device.

Returns 1 for rotational (HDD), 0 for non-rotational (SSD/NVMe).
Follows the same single-file read pattern as SysBlockDeviceSize.

Signed-off-by: ruthwikkakumani <ruthwikkakumani@users.noreply.github.com>
@ruthwikkakumani ruthwikkakumani force-pushed the blockdevice-add-rotational-method branch from a9b3d58 to 6f28a71 Compare June 18, 2026 08:49
@ruthwikkakumani

@SuperQ thanks for the approval! Could you also approve the CI workflow runs so the checks can complete? They're currently blocked on "2 workflows awaiting approval". Thank you!

@SuperQ SuperQ merged commit 66a9e6e into prometheus:master Jun 18, 2026
7 checks passed