diff --git a/docs/server/configuration/storage-configuration.mdx b/docs/server/configuration/storage-configuration.mdx
index ae98ad8818..f812ea8f89 100644
--- a/docs/server/configuration/storage-configuration.mdx
+++ b/docs/server/configuration/storage-configuration.mdx
@@ -168,3 +168,23 @@ Enables memory prefetching mechanism if OS supports it.
+## Storage.ReadAheadKbAlertThresholdInKb
+
+On Linux, RavenDB raises a startup warning when any block device's `read_ahead_kb` is above this threshold. A high `read_ahead_kb` amplifies I/O for the random-access database workload - see [System Configuration Recommendations](../../start/installation/system-configuration-recommendations.mdx) for guidance on tuning it. Set to `null` to disable the check. Has no effect on Windows or macOS.
+
+- **Type**: `int?`
+- **Default**: `128`
+- **Scope**: Server-wide only
+
+
+
+## Storage.UseSequentialReadAheadHintForJournalRecovery
+
+On Linux, hints the kernel (`posix_fadvise` with `POSIX_FADV_SEQUENTIAL`) to read journal files sequentially while a database loads, so a low `read_ahead_kb` tuned for the random-access workload causes a smaller startup slowdown. Set to `false` to opt out. Has no effect on Windows or macOS.
+
+- **Type**: `bool`
+- **Default**: `true` on Linux, `false` on other platforms
+- **Scope**: Server-wide or per database
+
+
+
diff --git a/docs/start/installation/system-configuration-recommendations.mdx b/docs/start/installation/system-configuration-recommendations.mdx
index a6da0dfe79..673c956473 100644
--- a/docs/start/installation/system-configuration-recommendations.mdx
+++ b/docs/start/installation/system-configuration-recommendations.mdx
@@ -14,7 +14,7 @@ import LanguageContent from "@site/src/components/LanguageContent";
 # Installation: System Configuration Recommendations
-## Linux - Ubuntu 16.04
+## Linux
 RavenDB uses the resources of the machine it is running on, limited to the configuration limitation.
 In order to benefit from higher resources usage, consider the following setup:
@@ -81,6 +81,57 @@ For details on current swapping partitions and priorities use:
 `}
+### Tune block-device settings for random-access workloads (`read_ahead_kb`, `scheduler`, `rotational`)
+
+`read_ahead_kb` controls how much data the kernel prefetches on each read. RavenDB's typical workload is random-access over 8 KB Voron pages, so a high `read_ahead_kb` amplifies I/O - small random reads get fused into large ones, wasting disk bandwidth and page cache and raising latency under load (most visible with many databases or large datasets). Lowering it keeps each read close to what RavenDB actually requested, which is what this random-access workload usually wants.
+
+The trade-off is database startup: when a database loads, RavenDB reads its journal files sequentially, and a low `read_ahead_kb` slows that read down. RavenDB mitigates this internally - it hints the kernel to read the journals sequentially during load (controlled by [`Storage.UseSequentialReadAheadHintForJournalRecovery`](../../server/configuration/storage-configuration.mdx#storageusesequentialreadaheadhintforjournalrecovery)) - so you can keep `read_ahead_kb` low without a large startup penalty.
+
+For random-access SSD/NVMe workloads, values in the 8-64 KB range are worth experimenting with, versus the common 128 KB default.
+
+Check the current value (`` is e.g. `sda` or `nvme0n1`):
+
+
+
+{`cat /sys/block//queue/read_ahead_kb
+`}
+
+
+
+Set it at runtime (requires root, resets on reboot):
+
+
+
+{`echo 32 > /sys/block//queue/read_ahead_kb
+`}
+
+
+
+RavenDB captures the device's read-ahead value when it opens its files, so restart the server (or reload the databases) for a runtime change to take effect on already-open storage.
+
+Also check `rotational` and `scheduler`. They matter most on virtualized disks (Azure managed disks in particular), which are often misreported as `rotational` even when backed by SSD/NVMe, so the kernel picks HDD-oriented defaults. Bare-metal SSD/NVMe is usually detected correctly. For SSD/NVMe, `rotational=0` and `scheduler=none` are usually better.
+
+Persist these across reboots with a udev rule:
+
+
+
+{`# /etc/udev/rules.d/99-ravendb.rules
+ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}="0"
+ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="none"
+ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/read_ahead_kb}="32"
+ACTION=="add|change", KERNEL=="nvme*", ATTR{queue/read_ahead_kb}="32"
+
+# apply without reboot:
+# sudo udevadm control --reload-rules && sudo udevadm trigger
+`}
+
+
+
+This rule matches every `sd*` and `nvme*` device on the host. If other workloads share the machine, scope it to the disks RavenDB actually uses (find the device behind a path with `df ` or `lsblk`) so you don't retune unrelated disks.
+
+
+The common 128 KB default works well for the majority of deployments - it is what RavenDB instances typically run with - and a lower value is worth pursuing only when heavy random-access load (many databases or large datasets on one host) shows up as high I/O wait or memory pressure. There is no universally correct `read_ahead_kb`. It depends on your disk, workload, and available memory, so test candidate values in your own environment and verify their effect on request/response times under real load before committing to one. RavenDB raises a startup warning when a block device's `read_ahead_kb` is above [`Storage.ReadAheadKbAlertThresholdInKb`](../../server/configuration/storage-configuration.mdx#storagereadaheadkbalertthresholdinkb) (default 128 KB), since an unusually high read-ahead causes the I/O amplification described above.
+
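Reviewer note: the threshold check described in this change can be approximated from a shell before restarting the server. The sketch below is an illustration only and is not RavenDB's implementation. The function name `devices_above_read_ahead` and its two parameters (threshold in KB, sysfs block root) are hypothetical; it assumes the standard `/sys/block/*/queue/read_ahead_kb` layout and uses the documented default of 128 as its fallback threshold.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of RavenDB): print every block device whose
# read_ahead_kb is strictly above a threshold, one "device value" pair per line.
#   $1 = threshold in KB (assumed default: 128, the documented alert default)
#   $2 = sysfs block root (assumed default: /sys/block; overridable for testing)
devices_above_read_ahead() {
    local threshold="${1:-128}"
    local root="${2:-/sys/block}"
    local file dev value

    for file in "$root"/*/queue/read_ahead_kb; do
        # Skip the literal pattern when the glob matches nothing, and unreadable files.
        [ -r "$file" ] || continue

        value="$(cat "$file")"
        dev="${file#"$root"/}"   # drop the root prefix
        dev="${dev%%/*}"         # keep only the device name, e.g. sda

        if [ "$value" -gt "$threshold" ]; then
            printf '%s %s\n' "$dev" "$value"
        fi
    done
}
```

Calling `devices_above_read_ahead 128` on a Linux host lists the devices that would exceed a 128 KB threshold; empty output means no device is above it. Whether RavenDB inspects the same set of devices is not stated in this change, so treat the result as a rough pre-check.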