Archive node: unbounded anon memory growth (~3-4 GB/h) under sustained RPC load → hard cgroup throttle / hang #2724


## Summary

A `--state-pruning archive` node serving sustained JSON-RPC traffic shows **unbounded growth of anonymous process memory at a steady ~3–4 GB/hour**. The growth is real process memory (`memory.stat` `anon`), not page cache, and it is **not bounded by `--db-cache` or `--trie-cache-size`**. Left running, the node climbs to the cgroup's `memory.high` threshold. With swap disabled the kernel cannot reclaim that memory, so it throttles the process hard (hundreds of millions of `high` events in `memory.events`), at which point RPC stops responding and the node effectively hangs until restarted.

Reproduced on **two versions** (a Nov-2025 build and the current **v3.4.2-415**), so this does not look like a recently introduced regression; it appears inherent to the archive node under RPC load.

## Environment

- **subtensor**: `v3.4.2-415` (also reproduced on a Nov-2025 `v3.2.9`-era build). Note: `system_version` RPC returns `4.0.0-dev-unknown` on both, so it's not a useful version discriminator.
- **OS**: Ubuntu 24.04.4 LTS
- **Host**: 16 vCPU, 62 GiB RAM, NVMe (RocksDB backend, `db/full`)
- **Chain**: finney mainnet, archive node (~3.7 TB DB)
- **Run via**: systemd unit, with a cgroup `MemoryHigh=48G` / `MemoryMax=52G` and `MemorySwapMax=0`.

### Launch flags
```
node-subtensor \
  --chain <finney-raw-spec> --base-path <data> \
  --state-pruning archive --blocks-pruning archive \
  --rpc-external --rpc-cors all --rpc-methods unsafe \
  --rpc-port 9944 --port 30333 \
  --rpc-max-connections 1000 --no-mdns \
  --rpc-max-response-size 256 --rpc-max-request-size 256 \
  --in-peers 75 --out-peers 25 \
  --prometheus-external --prometheus-port 9615 \
  --db-cache 8192 --trie-cache-size 4294967296 \
  --runtime-cache-size 4 --max-runtime-instances 8 \
  --wasm-execution compiled --wasmtime-instantiation-strategy pooling-copy-on-write
```

The node serves a steady stream of **archive state RPCs** from an external client (historical `state_getStorage`, `state_call`, `state_getReadProof`, `state_queryStorageAt`, etc. against old block hashes).

## Observed behaviour

Anonymous memory grows roughly linearly under load and never plateaus:

```
# /sys/fs/cgroup/.../subtensor.service/memory.stat
anon ≈ 24–39 GB        ← real, non-reclaimable
file ≈ 0.1–4.5 GB      ← page cache (small)
```
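For anyone trying to confirm the same split on their own node, here is a minimal sketch of how the `anon` / `file` figures can be read out of `memory.stat`. It assumes cgroup v2, where the file is plain `key value` lines with values in bytes. The function names are illustrative, and the cgroup directory is passed in because it is host-specific.

```python
from pathlib import Path

GIB = 1024 ** 3


def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse cgroup v2 memory.stat text ("key value" per line) into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a simple "name <integer>" pair.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def anon_file_gib(text: str) -> tuple[float, float]:
    """Return (anon, file) in GiB; a missing key counts as 0."""
    stats = parse_memory_stat(text)
    return stats.get("anon", 0) / GIB, stats.get("file", 0) / GIB


def read_anon_file_gib(cgroup_dir: str) -> tuple[float, float]:
    """Read memory.stat from the given cgroup directory (caller supplies the path)."""
    return anon_file_gib((Path(cgroup_dir) / "memory.stat").read_text())
```

Sampling `read_anon_file_gib(...)` every few minutes and logging both numbers is enough to tell anonymous growth apart from page-cache growth.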

Growth curve after a fresh restart (netdata, RSS):
```
t+0min   ~10 GB   (post-restart)
t+1h     ~18 GB
t+2h     ~23.5 GB   steady creep ≈ +3–4 GB/h, NOT decelerating to zero
...      climbs linearly
t+~11h   ~48 GB     hits MemoryHigh
```
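As a sanity check on the headline rate, the readings above can be turned into per-interval and end-to-end slopes. This is only arithmetic on the four data points listed (the `t+~11h` reading is approximate), not a fit to the full netdata series. The helper names are illustrative.

```python
# (hours since restart, RSS in GB), copied from the growth curve above.
SAMPLES = [(0.0, 10.0), (1.0, 18.0), (2.0, 23.5), (11.0, 48.0)]


def interval_rates(samples):
    """Growth in GB/h between each pair of consecutive samples."""
    return [
        (m1 - m0) / (t1 - t0)
        for (t0, m0), (t1, m1) in zip(samples, samples[1:])
    ]


def overall_rate(samples):
    """End-to-end growth in GB/h from the first to the last sample."""
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    return (m1 - m0) / (t1 - t0)


def hours_to_ceiling(current_gb, ceiling_gb, rate_gb_per_h):
    """Hours until the ceiling at a constant rate; None if the rate is not positive."""
    if rate_gb_per_h <= 0:
        return None
    return max(0.0, (ceiling_gb - current_gb) / rate_gb_per_h)


print([round(r, 2) for r in interval_rates(SAMPLES)])  # [8.0, 5.5, 2.72]
print(round(overall_rate(SAMPLES), 2))                 # 3.45
```

The first two intervals are faster than the long tail, which may partly reflect cache warm-up after the restart (an assumption, not something measured here). The end-to-end average of ~3.45 GB/h matches the ~3–4 GB/h figure, and the tail rate stays well above zero.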

At the ceiling, with `MemorySwapMax=0`, the kernel cannot reclaim the (anonymous) memory, so it throttles via `memory.high`:
```
# memory.events at the ceiling
high  254531610     ← ~254M throttle events
max   0
oom_kill 0
```
The node spins in reclaim, RPC latency explodes, and the `system_health` RPC eventually times out; the node is effectively hung until restarted. `max` and `oom_kill` stay at 0 because throttling holds usage at `memory.high`, so it never reaches `MemoryMax` and the OOM killer is never invoked.
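To tell the three outcomes apart (throttled at `memory.high`, hitting `memory.max`, or OOM-killed), two `memory.events` snapshots can be diffed. The sketch below assumes the cgroup v2 `memory.events` format (`key value` per line, cumulative counters). The function names and the wording of the verdicts are illustrative.

```python
def parse_memory_events(text: str) -> dict[str, int]:
    """Parse cgroup v2 memory.events text into a dict of cumulative counters."""
    events: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def classify_pressure(before: dict[str, int], after: dict[str, int]) -> str:
    """Report the most severe memory event that occurred between two snapshots."""
    delta = {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("high", "max", "oom_kill")
    }
    if delta["oom_kill"] > 0:
        return "oom-killed"
    if delta["max"] > 0:
        return "hit memory.max"
    if delta["high"] > 0:
        return "throttled by memory.high"
    return "no pressure events"


# Counters as reported above at the ceiling, compared against a fresh cgroup.
at_ceiling = parse_memory_events("high 254531610\nmax 0\noom_kill 0\n")
print(classify_pressure({}, at_ceiling))  # throttled by memory.high
```

Because the counters are cumulative, the rate at which `high` increases between snapshots is a more useful alert signal than its absolute value.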

Key points:
- The growing memory is **`anon`, not page cache**. It is genuinely held by the process, and with swap disabled the kernel has nowhere to reclaim it to.
- It **vastly exceeds the configured caches** (8 GiB db-cache + 4 GiB trie-cache = 12 GiB, but anon reaches 24–39 GB and keeps climbing).
- **Reducing `--db-cache` 8192 → 4096 did NOT reduce the footprint** and did not stop the climb, which indicates the growth is not the configured block cache.
- Growth correlates with **RPC query load**; an idle/lite node does not exhibit it at the same rate.
- A restart drops it back to ~10 GB and the cycle repeats.

## Steps to reproduce

1. Run an archive node (`--state-pruning archive`) on finney with RocksDB.
2. Subject it to sustained archive-state JSON-RPC queries against historical block hashes (e.g. `state_getReadProof` / `state_call` / `state_getStorage` at old blocks), as a high-traffic archive RPC provider would.
3. Watch `anon` in the service's cgroup `memory.stat` (or RSS) over several hours.
4. Observe a steady ~3–4 GB/h climb with no plateau, until the host/cgroup limit is reached.
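The load in step 2 can be approximated with a small client like the sketch below. It is not the actual traffic generator used here. It assumes the node accepts JSON-RPC over HTTP on the RPC port, and the storage key, block range and step are placeholders the caller must supply. `chain_getBlockHash` and `state_getStorage` are the standard Substrate RPC methods.

```python
import json
import urllib.request


def rpc_payload(method: str, params: list, req_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}


def rpc_call(url: str, method: str, params: list, timeout: float = 30.0):
    """POST one JSON-RPC request and return its result, raising on an RPC error."""
    body = json.dumps(rpc_payload(method, params)).encode()
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def historical_queries(url: str, storage_key: str, first_block: int,
                       last_block: int, step: int):
    """Yield (block number, storage value) for one key across old blocks."""
    for number in range(first_block, last_block, step):
        block_hash = rpc_call(url, "chain_getBlockHash", [number])
        yield number, rpc_call(url, "state_getStorage", [storage_key, block_hash])
```

For example, iterating `historical_queries("http://127.0.0.1:9944", key, 1, 1_000_000, 1000)` in a loop from a few worker processes gives sustained reads against historical state. The host, key and range in that call are hypothetical.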

## Expected behaviour

Steady-state memory should plateau (bounded by the configured caches + a stable working set) rather than growing unbounded under continuous RPC load.

## Impact

On a RAM-constrained host this forces a **periodic restart treadmill** (every ~6–8 h) to avoid the node hanging itself at the memory ceiling. For archive RPC providers this means recurring downtime and degraded tail latency as the node approaches the limit.

## Current workaround

Scheduled restart every ~8 h, ahead of the ~11 h it takes a freshly restarted node to reach the `MemoryHigh` ceiling. This is a band-aid, not a fix.
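A possible refinement of the same band-aid is to restart on a memory threshold instead of a fixed timer, so the restart tracks actual load. The sketch below is an untested idea, not what is deployed here. The `memory.stat` path, the unit name, and the 90% headroom are assumptions supplied by the caller.

```python
import subprocess

GIB = 1024 ** 3


def should_restart(anon_bytes: int, memory_high_bytes: int,
                   headroom: float = 0.9) -> bool:
    """True once anon usage reaches the given fraction of the memory.high limit."""
    return anon_bytes >= headroom * memory_high_bytes


def read_anon_bytes(memory_stat_path: str) -> int:
    """Return the `anon` counter (bytes) from a cgroup v2 memory.stat file."""
    with open(memory_stat_path) as handle:
        for line in handle:
            key, _, value = line.partition(" ")
            if key == "anon":
                return int(value)
    raise KeyError("anon not found in memory.stat")


def restart_if_needed(memory_stat_path: str, unit: str,
                      memory_high_bytes: int) -> bool:
    """Restart the systemd unit if anon usage is close to the limit."""
    if should_restart(read_anon_bytes(memory_stat_path), memory_high_bytes):
        subprocess.run(["systemctl", "restart", unit], check=True)
        return True
    return False
```

Run from a timer every few minutes with `memory_high_bytes=48 * GIB`, this would restart the node at roughly 43 GiB of `anon`, before throttling begins.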

## Questions for maintainers

- Is unbounded `anon` growth under archive RPC load a known issue?
- Is it related to the trie/state cache not honouring `--trie-cache-size` under archive queries, the wasmtime pooling allocator, RPC subscription/connection buffers, or something else?
- Is there a flag to bound the per-process memory under archive RPC load that we've missed?

Happy to provide netdata exports, `memory.stat` snapshots over time, heaptrack/massif profiles, or an RPC query sample if useful.

