fix: page sessionstorage cleanup to avoid OOM (#7830) #7831
SessionStore._cleanup() previously called `findKeys('sessionstorage:*',
null)`, materialising every session key into a single array. On decade-
old MariaDB installs with millions of sessions this OOMs the node
process within ~15 minutes — see #7830.
Switch to ueberdb2 6.1.0's findKeysPaged with a 500-key page size, and
yield to the event loop between pages so the DB driver can release each
page's buffered rows and request handlers can interleave.
The break is now driven by `page.length === 0` rather than `page.length
< CLEANUP_PAGE_SIZE` so a stubbed/throttled paged source still iterates
the full keyspace.
Adds a regression test that seeds 50 sessionstorage rows, monkey-patches
`DB.findKeysPaged` to use a 4-key page, runs cleanup, and asserts every
expired row is removed plus every valid row preserved across page
boundaries.
Closes #7830
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
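The loop described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `PagedKeySource` interface, its cursor-style `findKeysPaged(prefix, after, limit)` signature, the `expires` field, and the `cleanupExpired` and `yieldToEventLoop` names are all assumptions made for the example, and the real ueberdb2 method may take different arguments.

```typescript
// Hypothetical key source for illustration; this is NOT ueberdb2's actual
// findKeysPaged signature.
interface PagedKeySource {
  // Up to `limit` keys starting with `prefix`, ascending, strictly after `after`.
  findKeysPaged(prefix: string, after: string | null, limit: number): Promise<string[]>;
  get(key: string): Promise<{expires?: number} | undefined>;
  remove(key: string): Promise<void>;
}

const CLEANUP_PAGE_SIZE = 500;

// Let pending I/O callbacks and request handlers run before the next page.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => { setTimeout(resolve, 0); });

async function cleanupExpired(
    db: PagedKeySource, now: number, pageSize: number = CLEANUP_PAGE_SIZE): Promise<number> {
  let after: string | null = null;
  let removed = 0;
  for (;;) {
    const page: string[] = await db.findKeysPaged('sessionstorage:', after, pageSize);
    // Stop only on an empty page. A short page may just mean the source
    // returned fewer keys than requested (stubbed or throttled).
    if (page.length === 0) break;
    for (const key of page) {
      const session = await db.get(key);
      if (session != null && session.expires != null && session.expires <= now) {
        await db.remove(key);
        removed++;
      }
    }
    after = page[page.length - 1];
    await yieldToEventLoop();
  }
  return removed;
}
```

With a `page.length < pageSize` exit condition, a source that caps every page at 4 keys would end the scan after the first page; the empty-page check keeps going until the keyspace is exhausted.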
Review Summary by Qodo
Fix session cleanup OOM via paged database iteration
Walkthroughs
Description
• Switch SessionStore cleanup from unbounded findKeys() to paged findKeysPaged() with 500-key limit
• Yield to event loop between pages for DB driver buffer release and request interleaving
• Bump ueberdb2 dependency to ^6.1.0 for new paging method support
• Add regression test verifying cleanup iterates full keyspace across page boundaries
Diagram
flowchart LR
A["SessionStore._cleanup()"] -->|"old: findKeys loads all"| B["OOM on large keyspaces"]
A -->|"new: findKeysPaged 500-key pages"| C["Bounded memory usage"]
C -->|"yield between pages"| D["DB buffers released"]
D -->|"request handlers interleave"| E["Stable operation"]
File Changes
1. src/node/db/DB.ts
Four follow-ups raised by Qodo on the session cleanup paging fix:
- DB.ts: fail-fast at init() if any required wrapper method (incl. findKeysPaged) is missing, so a stale ueberdb2 pin surfaces at boot rather than crashing the first cleanup run an hour later.
- SessionStore: bound a single _cleanup() run to 10 minutes. Under sustained session creation the keyspace can grow faster than cleanup drains it; without a budget the next scheduled run would never fire. When the budget hits, log a warning and let the next run continue.
- SessionStore: log the defensive `page[0] <= after` cursor-stall break. Previously the loop exited silently, leaving expired rows behind with no operator-visible signal of the backend regression.
- Tests: the paged-cleanup regression test now removes both expiredSids AND validSids in finally, so a failed assertion doesn't leak rows.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
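A rough sketch of how the first three follow-ups might fit together is shown below. Every name here (`assertWrapperMethods`, `scanWithBudget`, `ScanOptions`, the `FetchPage` callback and its cursor-style arguments) is hypothetical and chosen for the example; none of it is taken from the Etherpad or ueberdb2 sources. The clock and logger are injectable only so the behaviour can be exercised without waiting ten minutes.

```typescript
// All names below are hypothetical; none are taken from the Etherpad source.
type FetchPage = (after: string | null, limit: number) => Promise<string[]>;
type ScanOutcome = 'done' | 'budget-exceeded' | 'cursor-stalled';

interface ScanOptions {
  pageSize: number;
  budgetMs: number;                 // e.g. 10 * 60 * 1000 for a 10-minute budget
  now?: () => number;               // injectable clock, defaults to Date.now()
  warn?: (message: string) => void; // injectable logger, defaults to console.warn
}

// Fail at startup, not at the first cleanup run, if the wrapper lacks a method.
function assertWrapperMethods(db: Record<string, unknown>, required: readonly string[]): void {
  const missing = required.filter((name) => typeof db[name] !== 'function');
  if (missing.length > 0) {
    throw new Error(`database wrapper is missing required methods: ${missing.join(', ')}`);
  }
}

async function scanWithBudget(
    fetchPage: FetchPage,
    handleKey: (key: string) => Promise<void>,
    opts: ScanOptions): Promise<ScanOutcome> {
  const now = opts.now ?? (() => Date.now());
  const warn = opts.warn ?? ((message: string) => console.warn(message));
  const deadline = now() + opts.budgetMs;
  let after: string | null = null;
  for (;;) {
    // Check the budget before each page so one run cannot block the next.
    if (now() >= deadline) {
      warn('cleanup budget exhausted; leaving the rest for the next scheduled run');
      return 'budget-exceeded';
    }
    const page: string[] = await fetchPage(after, opts.pageSize);
    if (page.length === 0) return 'done';
    // A sorted, cursor-based source must return keys strictly after the cursor.
    if (after !== null && page[0] <= after) {
      warn(`paged source did not advance past ${after}; stopping early`);
      return 'cursor-stalled';
    }
    for (const key of page) await handleKey(key);
    after = page[page.length - 1];
    await new Promise<void>((resolve) => { setTimeout(resolve, 0); });
  }
}
```

Returning a distinct outcome for each exit path is a design choice of this sketch: it makes the stall and budget cases observable to the caller as well as to the log.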
Addressed Qodo's four findings in 63c2493:
29 SessionStore backend tests still green; ts-check clean. (Wider CI still blocked on the ueberdb2 6.1.0 npm publish.)
CHANGELOG.md picks up an entry under 3.1.0 Notable fixes describing the OOM cause, the paged iteration, the 10-minute per-run budget, the cursor-stall logging, and the fail-fast init guard. settings.json.template's sessionCleanup comment adds the page-size, budget, and pointer to #7830 so admins can reason about the new behaviour from the template alone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Permanent fix path is now ether/ueberDB#983: migrates the ueberDB publish workflow from a stored
Once #983 is merged and the npm UI step is done, the next push to ueberDB main publishes
Now that ether/ueberDB#983 unblocked the publish workflow (OIDC trusted publishing), ueberdb2 6.1.2 is live on npm and the `^6.1.0` pin in src/package.json resolves cleanly. Resolves the ERR_PNPM_OUTDATED_LOCKFILE that was blocking CI on this PR. 29 SessionStore backend tests still green against the published tarball. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Switch `SessionStore._cleanup()` from `findKeys('sessionstorage:*', null)` (loads everything) to `findKeysPaged` with a 500-key page size
- Bump `ueberdb2` to `^6.1.0` for the new method (see "feat: add paged findKeys (memory-bounded iteration)", ueberDB#981)
Why
Per the reporter on #7830, after 2.7.3 their decade-old MariaDB install OOMed within ~15 minutes. Heap snapshots pinned retention on `_pool → _allConnections → … → _command._rows → keys`: the unbounded `findKeys` result tied up `mysql2` PoolConnection row buffers (and dominated the JS heap with millions of session keys). Paging at the DB layer turns that into bounded N-row reads.
Cleanup termination is now driven by an empty page rather than `page.length < CLEANUP_PAGE_SIZE`, so the loop still iterates a stubbed/throttled paged source correctly (caught by the new test on the first run).
Blocked on
Test plan
- `pnpm --filter ep_etherpad-lite run ts-check` clean
- `mocha tests/backend/specs/SessionStore.ts`: 29 passing, including the new "pages across a large sessionstorage keyspace" case ("removed 25 expired/stale sessions out of 50" confirms cross-page iteration)
Notes for the maintainer
- `findKeysPaged` falls back to JS-side slicing for backends that don't natively support it (sqlite/rusty/redis/mongo/...). Correctness preserved; OOM-resistance for those backends is a follow-up.
- The same `findKeys` pattern exists in `PadManager.ts:77` and `AuthorManager.ts:{361,369,484,495}`. Out of scope for this fix but worth tracking; happy to file as a follow-up issue.
Closes #7830
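To illustrate why a JS-side slicing fallback keeps results correct without bounding memory, here is a minimal sketch. It is an assumption about the general technique, not the actual ueberdb2 fallback: `pagedFromUnpaged`, `FindAllKeys`, and the cursor-style `(pattern, after, limit)` arguments are hypothetical names for this example.

```typescript
// Hypothetical adapter; not the actual ueberdb2 fallback implementation.
type FindAllKeys = (pattern: string) => Promise<string[]>;
type FindKeysPage = (pattern: string, after: string | null, limit: number) => Promise<string[]>;

function pagedFromUnpaged(findAll: FindAllKeys): FindKeysPage {
  return async (pattern, after, limit) => {
    // The backend still returns every matching key on each call; only the
    // caller's view is paged, so peak memory is not reduced.
    const all = (await findAll(pattern)).slice().sort();
    if (after === null) return all.slice(0, limit);
    const cursor: string = after;
    const start = all.findIndex((key) => key > cursor);
    return start < 0 ? [] : all.slice(start, start + limit);
  };
}
```

A caller that pages through this adapter sees the same keys in the same order as with a native paged query, which is the "correctness preserved" half of the note; the full key list is still materialised per call, which is why OOM resistance needs backend support.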
🤖 Generated with Claude Code