Excessive Memory Use by Session Cleanup Leading to OOM

**Describe the bug**
After upgrading to Etherpad 2.7.3, we noticed that our Etherpad instance would consistently hit `FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory` and crash after about 15 minutes. After digging through some heap snapshots, I'm reasonably confident that the source of the problem is the session cleanup, which begins on startup and was introduced in https://github.com/ether/etherpad/commit/da9f5ac4eed750af2448c42d7973b9b0b243fd54. I have since set `sessionCleanup` to `false`; memory usage is now consistently low, and the process has stayed up for over 15 minutes since this config change.

**To Reproduce**
Steps to reproduce the behavior:
1. Run etherpad for many years (I think our DB is over a decade old) to accumulate many `sessionstorage:.*` database records
2. Upgrade to 2.7.3 and set sessionCleanup to true (the default)
3. Watch memory usage climb until node and etherpad crash due to reaching the heap limit

I realize that "run Etherpad for a decade" isn't a great reproducer, but I'm not super familiar with the database layout or with how to create session records artificially. In theory it should be possible to seed them by hand, and that should reproduce the problem.
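
For anyone who wants to try a synthetic reproducer, here is a minimal sketch of seeding many expired session records. It assumes, without my having verified it against the Etherpad source, that each record is stored under a `sessionstorage:<id>` key and holds an express-session style object with a `cookie.expires` timestamp. `seedExpiredSessions` and `MapStore` are hypothetical names; a real run would replace `MapStore` with something that writes to the actual database.

```javascript
// Hypothetical in-memory stand-in for the database; only set() is needed here.
class MapStore {
  constructor() {
    this.records = new Map();
  }

  async set(key, value) {
    this.records.set(key, value);
  }
}

// Write `count` already-expired session records to `store`.
// ASSUMPTION: the record shape ({cookie: {expires, path}}) mirrors what
// express-session serializes; the real Etherpad layout may differ.
async function seedExpiredSessions(store, count, expiredAt = new Date(0)) {
  for (let i = 0; i < count; i++) {
    const sid = `seed-${i.toString(36)}`;
    await store.set(`sessionstorage:${sid}`, {
      cookie: {expires: expiredAt.toISOString(), path: '/'},
    });
  }
  return count;
}
```

Seeding a few million of these and then starting Etherpad with `sessionCleanup` enabled should, if the assumptions hold, show the same memory growth.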

We run Etherpad without authentication (not sure whether this affects session handling in the DB). I suspect that we're affected by https://github.com/ether/etherpad/issues/5010, which would explain the very large number of session records. The new cleanup routine then attempts to load all of these records into memory at once, and we run out of memory.

**Expected behavior**
Session cleanup should use a reasonable amount of memory, possibly by paging through the session records rather than loading them all at once. The new cleanup routine does `const keys = await DB.findKeys('sessionstorage:*', null);`. Maybe we could limit the number of records returned at one time and work through a reasonably sized batch before proceeding to the next one?
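
To make the batching idea concrete, here is a rough sketch using keyset pagination (each page starts after the last key of the previous page, so deleting records does not shift later pages). It is not the existing Etherpad code: `findKeysPage` is a hypothetical method that, as far as I know, the database layer does not currently offer, `FakeSessionDb` is an in-memory stand-in, and the rule that only sessions with a past `cookie.expires` are removed is my assumption about what cleanup should do.

```javascript
// In-memory stand-in for the database so the sketch is self-contained.
class FakeSessionDb {
  constructor() {
    this.records = new Map();
  }

  async set(key, value) { this.records.set(key, value); }
  async get(key) { return this.records.get(key); }
  async remove(key) { this.records.delete(key); }

  // HYPOTHETICAL API: return up to `limit` keys that start with `prefix` and
  // sort after `afterKey`. A SQL backend could implement this with
  // `WHERE key LIKE ? AND key > ? ORDER BY key LIMIT ?` instead of scanning
  // everything the way this stand-in does.
  async findKeysPage(prefix, afterKey, limit) {
    return [...this.records.keys()]
        .filter((k) => k.startsWith(prefix) && (afterKey === null || k > afterKey))
        .sort()
        .slice(0, limit);
  }
}

// Remove expired sessions one batch at a time; at most `batchSize` keys are
// held in memory at once. Returns the number of records removed.
async function cleanupExpiredSessions(db, {batchSize = 500, now = Date.now()} = {}) {
  let afterKey = null;
  let removed = 0;
  for (;;) {
    const keys = await db.findKeysPage('sessionstorage:', afterKey, batchSize);
    if (keys.length === 0) break;
    for (const key of keys) {
      const session = await db.get(key);
      const expires = session && session.cookie && session.cookie.expires;
      // ASSUMPTION: only sessions with an expiry in the past are removed.
      if (expires && new Date(expires).getTime() <= now) {
        await db.remove(key);
        removed++;
      }
    }
    // Continue after the last key seen, whether or not it was deleted.
    afterKey = keys[keys.length - 1];
  }
  return removed;
}
```

With something along these lines, peak memory would depend on `batchSize` rather than on the total number of `sessionstorage:*` records.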

**Server (please complete the following information):**
 - Etherpad version: 2.7.3
 - OS: Debian Trixie
 - Node.js version (`node --version`): v24.15.0 (taken from our metrics: `nodejs_version_info{version="v24.15.0",major="24",minor="15",patch="0"}`)
 - pnpm version (`pnpm --version`): 11.0.6 (note the bug template asks for npm version but we're using pnpm to match the upstream docker container)
 - Is the server free of plugins: No, we have `ep_headings2` installed
 - Are you using any abstraction IE docker? Yes, we build our own images using node:24-trixie-slim as a base but try to follow the approach used by the upstream Dockerfile.

**Additional context**
Our database is a mariadb 10.11 database. Not sure if that makes a difference in how the internal memory for query results is structured.

Also, the way I tracked this down was to run Etherpad with `NODE_OPTIONS: "--heapsnapshot-signal=SIGUSR2"`. I grabbed one heap snapshot shortly after startup and another about 2 minutes later. Viewing these snapshots in Chrome's developer tools, I could see that this chain of objects was retaining significantly more memory than at initial startup: `_pool in mysql_db_default -> _allConnections in Pool -> _list in denque -> [$INDEX] in Array -> _command in PoolConnection -> _rows in Query -> [$INDEX] in Array -> [$INDEX] in Array -> key in {key}`. Those key values appear to be `sessionstorage:.*` records. Each of them is listed as 0.1kB in size, but in aggregate there are enough of them to exhaust the heap.
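
As a lighter-weight complement to heap snapshots, heap growth can also be logged from inside the process with `process.memoryUsage()`. The sketch below is not something we actually ran; `heapSample` and `startHeapLogging` are hypothetical helper names, and where you would hook this in (a local patch, a small plugin) is left open.

```javascript
// Take one reading of the current process memory, in MiB with one decimal.
function heapSample() {
  const {heapUsed, heapTotal, rss} = process.memoryUsage();
  const toMiB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    at: new Date().toISOString(),
    heapUsedMiB: toMiB(heapUsed),
    heapTotalMiB: toMiB(heapTotal),
    rssMiB: toMiB(rss),
  };
}

// Log a sample every `intervalMs` milliseconds. Returns a function that
// stops the logging.
function startHeapLogging(intervalMs = 30000, log = console.log) {
  const timer = setInterval(() => log(JSON.stringify(heapSample())), intervalMs);
  // Do not let this timer keep the process alive on its own.
  timer.unref();
  return () => clearInterval(timer);
}
```

A steadily rising `heapUsedMiB` in the minutes after startup, with `sessionCleanup` enabled and not with it disabled, would point at the same cause without needing to load snapshots into developer tools.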

We are currently running with sessionCleanup set to false. I figure that is no worse than we were before the upgrade, but it doesn't OOM.

Excessive Memory Use by Session Cleanup Leading to OOM #7830
