@Ekhorn Ekhorn commented Dec 24, 2025

Description

Changes the datapoint exports to write to stdout rather than the filesystem, and uses piped streams so that data can be streamed to the client without holding all datapoints in memory.
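To illustrate the piped-stream idea, here is a minimal sketch using only `java.io`. It is not the code from this PR: the class `PipedExportSketch`, its methods, and the generated `timestamp,value` rows are hypothetical stand-ins for the real export source. A background thread writes rows into a `PipedOutputStream` while the caller reads from the connected `PipedInputStream`, so only the pipe buffer is held in memory at any time.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

final class PipedExportSketch {

    /** Hypothetical row source: writes a header plus `rows` CSV lines, one at a time. */
    static void writeCsv(OutputStream out, long rows) throws IOException {
        // Closing the writer also closes `out`, which signals end-of-stream to the reader.
        try (BufferedWriter w = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            w.write("timestamp,value\n");
            for (long i = 0; i < rows; i++) {
                w.write(i + "," + (i * 0.5) + "\n");
            }
        }
    }

    /** Returns a stream that can be read while a background thread produces the CSV. */
    static InputStream openExport(long rows) throws IOException {
        // The pipe buffer (64 KiB here, an arbitrary choice) bounds how far the
        // producer can run ahead of the consumer.
        PipedInputStream in = new PipedInputStream(64 * 1024);
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            try {
                writeCsv(out, rows);
            } catch (IOException e) {
                // Typically means the reader closed its end early; nothing more to send.
            }
        }, "export-producer");
        producer.setDaemon(true);
        producer.start();
        return in;
    }

    /** Reads the stream in fixed-size chunks and returns the number of bytes seen. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        try (in) {
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }
}
```

In a real endpoint the `drain` loop would be replaced by copying each chunk to the HTTP response. The producer blocks whenever the pipe buffer is full, which is what keeps memory use flat regardless of export size.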

I ran some tests to check that the memory footprint stays reasonable. With 10 million datapoints exported several times in parallel (4 active streams), memory usage spikes from ~150 MB to ~350 MB. Each downloaded zip file is ~131 MB and each datapoints.csv file is ~620 MB.
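For readers unfamiliar with producing a zip on the fly, the following sketch shows one way to compress a CSV entry as it is written, using `java.util.zip.ZipOutputStream`. How this PR actually builds the archive is not shown in the conversation; the class `ZipStreamSketch`, its methods, and the generated rows are assumptions for illustration. Only the entry name `datapoints.csv` comes from the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipStreamSketch {

    /** Writes a zip with a single datapoints.csv entry, compressing row by row. */
    static void writeZippedCsv(OutputStream target, long rows) throws IOException {
        // Closing the ZipOutputStream finishes the archive and also closes `target`.
        try (ZipOutputStream zip = new ZipOutputStream(target, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry("datapoints.csv"));
            zip.write("timestamp,value\n".getBytes(StandardCharsets.UTF_8));
            for (long i = 0; i < rows; i++) {
                // Hypothetical row content; a real export would stream query results here.
                zip.write((i + "," + (i * 0.5) + "\n").getBytes(StandardCharsets.UTF_8));
            }
            zip.closeEntry();
        }
    }

    /** Test helper: returns "<entry name>:<entry content>" for the first entry. */
    static String readFirstEntry(byte[] zipBytes) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(
                new ByteArrayInputStream(zipBytes), StandardCharsets.UTF_8)) {
            ZipEntry entry = zin.getNextEntry();
            // readAllBytes() stops at the end of the current entry.
            String content = new String(zin.readAllBytes(), StandardCharsets.UTF_8);
            return entry.getName() + ":" + content;
        }
    }
}
```

Because the rows are compressed as they pass through, neither the full CSV nor the full archive needs to exist in memory or on disk when `target` is a pipe or a response stream.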

Screencast.From.2025-12-24.14-32-01.webm

Checklist

  • 1. Acceptance criteria of the linked issue(s) are met
  • 2. Tests are written and all tests pass
  • 3. Changes are manually tested by you and the reviewer

@Ekhorn Ekhorn force-pushed the enhancement/use-stdout-for-datapoint-exports branch from 4e2addf to 8d288e1 Compare December 24, 2025 14:15
@Ekhorn Ekhorn force-pushed the enhancement/use-stdout-for-datapoint-exports branch from 8d288e1 to 57f349c Compare December 24, 2025 14:46
Ekhorn commented Dec 24, 2025

@Hackerberg43

The backend tests pass locally.

image

And so does your frontend test 😉

image

I do need approval to run the pipeline 😄

Comment on lines -60 to -61
public static final String OR_DATA_POINTS_EXPORT_LIMIT = "OR_DATA_POINTS_EXPORT_LIMIT";
public static final int OR_DATA_POINTS_EXPORT_LIMIT_DEFAULT = 1000000;
@Ekhorn Ekhorn Dec 24, 2025

@richturner Is it okay to get rid of this? The default CSV export with 10 million datapoints does not seem to have a big impact. The other CSV format queries that @Hackerberg43 added take longer (and the request is timed out by axios on the frontend).

Owner

I remember the above was added quite recently, because otherwise you would just get a generic error from the timeout, which does not tell you what the problem is.
The other formats do some database manipulation first, so yes, they will take longer.
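
As a rough sketch of what such a limit buys (a specific error instead of a generic timeout), the check could look like the code below. The two constants are the ones quoted from the diff above. The class `ExportLimitSketch`, the methods `resolveLimit` and `checkLimit`, the exception type, and the message wording are hypothetical, not the project's actual implementation.

```java
import java.util.Map;

final class ExportLimitSketch {

    // Names and default value as quoted in the review comment.
    static final String OR_DATA_POINTS_EXPORT_LIMIT = "OR_DATA_POINTS_EXPORT_LIMIT";
    static final int OR_DATA_POINTS_EXPORT_LIMIT_DEFAULT = 1000000;

    /** Reads the limit from the given environment map, falling back to the default. */
    static int resolveLimit(Map<String, String> env) {
        String raw = env.get(OR_DATA_POINTS_EXPORT_LIMIT);
        if (raw == null || raw.isBlank()) {
            return OR_DATA_POINTS_EXPORT_LIMIT_DEFAULT;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // An unparsable value falls back to the default in this sketch.
            return OR_DATA_POINTS_EXPORT_LIMIT_DEFAULT;
        }
    }

    /** Fails fast with a descriptive message when the request exceeds the limit. */
    static void checkLimit(long requestedDatapoints, int limit) {
        if (requestedDatapoints > limit) {
            throw new IllegalStateException(
                "Export of " + requestedDatapoints + " datapoints exceeds the limit of "
                    + limit + " (" + OR_DATA_POINTS_EXPORT_LIMIT + ")");
        }
    }
}
```

A caller would typically pass `System.getenv()` to `resolveLimit` and run `checkLimit` before starting the query, so the client receives the descriptive message rather than waiting for the request to time out.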

@Ekhorn Ekhorn marked this pull request as ready for review December 24, 2025 14:52
@Hackerberg43 Hackerberg43 merged commit d184d26 into Hackerberg43:enhancement/dataExportPage Dec 24, 2025
1 check passed
@Ekhorn Ekhorn deleted the enhancement/use-stdout-for-datapoint-exports branch January 9, 2026 09:32