fix: High memory usage from sample-level result accumulators #109

Open: akrivi wants to merge 2 commits into main from al/fix_threads_samples_memory

akrivi (Collaborator) commented May 20, 2026

Previously, each thread allocated sample-level accumulators sized for the full Monte Carlo sample count. For sample-based result specs such as ShortfallSamples, FlowSamples, StorageEnergySamples, and GeneratorAvailability, memory usage therefore scaled as O(regions × timesteps × samples × threads).

This PR changes threaded execution so that sample-based result accumulators are partitioned by each worker’s assigned sample range. Each thread now allocates storage only for the samples it simulates, so total accumulator memory scales as O(regions × timesteps × samples) and no longer grows with the thread count.
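
To make the scaling concrete, here is a rough back-of-the-envelope estimate in Python (used only for illustration; the project itself is written in Julia). The function name `accumulator_bytes` and the 8 bytes per stored value are assumptions, since the PR does not state the accumulator element type. Under that assumption, one full-size accumulator for the 10000-sample benchmark system is about 8.5 GiB, which is close to the per-thread growth in max RSS that the benchmark figures in this PR show on main.

```python
def accumulator_bytes(regions, timesteps, samples, threads,
                      partitioned, bytes_per_value=8):
    """Estimate total bytes held by sample-level accumulators across threads.

    bytes_per_value=8 is an assumption (one 64-bit number per entry);
    the PR does not state the actual element type.
    """
    full_matrix = regions * timesteps * samples * bytes_per_value
    if partitioned:
        # Each sample is stored once, by the worker that owns it.
        return full_matrix
    # Every thread holds its own full-size copy.
    return full_matrix * threads


GIB = 1024 ** 3

# Benchmark system from this PR: 13 regions, 8760 hourly steps, 1000 samples.
for threads in (1, 8, 96):
    before = accumulator_bytes(13, 8760, 1000, threads, partitioned=False)
    after = accumulator_bytes(13, 8760, 1000, threads, partitioned=True)
    print(f"{threads:>3} threads: before {before / GIB:6.2f} GiB, "
          f"after {after / GIB:6.2f} GiB")
```

The estimate covers only the sample-level accumulators, so it will not match measured RSS exactly; other per-thread state and the system data itself also occupy memory.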

Example for 3 threads:

Before, each thread allocated the whole sample matrix:

Thread 1: [total number of samples]

Thread 2: [total number of samples]

Thread 3: [total number of samples]

Now, samples are split into ranges:

Thread 1: [1/3 × (total number of samples)]

Thread 2: [1/3 × (total number of samples)]

Thread 3: [1/3 × (total number of samples)]
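
The split described above could look roughly like the following Python sketch (illustrative only; the project is in Julia and this is not the PR's code). The helper names `split_samples` and `local_column` are hypothetical, and giving any remainder samples to the first workers is an assumption, since the PR does not say how uneven splits are handled. Indices are 0-based here, unlike Julia's 1-based indexing.

```python
def split_samples(n_samples, n_threads):
    """Split sample indices 0..n_samples-1 into contiguous, near-equal ranges."""
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    base, extra = divmod(n_samples, n_threads)
    ranges, start = [], 0
    for worker in range(n_threads):
        # Assumption: the first `extra` workers take one additional sample.
        size = base + (1 if worker < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def local_column(sample, owned):
    """Map a global sample index to a column in the owning worker's buffer."""
    if sample not in owned:
        raise IndexError("sample is not owned by this worker")
    return sample - owned.start


# Each worker only needs len(owned) columns instead of all 1000.
for worker, owned in enumerate(split_samples(1000, 3), start=1):
    print(f"Thread {worker}: samples {owned.start}..{owned.stop - 1} "
          f"({len(owned)} columns)")
```

Because the ranges are contiguous and disjoint, the per-worker buffers can be concatenated in worker order to recover the full sample matrix once all threads finish.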

Benchmarks

System: Guam 2028, 13 regions, 8760 timesteps (hourly resolution)
Simulation: Run on HPC, using standard nodes (104 cores, 250 GB)
Result: ShortfallSamples()

1000 MC Samples

| Threads | Elapsed time, main (s) | Elapsed time, PR (s) | Max RSS, main (GB) | Max RSS, PR (GB) |
|---|---|---|---|---|
| 1 | 23.61 | 23.75 | 1.55 | 2.41 |
| 2 | 13.04 | 12.96 | 2.37 | 2.38 |
| 4 | 8.28 | 8.16 | 4.09 | 2.38 |
| 8 | 6.68 | 5.24 | 7.47 | 2.41 |
| 16 | 6.84 | 3.87 | 14.31 | 2.43 |
| 24 | 7.39 | 3.48 | 21.14 | 2.46 |
| 32 | 9.59 | 3.28 | 27.99 | 2.49 |
| 48 | 11.16 | 3.06 | 41.69 | 2.62 |
| 64 | 14.53 | 2.97 | 55.28 | 2.61 |
| 80 | 15.04 | 2.96 | 69.01 | 2.75 |
| 96 | 21.48 | 3.06 | 82.61 | 2.79 |

10000 MC Samples

| Threads | Elapsed time, main (s) | Elapsed time, PR (s) | Max RSS, main (GB) | Max RSS, PR (GB) |
|---|---|---|---|---|
| 1 | 210.73 | 208.58 | 9.19 | 17.68 |
| 2 | 109.29 | 108.45 | 17.67 | 17.65 |
| 4 | 61.39 | 56.88 | 34.64 | 17.64 |
| 8 | 46.96 | 33.21 | 68.58 | 17.66 |
| 16 | 43.08 | 18.88 | 136.53 | 17.69 |
| 24 | 51.95 | 13.68 | 204.37 | 17.77 |
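
For readers who want the relative change rather than raw numbers, the short Python sketch below derives speedup and memory-reduction ratios from the benchmark rows reported in this PR. The function name `summarize` is hypothetical and the data is copied verbatim from the tables. At 96 threads with 1000 samples, for instance, the PR is about 7 times faster and uses about 30 times less peak memory than main; a ratio below 1 (as for memory at 1 thread) means the PR measured higher than main.

```python
# Rows copied from the benchmark tables in this PR:
# (threads, elapsed main [s], elapsed PR [s], max RSS main [GB], max RSS PR [GB])
ROWS_1000 = [
    (1, 23.61, 23.75, 1.55, 2.41), (2, 13.04, 12.96, 2.37, 2.38),
    (4, 8.28, 8.16, 4.09, 2.38), (8, 6.68, 5.24, 7.47, 2.41),
    (16, 6.84, 3.87, 14.31, 2.43), (24, 7.39, 3.48, 21.14, 2.46),
    (32, 9.59, 3.28, 27.99, 2.49), (48, 11.16, 3.06, 41.69, 2.62),
    (64, 14.53, 2.97, 55.28, 2.61), (80, 15.04, 2.96, 69.01, 2.75),
    (96, 21.48, 3.06, 82.61, 2.79),
]
ROWS_10000 = [
    (1, 210.73, 208.58, 9.19, 17.68), (2, 109.29, 108.45, 17.67, 17.65),
    (4, 61.39, 56.88, 34.64, 17.64), (8, 46.96, 33.21, 68.58, 17.66),
    (16, 43.08, 18.88, 136.53, 17.69), (24, 51.95, 13.68, 204.37, 17.77),
]


def summarize(rows):
    """Return (threads, time ratio main/PR, max-RSS ratio main/PR) per row."""
    return [(threads, t_main / t_pr, rss_main / rss_pr)
            for threads, t_main, t_pr, rss_main, rss_pr in rows]


for label, rows in (("1000 samples", ROWS_1000), ("10000 samples", ROWS_10000)):
    print(label)
    for threads, speedup, mem_ratio in summarize(rows):
        print(f"  {threads:>3} threads: {speedup:5.2f}x time, "
              f"{mem_ratio:6.2f}x max RSS")
```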

codecov-commenter commented May 20, 2026

Codecov Report

❌ Patch coverage is 98.52941% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.22%. Comparing base (11d3003) to head (f6313c9).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| PRASCore.jl/src/Simulations/Simulations.jl | 97.14% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #109      +/-   ##
==========================================
- Coverage   83.13%   82.22%   -0.92%     
==========================================
  Files          45       45              
  Lines        2325     2368      +43     
==========================================
+ Hits         1933     1947      +14     
- Misses        392      421      +29     

