perf: use chunks="auto" in a single thread#2137

Merged
coroa merged 4 commits into master from perf/atlite-auto-chunks
Apr 8, 2026

Conversation

@coroa
Member

@coroa coroa commented Apr 7, 2026

Changes proposed in this Pull Request

Alternative to #2135 and #2136 that simply relies on xarray's automatic chunk selection, which falls back to the chunks stored in the netCDF file, and picks nprocesses=1 in the default xarray backend to achieve speed improvements similar to #2135.
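For readers unfamiliar with the mechanism, here is a minimal sketch of what automatic chunk selection looks like in plain xarray. It is not the diff of this PR: the function name `mean_over_time`, the variable name `influx`, and the tiny demo file are hypothetical, and using dask's synchronous scheduler to model single-threaded execution is an assumption made for illustration.

```python
import os
import tempfile

import dask
import numpy as np
import xarray as xr


def mean_over_time(path):
    """Open a NetCDF file lazily with automatic chunking and reduce it in one thread."""
    # chunks="auto" lets xarray/dask choose the chunk sizes, taking the
    # backend's preferred (on-disk) chunking into account where available.
    with xr.open_dataset(path, chunks="auto") as ds:
        # Assumption: the synchronous scheduler stands in for "single thread";
        # it runs every dask task in the calling thread.
        with dask.config.set(scheduler="synchronous"):
            return ds["influx"].mean("time").compute()


# Tiny synthetic stand-in for a cutout file (hypothetical variable "influx").
demo = xr.Dataset(
    {"influx": (("time", "y", "x"), np.arange(24.0).reshape(4, 2, 3))}
)
demo_path = os.path.join(tempfile.mkdtemp(), "demo_cutout.nc")
demo.to_netcdf(demo_path)

result = mean_over_time(demo_path)
print(result.values)
```

The sketch needs dask and a NetCDF backend for xarray to be installed. On a file this small, automatic chunking will most likely yield a single chunk; the point is only to show where the `chunks` argument and the scheduler choice enter.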

on master:

❯ time pixi run snakemake prepare_sector_networks -c16 -R build_renewable_profiles determine_availability_matrix --until build_renewable_profiles
[...]
________________________________________________________
Executed in   23.60 mins    fish           external
   usr time  160.59 mins    0.13 millis  160.59 mins
   sys time   38.55 mins    1.06 millis   38.55 mins

on this perf branch:

❯ time pixi run snakemake prepare_sector_networks -c16 -R build_renewable_profiles determine_availability_matrix --until build_renewable_profiles
[...]
________________________________________________________
Executed in  307.62 secs    fish           external
   usr time   20.88 mins  670.00 micros   20.88 mins
   sys time    3.67 mins  108.00 micros    3.67 mins

i.e., 24 min reduces to 5 min (of which 21 s are Snakemake's DAG generation).
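The reduction can be checked directly from the two `time` outputs above. The following sketch only redoes that arithmetic; the helper `to_seconds` and the dictionary layout are illustrative names, not part of the workflow.

```python
def to_seconds(value, unit):
    """Convert a duration reported by fish's `time` into seconds."""
    factors = {"secs": 1.0, "mins": 60.0}
    return value * factors[unit]


# Wall-clock time ("Executed in") and CPU time (usr + sys) from the runs above.
master = {
    "wall": to_seconds(23.60, "mins"),
    "cpu": to_seconds(160.59 + 38.55, "mins"),
}
branch = {
    "wall": to_seconds(307.62, "secs"),
    "cpu": to_seconds(20.88 + 3.67, "mins"),
}

wall_speedup = master["wall"] / branch["wall"]
cpu_reduction = master["cpu"] / branch["cpu"]

print(f"wall-clock speedup: {wall_speedup:.1f}x")
print(f"CPU-time reduction: {cpu_reduction:.1f}x")
```

This gives a wall-clock speedup of roughly 4.6x and a CPU-time reduction of roughly 8.1x, so the branch uses much less total CPU work in addition to finishing sooner.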

Checklist

Required:

  • Changes are tested locally and behave as expected.
  • Code and workflow changes are documented.
  • A release note entry is added to doc/release_notes.rst.

If applicable:

  • Changes in configuration options are reflected in scripts/lib/validation.
  • For new data sources or versions, these instructions have been followed.
  • New rules are documented in the appropriate doc/*.rst files.

Member

@fneum fneum left a comment

This is great!

@FabianHofmann
Contributor

Very good! We should also check this for higher resolution, and I would vote for removing the spoiling logs; they don't make much sense with parallel and fast execution.

#2135 is still a bit faster at 248 secs, but I am fine with dropping it for the sake of less code in atlite.

@coroa
Member Author

coroa commented Apr 8, 2026

> Very good! We should also check this for higher resolution, and I would vote for removing the spoiling logs; they don't make much sense with parallel and fast execution.
>
> #2135 is still a bit faster at 248 secs, but I am fine with dropping it for the sake of less code in atlite.

It's quite possible that my computer is slower. My master run also took a couple more minutes :)

@FabianHofmann
Contributor

I guess it's mostly because of the raster cache that I implemented in the atlite branch, which makes matrix calculations ~70% faster. But this is nothing urgent now.

@coroa
Member Author

coroa commented Apr 8, 2026

For more clusters:

❯ tail benchmarks/build_renewable_profile_*_solar-hsat
==> benchmarks/build_renewable_profile_50_solar-hsat <==
s	h:m:s	max_rss	max_vms	max_uss	max_pss	io_in	io_out	mean_load	cpu_time
142.73	0:02:22	4221.91	5634.10	3977.55	4023.49	8.85	9.91	119.10	181.80

==> benchmarks/build_renewable_profile_512_solar-hsat <==
s	h:m:s	max_rss	max_vms	max_uss	max_pss	io_in	io_out	mean_load	cpu_time
149.46	0:02:29	4386.59	5971.96	4120.20	4166.49	0.00	10.33	108.45	173.20

==> benchmarks/build_renewable_profile_1024_solar-hsat <==
s	h:m:s	max_rss	max_vms	max_uss	max_pss	io_in	io_out	mean_load	cpu_time
157.58	0:02:37	5809.50	7408.45	5643.00	5689.21	0.00	10.62	95.37	161.25

Memory consumption is quite manageable even for 1024 clusters (~6 GB for solar-hsat), and the higher number of clusters does not noticeably affect runtime (142.73 s at 50 clusters versus 157.58 s at 1024).
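To compare the three benchmark files side by side, a small sketch like the following can parse the tab-separated rows and summarise runtime and peak memory. The rows are copied from the output above; the helper `parse_benchmark` is a hypothetical name, and reading `max_rss` as megabytes is an assumption based on Snakemake's benchmark format.

```python
import csv
import io

HEADER = "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\tcpu_time"
# One data row per cluster count, copied from the benchmark files above.
BENCHMARKS = {
    50: "142.73\t0:02:22\t4221.91\t5634.10\t3977.55\t4023.49\t8.85\t9.91\t119.10\t181.80",
    512: "149.46\t0:02:29\t4386.59\t5971.96\t4120.20\t4166.49\t0.00\t10.33\t108.45\t173.20",
    1024: "157.58\t0:02:37\t5809.50\t7408.45\t5643.00\t5689.21\t0.00\t10.62\t95.37\t161.25",
}


def parse_benchmark(text):
    """Read the first data row of a Snakemake-style benchmark TSV."""
    row = next(csv.DictReader(io.StringIO(text), delimiter="\t"))
    # Assumption: max_rss is reported in MB.
    return {"seconds": float(row["s"]), "max_rss_mb": float(row["max_rss"])}


summary = {
    n: parse_benchmark(HEADER + "\n" + line + "\n")
    for n, line in BENCHMARKS.items()
}

for n, stats in summary.items():
    print(
        f"{n:>5} clusters: {stats['seconds']:.0f} s, "
        f"{stats['max_rss_mb'] / 1000:.1f} GB peak RSS"
    )

# Relative runtime increase when going from 50 to 1024 clusters.
slowdown = summary[1024]["seconds"] / summary[50]["seconds"] - 1
print(f"runtime increase 50 -> 1024 clusters: {slowdown:.1%}")
```

Under these assumptions, going from 50 to 1024 clusters raises the runtime by about 10% and peak resident memory from about 4.2 GB to about 5.8 GB.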

Runtime statistics for 512 clusters:

On master:

❯ time pixi run snakemake prepare_sector_networks -c16 -R build_renewable_profiles determine_availability_matrix --until build_renewable_profiles
[...]
________________________________________________________
Executed in   21.69 mins    fish           external
   usr time  148.09 mins    0.14 millis  148.09 mins
   sys time   36.41 mins    1.08 millis   36.41 mins

On this branch:

❯ time pixi run snakemake prepare_sector_networks -c16 -R build_renewable_profiles determine_availability_matrix --until build_renewable_profiles
[...]
Executed in  302.73 secs    fish           external
   usr time   20.86 mins    0.00 micros   20.86 mins
   sys time    2.88 mins  867.00 micros    2.88 mins

@FabianHofmann
Contributor

I would say, let's merge this in

@coroa coroa enabled auto-merge (squash) April 8, 2026 14:30
@coroa
Member Author

coroa commented Apr 8, 2026

Not sure what this failing pypsa-app:config.validator.yaml action is? @lkstrp

@coroa coroa merged commit c8fd89c into master Apr 8, 2026
9 of 10 checks passed
@coroa coroa deleted the perf/atlite-auto-chunks branch April 8, 2026 14:49
@lkstrp
Member

lkstrp commented Apr 8, 2026

> Not sure what this failing pypsa-app:config.validator.yaml action is? @lkstrp

Yeah, just ignore it; you don't have access yet. I think it's a stability problem on the cluster and nothing in here, but I need to check.
