13 changes: 12 additions & 1 deletion _data/speakers.yml
@@ -138,4 +138,15 @@
talk_num: 9
photo: souza.jpg
bio: "Renan Souza earned his Ph.D., M.Sc., and B.Sc. in Computer Science (2009-2019) from the Federal University of Rio de Janeiro (UFRJ). Since 2022, he has been a researcher and software engineer at Oak Ridge National Laboratory, after spending seven years at IBM. He was a visiting scientist at INRIA, France, during his Ph.D. and, during his B.Sc., studied abroad at Missouri State University and interned at SLAC National Laboratory. Active in engineering, research, and technical leadership since 2010, he has authored 50+ peer-reviewed papers in leading venues and holds 10+ USPTO patents. His current focus is on designing and building scalable systems to support responsible and trustworthy AI workflows."


- name: Jan Janssen
role: Group Leader for Materials Informatics
institution:
- name: Max Planck Institute for Sustainable Materials
link: https://www.mpie.de/en
image: max_plank.png
country: de
link: https://www.mpie.de/4910750/Janssen
talk_num: 10
photo: janssen.jpg
bio: "Jan Janssen is the group leader for Materials Informatics at the Max Planck Institute for Sustainable Materials. His group focuses on applying methods from computer science including machine learning to discover novel sustainable materials with applications ranging from machine-learned interatomic potentials to large language model agents for atomistic simulation. Previously, Jan was a director’s postdoctoral fellow in the T-division at Los Alamos National Laboratory as part of the Exascale Computing Project as well as an invited postdoctoral fellow at the University of Chicago and the University of California Los Angeles. Besides his research work, Jan is the lead developer of the pyiron atomistic simulation suite, maintains over 1000 open-source materials informatics software packages for the conda-forge community and is a regular contributor to open-source software on Github."
54 changes: 54 additions & 0 deletions _talks/2026_01_14.html
@@ -0,0 +1,54 @@
---
layout: talk
title: "Up-scaling Python functions for HPC with executorlib"
authors: Jan Janssen (Max Planck Institute for Sustainable Materials)
event_date: January 14, 2026
times: 11:00-11:30 PST / 14:00-14:30 EST / 20:00-20:30 CET
talk_number: 10
given: false
<!-- image: /images/talks/janssen-banner.png -->
<!-- presentation: -->
<!-- video: -->
---

Up-scaling a Python workflow from execution on a local workstation to parallel
execution on an HPC system typically faces three challenges: (1) managing
inter-process communication, (2) data storage, and (3) managing task
dependencies during execution. These challenges commonly force a rewrite of
major parts of the reference serial Python workflow to improve computational
efficiency. Executorlib addresses these challenges by extending Python's
ProcessPoolExecutor interface to distribute Python functions on HPC systems.
It interfaces with the job scheduler directly, without the need for a database
or daemon process, enabling seamless up-scaling.

<br /><br />

The presentation introduces the challenge of up-scaling Python workflows. It
highlights how executorlib extends the ProcessPoolExecutor interface of the
Python standard library to give the user a familiar interface, while the
executorlib backend connects directly to the HPC job scheduler. Python
functions can be distributed either from the login node to individual compute
nodes or within an HPC allocation spanning multiple compute nodes, enabled by
support for both file-based and socket-based communication.
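
<br /><br />

To make the familiar interface concrete, the sketch below shows the
standard-library concurrent.futures pattern that executorlib mirrors; exact
executorlib class names and constructor arguments vary between versions, so
the executorlib-specific lines are left as comments rather than executable
calls.

<pre><code>
# A minimal sketch of the submit()/result() pattern from the Python standard
# library; it runs as-is with concurrent.futures, and executorlib mirrors it.
from concurrent.futures import ProcessPoolExecutor


def add(x, y):
    return x + y


if __name__ == "__main__":
    # Local execution with the standard-library executor.
    with ProcessPoolExecutor(max_workers=2) as exe:
        futures = [exe.submit(add, i, i) for i in range(4)]
        print([f.result() for f in futures])  # prints [0, 2, 4, 6]

    # With executorlib, an executor class provided by the library is
    # substituted here, keeping the same submit()/result() calls while the
    # backend talks to the HPC job scheduler (no database or daemon process).
</code></pre>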

<br /><br />

The setup of executorlib on different HPC systems is then introduced, covering
the current support for the SLURM job scheduler as well as the Flux framework,
which enables hierarchical scheduling within large HPC job allocations as
commonly used on exascale computers. Application examples then demonstrate how
executorlib assigns computational resources such as CPU cores, number of
threads, and GPUs on a per-function basis, including support for MPI, which
drastically simplifies the process of up-scaling Python workflows.
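
<br /><br />

A hedged sketch of what per-function resource assignment can look like is
given below; the executorlib import, the SingleNodeExecutor class name, the
resource_dict keyword, and its keys are assumptions based on how the library
is described here and may differ from the installed executorlib version.

<pre><code>
# Hedged sketch only: per-function resource assignment as described in the
# abstract. The import path, the resource_dict keyword, and its keys are
# assumptions and may not match the installed executorlib version exactly.
from executorlib import SingleNodeExecutor  # assumed class name


def mpi_rank(value):
    # Assumes an MPI-enabled environment with mpi4py installed.
    from mpi4py import MPI
    return MPI.COMM_WORLD.Get_rank(), value


if __name__ == "__main__":
    with SingleNodeExecutor() as exe:
        # Request two MPI ranks for this one function call (assumed keys).
        future = exe.submit(mpi_rank, 42, resource_dict={"cores": 2})
        print(future.result())
</code></pre>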

<br /><br />

In this context, the focus of the presentation is the user journey during the
up-scaling of a Python workflow and how features like caching and the
integrated debugging capabilities for the distributed execution of Python
functions accelerate the development cycle. The presentation concludes by
returning to challenges identified as part of the DOE Exascale Computing
Project's EXAALT effort, demonstrating how the development process was
drastically simplified by using executorlib, with a specific focus on dynamic
dependencies that are only resolved at run time of the Python workflow.
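
<br /><br />

As a hedged illustration of dynamic dependencies, the sketch below chains two
function calls by passing the result of one submit() call into the next; with
the standard library this requires an explicit result() call, while the
abstract describes executorlib as resolving such Future-based dependencies
only at run time of the workflow.

<pre><code>
# Hedged sketch of dynamic dependencies: the Future returned by one submit()
# call feeds the next call, and the dependency is resolved at run time. This
# version uses only the standard library, so the dependency is resolved
# manually with result(); see the comment at the end for the executorlib case.
from concurrent.futures import ProcessPoolExecutor


def generate(n):
    return list(range(n))


def reduce_sum(values):
    return sum(values)


if __name__ == "__main__":
    with ProcessPoolExecutor() as exe:
        future_values = exe.submit(generate, 5)
        total = exe.submit(reduce_sum, future_values.result())
        print(total.result())  # prints 10

    # As described in the abstract, executorlib can accept future_values
    # itself as an argument to submit() and wait for it before scheduling
    # reduce_sum, resolving the dependency during the run of the workflow.
</code></pre>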
Binary file added images/institutions/max_plank.png
Binary file added images/talks/janssen.jpg