
Simple cli wrapper #18

Open
hdpriest-ui wants to merge 6 commits into ccmmf:main from hdpriest-ui:simple-cli-wrapper

Conversation

@hdpriest-ui
Contributor

relating to https://github.com/orgs/ccmmf/discussions/182

  1. magic-ensemble is instantiated with verbs for the various steps. Internally, each verb binds directly to methods in the 2a_grass directory; these bindings would be updated when we re-org the repository.
  2. User-facing configuration, with the parameters we expect users to change, lives in example_user_config.yaml. It allows users to specify external data resources, control use of apptainer, etc.
  3. Non-user-facing configuration for the workflow is contained in workflow_manifest.yaml, which specifies the pipeline steps and their i/o. Users shouldn't mess with this, but maintainers would.
  4. Added a tool for patching the XML file with different dispatches as needed by PEcAn.
  5. I had to make minor changes to some of the actual 2a_grass scripts - @infotroph, definitely check here to make sure I didn't break anything. These will have to stay in sync with the content of the data artifacts we are handing over, which is a potential headache; stabilizing both sides of this at our collective earliest convenience is a good idea.
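For point 3, a hedged sketch of what a workflow_manifest.yaml step entry might look like, pairing a verb with its script, R libraries to check, and input/output path keys (all key names and paths here are illustrative, not the actual manifest schema):

```yaml
# Illustrative sketch only; key names and script paths are assumptions.
steps:
  prepare_met:
    script: 2a_grass/01_ERA5_nc_to_clim.R
    r_libs: [PEcAn.data.atmosphere]    # checked before the step runs
    inputs: [era5_dir]
    outputs: [clim_dir]
  fetch_data:
    script: 2a_grass/00_fetch_s3_and_prepare_run_dir.sh
    r_libs: []                         # empty for shell scripts
    inputs: [s3_bucket]
    outputs: [run_dir]
dispatch_modes: [local-gnu-parallel, slurm-dispatch]
```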

This would be the example that would be adapted to fit downscaling, and other workflows as we go.

Note: currently the ccmmf compute nodes are not responsive, so this can only be seen in local execution mode. (The login node is fine; Rob is working on the compute nodes.)

…isplay run directory during dry-run and modify 00_fetch_s3_and_prepare_run_dir.sh to resolve and use absolute run directory for artifact downloads and extractions.
staging to workflow CLI

- magic-ensemble: --config is now required; supports use_apptainer (run
  prepare steps inside a container) and pecan_dispatch (select how
  ensemble members are submitted/executed)
- workflow_manifest.yaml: defines available dispatch modes
  (local-gnu-parallel, slurm-dispatch) with appropriate host XML for
  native and apptainer execution; S3 resources consolidated
- Prep scripts: accept CLI flags instead of env vars; stage
  user-provided external files (e.g. template.xml) into the run
  directory before prepare steps run
- tools/patch_xml.py: utility to patch elements in PEcAn XML config
  files in-place
- 01_ERA5_nc_to_clim.R: ERA5 met inputs now looked up by grid cell
  center rather than site id
- example_user_config.yaml: documents new user-facing options
  (use_apptainer, pecan_dispatch, external_paths)
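As a hedged illustration, the user-facing config might look like the following; run_dir, use_apptainer, pecan_dispatch, and external_paths are named in this PR, while the values and the template_file key are assumptions:

```yaml
# Illustrative sketch of example_user_config.yaml; values are made up.
run_dir: /scratch/me/grass_run
use_apptainer: true              # run prepare steps inside a container
pecan_dispatch: slurm-dispatch   # or local-gnu-parallel
external_paths:
  template_file: /home/me/template.xml   # staged into the run directory
```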

Relates to: https://github.com/orgs/ccmmf/discussions/182

@dlebauer dlebauer left a comment


This looks like a good step toward standardizing the workflow interfaces. The pattern of separating user-facing config from internal workflow details seems like the right direction.

A few things would help clarify the intent and direction. Not to bog down this PR - it works well as an MVP proof of concept - but anything not addressed here should be captured in one or more follow-up issues.

  1. In the past we've discussed separating data prep from the rest of the analysis workflows. Is that still a viable option, or is there a rationale for keeping data prep combined with ensembles?
  2. It's unclear why this lives under 2a_grass - what is the path from here to using this in the workflows that are our core deliverables?
  3. A README that explains the approach would make it easier to understand and adapt. It should cover the overall design: the roles of the CLI, the config files, and the execution graph; the boundaries between the config files, the manifest, and template.xml; and which general patterns and specific components will be reused when adapting this to other workflows. This can wait until the next iteration, but I wanted to make sure it is on the map.
  4. What is the plan for testing individual components and overall integration?

If this works now and is ready to implement, I'm good with the general pattern. My main question is how robust and extensible this will be. After implementing both the targets and custom workflows, do you have any insights on what to look for, and how would we know if this reaches a level of complexity where we should consider a more standard workflow solution?

@@ -0,0 +1 @@


What are the empty yml files for?

params_from_pft: "SLA,leafC"
additional_params: "varname=wood_carbon_fraction,distn=norm,parama=0.48,paramb=0.005"

# Steps per command: script path, R libs to check (empty for shell scripts), input/output path keys

IMHO putting 'steps' at the top of the file would make it easier to grok

# runs on the host so it can submit further jobs to Slurm.
run_run_ensembles() {
get_steps_array
check_aws

Is check_aws required for ensembles? Shouldn't the data already be staged at this point?


# --- Load effective config: manifest + optional user overrides ---
# User config may contain: run_dir, start_date, end_date, run_LAI_date, n_ens, n_met, ic_ensemble_size, n_workers, pecan_dispatch
get_val() {

If there is a --config argument, does this require that the config contain all values? i.e. will this fail if the config contains required values but relies on default values from L146-154?


from chatgpt:

get_val() currently treats every missing key in a provided config as fatal, so the defaults declared below are never actually used once --config is present. Because all config-backed values are resolved before subcommand dispatch, even get-demo-data now requires unrelated keys like n_workers, use_apptainer, and pecan_dispatch. This makes the config contract backward-incompatible as new fields are added.
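One way to restore the defaults would be to make the lookup fall back instead of failing on a missing key. A minimal sketch, assuming a flat "key: value" config layout; the helper name matches the script's get_val, but the parsing here is an illustration, not the real implementation:

```shell
#!/bin/sh
# Sketch: get_val KEY DEFAULT [FILE] -- return the config value when the
# key is present, otherwise the default, instead of treating a missing
# key as fatal.
get_val() {
  key="$1"; default="$2"; file="${3:-user_config.yaml}"
  # Naive flat "key: value" lookup; a real YAML parser would be safer.
  val=$(sed -n "s/^${key}:[[:space:]]*//p" "$file" 2>/dev/null | head -n 1)
  if [ -n "$val" ]; then
    printf '%s\n' "$val"
  else
    printf '%s\n' "$default"
  fi
}

# Demo: a config that sets n_ens but omits n_workers.
cfg=$(mktemp)
printf 'n_ens: 25\n' > "$cfg"
get_val n_ens 10 "$cfg"       # value taken from the config
get_val n_workers 4 "$cfg"    # missing key falls back to the default
```

With this shape, get-demo-data could run against a config that only sets the keys it actually needs.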

fi

# Destination: copy into the run directory using the source basename.
dest="${RUN_DIR_ABS}/$(basename "$src")"

from chatgpt:

external_paths is parsed as a mapping, but the staging step ignores the destination key and always copies each file to run_dir/$(basename "$src"). That breaks the manifest contract: for example, external_paths.template_file: /tmp/my-template.xml gets staged as run_dir/my-template.xml, while patch_dispatch() later looks for run_dir/template.xml. The destination should be derived from the manifest path for the corresponding key, not from the source basename.
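A minimal sketch of deriving the destination from the manifest key rather than the source basename; the key-to-path table below stands in for the real manifest lookup, and the function names are assumptions:

```shell
#!/bin/sh
# Sketch: stage an external file to the path the manifest declares for
# its key, so external_paths.template_file lands at run_dir/template.xml
# where patch_dispatch() expects it -- regardless of the source filename.
manifest_path_for() {
  # Stand-in for the real manifest lookup.
  case "$1" in
    template_file) printf 'template.xml\n' ;;
    *) return 1 ;;   # unmapped key: fail loudly rather than guess
  esac
}

stage_external() {
  key="$1"; src="$2"
  rel=$(manifest_path_for "$key") || {
    echo "no manifest path for key: $key" >&2; return 1
  }
  dest="${RUN_DIR_ABS}/${rel}"
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
  printf '%s\n' "$dest"
}

# Demo: a user-named source file still stages to the manifest path.
RUN_DIR_ABS=$(mktemp -d)
src="$(mktemp -d)/my-template.xml"
printf '<pecan/>\n' > "$src"
stage_external template_file "$src"
```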
