
Simple cli wrapper #18

Open
hdpriest-ui wants to merge 6 commits into ccmmf:main from hdpriest-ui:simple-cli-wrapper

Conversation

@hdpriest-ui
Contributor

relating to https://github.com/orgs/ccmmf/discussions/182

  1. magic-ensemble is instantiated with verbs for the various steps. Internally, each verb binds directly to methods in the 2a_grass directory; these bindings would be updated when we re-org the repository.
  2. User-facing configuration, with the parameters we expect users to change, lives in example_user_config.yaml. It allows users to specify external data resources, control use of apptainer, etc.
  3. Non-user-facing configuration for the workflow is contained in workflow_manifest.yaml, which specifies the pipeline steps and their i/o. Users shouldn't mess with this, but maintainers would.
  4. Added a tool for patching the XML file with different dispatches as needed by PEcAn.
  5. I had to make minor changes to some of the actual 2a_grass scripts - @infotroph, definitely check here to make sure I didn't break anything. These will have to stay in sync with the content of the data artifacts we are handing over, which is a potential headache; stabilizing both sides of this at our collective earliest convenience is a good idea.
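For point 3, a hedged sketch of what a workflow_manifest.yaml step entry might look like, pairing a verb with its script, R libraries to check, and input/output path keys (all key names and paths here are illustrative, not the actual manifest schema):

```yaml
# Illustrative sketch only; key names and script paths are assumptions.
steps:
  prepare_met:
    script: 2a_grass/01_ERA5_nc_to_clim.R
    r_libs: [PEcAn.data.atmosphere]    # checked before the step runs
    inputs: [era5_dir]
    outputs: [clim_dir]
  fetch_data:
    script: 2a_grass/00_fetch_s3_and_prepare_run_dir.sh
    r_libs: []                         # empty for shell scripts
    inputs: [s3_bucket]
    outputs: [run_dir]
dispatch_modes: [local-gnu-parallel, slurm-dispatch]
```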

This would be the example that would be adapted to fit downscaling, and other workflows as we go.

Note: currently the ccmmf compute nodes are not responsive, so this can only be seen in local execution mode. (The login node is fine; Rob is working on the compute nodes.)

…isplay run directory during dry-run and modify 00_fetch_s3_and_prepare_run_dir.sh to resolve and use absolute run directory for artifact downloads and extractions.
staging to workflow CLI

- magic-ensemble: --config is now required; supports use_apptainer (run
  prepare steps inside a container) and pecan_dispatch (select how
  ensemble members are submitted/executed)
- workflow_manifest.yaml: defines available dispatch modes
  (local-gnu-parallel, slurm-dispatch) with appropriate host XML for
  native and apptainer execution; S3 resources consolidated
- Prep scripts: accept CLI flags instead of env vars; stage
  user-provided external files (e.g. template.xml) into the run
  directory before prepare steps run
- tools/patch_xml.py: utility to patch elements in PEcAn XML config
  files in-place
- 01_ERA5_nc_to_clim.R: ERA5 met inputs now looked up by grid cell
  center rather than site id
- example_user_config.yaml: documents new user-facing options
  (use_apptainer, pecan_dispatch, external_paths)
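As a hedged illustration, the user-facing config might look like the following; run_dir, use_apptainer, pecan_dispatch, and external_paths are named in this PR, while the values and the template_file key are assumptions:

```yaml
# Illustrative sketch of example_user_config.yaml; values are made up.
run_dir: /scratch/me/grass_run
use_apptainer: true              # run prepare steps inside a container
pecan_dispatch: slurm-dispatch   # or local-gnu-parallel
external_paths:
  template_file: /home/me/template.xml   # staged into the run directory
```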

Relates to: https://github.com/orgs/ccmmf/discussions/182

@dlebauer dlebauer left a comment


This looks like a good step toward standardizing the workflow interfaces. The pattern of separating user-facing config from internal workflow details seems like the right direction.

A few things would help clarify the intent and direction. Not to bog down this PR - it works well as an MVP proof of concept - but anything not addressed here should be captured in one or more follow-up issues.

  1. In the past we've discussed separating data prep from the rest of the analysis workflows. Is that still a viable option, or is there a rationale for keeping data prep combined with ensembles?
  2. It's unclear why this lives under 2a_grass - what is the path from here to using this in the workflows that are our core deliverables?
  3. A README that explains the approach would make it easier to understand and adapt. It should cover the overall design: the roles of the CLI, the config files, and the execution graph; the boundaries between the config files, the manifest, and template.xml; and which general patterns and specific components will be reused when adapting this to other workflows. This can wait until the next iteration, but I wanted to make sure it is on the map.
  4. What is the plan for testing individual components and overall integration?

If this works now and is ready to implement, I'm good with the general pattern. My main question is how robust and extensible this will be. After implementing both the targets and custom workflows, do you have any insights on what to look for, and how would we know if this reaches a level of complexity where we should consider a more standard workflow solution?

@@ -0,0 +1 @@


What are the empty yml files for?

params_from_pft: "SLA,leafC"
additional_params: "varname=wood_carbon_fraction,distn=norm,parama=0.48,paramb=0.005"

# Steps per command: script path, R libs to check (empty for shell scripts), input/output path keys

IMHO putting 'steps' at the top of the file would make it easier to grok

# runs on the host so it can submit further jobs to Slurm.
run_run_ensembles() {
get_steps_array
check_aws

Is check_aws required for ensembles? Shouldn't the data already be staged at this point?


# --- Load effective config: manifest + optional user overrides ---
# User config may contain: run_dir, start_date, end_date, run_LAI_date, n_ens, n_met, ic_ensemble_size, n_workers, pecan_dispatch
get_val() {

If there is a --config argument, does this require that the config contain all values? i.e. will this fail if the config contains required values but relies on default values from L146-154?


from chatgpt:

get_val() currently treats every missing key in a provided config as fatal, so the defaults declared below are never actually used once --config is present. Because all config-backed values are resolved before subcommand dispatch, even get-demo-data now requires unrelated keys like n_workers, use_apptainer, and pecan_dispatch. This makes the config contract backward-incompatible as new fields are added.
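One way to restore the defaults would be to make the lookup fall back instead of failing on a missing key. A minimal sketch, assuming a flat "key: value" config layout; the helper name matches the script's get_val, but the parsing here is an illustration, not the real implementation:

```shell
#!/bin/sh
# Sketch: get_val KEY DEFAULT [FILE] -- return the config value when the
# key is present, otherwise the default, instead of treating a missing
# key as fatal.
get_val() {
  key="$1"; default="$2"; file="${3:-user_config.yaml}"
  # Naive flat "key: value" lookup; a real YAML parser would be safer.
  val=$(sed -n "s/^${key}:[[:space:]]*//p" "$file" 2>/dev/null | head -n 1)
  if [ -n "$val" ]; then
    printf '%s\n' "$val"
  else
    printf '%s\n' "$default"
  fi
}

# Demo: a config that sets n_ens but omits n_workers.
cfg=$(mktemp)
printf 'n_ens: 25\n' > "$cfg"
get_val n_ens 10 "$cfg"       # value taken from the config
get_val n_workers 4 "$cfg"    # missing key falls back to the default
```

With this shape, get-demo-data could run against a config that only sets the keys it actually needs.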

fi

# Destination: copy into the run directory using the source basename.
dest="${RUN_DIR_ABS}/$(basename "$src")"

from chatgpt:

external_paths is parsed as a mapping, but the staging step ignores the destination key and always copies each file to run_dir/$(basename "$src"). That breaks the manifest contract: for example, external_paths.template_file: /tmp/my-template.xml gets staged as run_dir/my-template.xml, while patch_dispatch() later looks for run_dir/template.xml. The destination should be derived from the manifest path for the corresponding key, not from the source basename.
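A minimal sketch of deriving the destination from the manifest key rather than the source basename; the key-to-path table below stands in for the real manifest lookup, and the function names are assumptions:

```shell
#!/bin/sh
# Sketch: stage an external file to the path the manifest declares for
# its key, so external_paths.template_file lands at run_dir/template.xml
# where patch_dispatch() expects it -- regardless of the source filename.
manifest_path_for() {
  # Stand-in for the real manifest lookup.
  case "$1" in
    template_file) printf 'template.xml\n' ;;
    *) return 1 ;;   # unmapped key: fail loudly rather than guess
  esac
}

stage_external() {
  key="$1"; src="$2"
  rel=$(manifest_path_for "$key") || {
    echo "no manifest path for key: $key" >&2; return 1
  }
  dest="${RUN_DIR_ABS}/${rel}"
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
  printf '%s\n' "$dest"
}

# Demo: a user-named source file still stages to the manifest path.
RUN_DIR_ABS=$(mktemp -d)
src="$(mktemp -d)/my-template.xml"
printf '<pecan/>\n' > "$src"
stage_external template_file "$src"
```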
