Skip to content

Parallel verbosity#1813

Merged
nfahlgren merged 31 commits intov5.0from
parallel-verbosity
Jan 30, 2026
Merged

Parallel verbosity#1813
nfahlgren merged 31 commits intov5.0from
parallel-verbosity

Conversation

@joshqsumner
Copy link
Copy Markdown
Contributor

Describe your changes
Added a verbosity flag to WorkflowConfig and jupyterconfig and changed both classes to have a general setter function that prints messages about what each field is going to do when it is updated if verbose=True. Also uses a print wrapper in several places to respect the verbose flag.

Type of update
This is a feature enhancement.

Associated issues
None

Additional context
Meant to work in conjunction with a new tutorial based on a notebook from @annacasto . That is not added to the website yet but does have a v5 branch.

For the reviewer
See this page for instructions on how to review the pull request.

  • PR functionality reviewed in a Jupyter Notebook
  • All tests pass
  • Test coverage remains 100%
  • Documentation tested
  • New documentation pages added to plantcv/mkdocs.yml
  • Changes to function input/output signatures added to updating.md
  • Code reviewed
  • PR approved

@deepsource-io
Copy link
Copy Markdown

deepsource-io Bot commented Oct 9, 2025

Here's the code health analysis summary for commits 31f65fe..daa32f5. View details on DeepSource ↗.

Analysis Summary

AnalyzerStatusSummaryLink
DeepSource Python LogoPython✅ SuccessView Check ↗
DeepSource Test coverage LogoTest coverage✅ SuccessView Check ↗

Code Coverage Report

MetricAggregatePython
Branch Coverage100%100%
Composite Coverage100%100%
Line Coverage100%100%
New Branch Coverage100%100%
New Composite Coverage100%100%
New Line Coverage100%, ✅ Above Threshold100%, ✅ Above Threshold

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

@joshqsumner joshqsumner added enhancement Enhancements to existing features work in progress Mark work in progress merge in order Set of PRs that need to be merged sequentially labels Oct 9, 2025
@joshqsumner joshqsumner added ready to review and removed work in progress Mark work in progress labels Oct 14, 2025
@joshqsumner joshqsumner added this to the PlantCV v5.0 milestone Oct 24, 2025
@joshqsumner
Copy link
Copy Markdown
Contributor Author

@maliagehan had some good points about it being likely that some of the new parallelization features increase the likelihood that people just blast their computer asking for more cores than they have.

Initially I thought that I could just add to the existing messages that the config prints in verbose mode, but accessing the dictionary items in config.cluster_config circumvents config.__setattr__ so it is not obvious to me how to validate those values reactively without making another class for the config.cluster_config dictionary. That isn't the worst thing to do probably, but it felt like a bit of overkill right now.

Instead I added a check to config.validate_config() that looks at n_workers and cores comparing them to os.cpu_count() if config.cluster=="LocalCluster". If someone is using a non-Local Cluster then we could get cpu counts with dask after the client is made, but that is pretty involved and would require a lot of shuffling pieces around.

If we wanted to have another class I think it could be a simple dictionary wrapper with a setter something like this but it might require playing with how dictionary keys are set vs how attributes are set, I'm not sure about that all.

def cluster_config_dict():
    """Whatever else this does"""
    def __init__(self):
        dict(...)
    def __setattr__(self, name, value):
        _cluster_config_attr_lookup(self, name, value)
        object.__setattr__(self, name, value)

def _validate_set_cluster_field(config, val, sentence, expect_type, field):
    """Validate cluster configuration settings before setting them in a WorkflowConfig or jupyterconfig object

    Parameters
    ----------
    config   = WorkflowConfig or JupyterConfig object
    val         = type flexible, value to assign to configuration attribute.
    sentence    = str, unformatted string to mention what the change will do.
    expect_type = type, expected type for val.
    attr        = str, attribute name

    Returns
    -------
    out         = minimal message about what the change does.

    Raises
    ------
    ValueError if incompatible type between input val and expected type for attribute.

    """
    if not isinstance(val, expect_type):
        raise ValueError("Expected " + expect_type.__name__ + ", got " + type(val).__name__)
    elif isinstance(val, dict):
        form = ", ".join(val.keys())
    else:
        form = val
    # parse form into sentence string
    out = sentence.format(form)
    if attr == "n_workers" and config.cluster=="LocalCluster":
        available = os.cpu_count()
        out = out + ", this machine has {} cores".format(available)
    if attr == "n_cores":
        total_cores = val * config.cluster_config["n_workers"]
        out = out + " combined with n_workers this will require {} total cores".format(total_cores)
    return out


def _cluster_config_attr_lookup(config, field, val):
    """Lookup fields of cluster configuration in a WorkflowConfig or jupyterconfig object

    Parameters
    ----------
    config   = WorkflowConfig or JupyterConfig object
    field    = str, name of an cluster_config field to set.
    val      = type flexible, value to assign to cluster_config key

    Returns
    -------
    message  = minimal message about what the change does.

    Raises
    ------
    ValueError if incompatible type between input val and expected type for attribute.
    """
    cluster_config_control = {
            "n_workers": ["{} Jobs will run in parallel, requires at least {} cores", int],
            "cores": ["Each parallel job will use {} cores, this should normally be left as 1", int],
            "memory": ["Each job will have {} memory", str],
            "disk": ["Each job will have {} disk memory", str],
            "log_directory": ["Job logs will be written to {}", str],
            "local_directory": ["{} will be treated as dask's working directory", str],
            "job_extra_directives": ["Specifying extra job directives", dict]
        }
    message = _validate_set_cluster_field(
        config=config,
        val=val,
        sentence=cluster_config_control[field][0],
        expect_type=cluster_config_control[field][1],
        field=field
    )
    return message

@nfahlgren nfahlgren merged commit 286864c into v5.0 Jan 30, 2026
5 checks passed
@nfahlgren nfahlgren deleted the parallel-verbosity branch January 30, 2026 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Enhancements to existing features merge in order Set of PRs that need to be merged sequentially ready to review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants