All notable changes to Merlin will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added support for Python 3.11
- Update docker docs for new rabbitmq and redis server versions
- Added lgtm.com Badge for README.md
- More fixes for lgtm checks.
- Added merlin server command as a container option for broker and results_backend servers.
- Added new documentation for merlin server in docs and tutorial
- Added the flux_exec batch argument to allow for flux exec arguments, e.g. flux_exec: flux exec -r "0-1" to run celery workers only on ranks 0 and 1 of a multi-rank allocation
- Additional argument in test definitions to allow for a post "cleanup" command
- Capability for non-user block in yaml
- .readthedocs.yaml and requirements.txt files for docs
- Small modifications to the Tutorial, Getting Started, Command Line, and Contributing pages in the docs
- Compatibility with the newest version of Maestro (v. 1.1.9dev1)
- JSON schema validation for Merlin spec files
- New tests related to JSON schema validation
- Instructions in the "Contributing" page of the docs on how to add new blocks/fields to the spec file
- Brief explanation of the $(LAUNCHER) variable in the "Variables" page of the docs
- Now loads np.arrays of dtype='object'
- Removed support for Python 3.6
- Rename lgtm.yml to .lgtm.yml
- New shortcuts in specification file (sample_vector, sample_names, spec_original_template, spec_executed_run, spec_archived_copy)
- Update requirements to require redis 4.3.4 for acl user channel support
- Added ssl to the broker and results backend server checks when "merlin info" is called
- Removed theme_override.css from docs/_static/ since it is no longer needed with the updated version of sphinx
- Updated docs/Makefile to include a pip install for requirements and a clean command
- Update to the Tutorial and Contributing pages in the docs
- Changed what is stored in a Merlin DAG
- We no longer store the entire Maestro ExecutionGraph object
- We now only store the adjacency table and values obtained from the ExecutionGraph object
- Modified spec verification
- Update to require maestrowf 1.9.1dev1 or later
- Fixed return values from scripts with main() to fix testing errors.
- CI test for CHANGELOG modifications
- Typo "cert_req" to "cert_reqs" in the merlin config docs
- Removed emoji from issue templates that were breaking doc builds
- Including .temp template files in MANIFEST
- Styling in the footer for docs
- Horizontal scroll overlap in the variables page of the docs
- Reordered small part of Workflow Specification page in the docs in order to put "samples" back in the merlin block
- Code updates to satisfy lgtm CI security checker
- A bug in the ssl config was not returning the proper values
- Auto-release of pypi packages
- Workflows Community Initiative metadata file
- Old references to stale branches
- Test for `merlin example list`
- Python 3.10 to testing
- The Optimization workflow example now has a ready to use workflow (`optimization_basic.yaml`). This solves the issue faced before with `merlin example list`.
- Redis dependency handled implicitly by celery for cross-compatibility
- Re-enabled distributed integration testing. Added additional examination to distributed testing.
- 'shell' added to unsupported and new_unsupported lists in script_adapter.py, prevents `'shell' is not supported -- ommitted` message.
- Makefile target for install-merlin fixed so venv is properly activated to install merlin
- Updated the optimization workflow example with a new python template editor script
- CI now splits linting and testing into different tasks for better utilization of parallel runners, significant and scalable speed gain over previous setup
- CI now uses caching to restore environment of dependencies, reducing CI runtime significantly again beyond the previous improvement. Examines for potential updates to dependencies so the environment doesn't become stale.
- CI now examines that the CHANGELOG is updated on PRs.
- Added PyLint pipeline to Github Actions CI (currently no-fail-exit).
- Corrected integration test for dependency to only examine release dependencies.
- PyLint adherence for: celery.py, opennplib.py, config/init.py, broker.py, configfile.py, formatter.py, main.py, router.py
- Integrated Black and isort into CI
- merlin purge queue name conflict & shell quote escape
- task priority support for amqp, amqps, rediss, redis+socket brokers
- Flake8 compliance
- `retry_delay` field in a step to specify a countdown in seconds prior to running a restart or retry.
- New merlin example `restart_delay` that demonstrates usage of this feature.
- Condition failure reporting, to give greater insight into what caused test failure.
- New fields in config file: `celery.omit_queue_tag` and `celery.queue_tag`, for users who wish to have complete control over their queue names. This is a feature of the task priority change.
- `feature_demo` now uses `merlin-spellbook` instead of its own scripts.
- Removed the `--mpi=none` `srun` default launch argument. This can be added by setting the `launch_args` argument in the batch section in the spec.
- Merlin CI is now handled by Github Actions.
- Certain tests and source code have been refactored to abide by Flake8 conventions.
- Reorganized the `tests` module. Made `unit` dir alongside `integration` dir. Decomposed `run_tests.py` into 3 files with distinct responsibilities. Renamed `Condition` classes. Grouped cli tests by sub-category for easier developer interpretation.
- Lowered the command line test log level to "ERROR" to reduce spam in `--verbose` mode.
- Now prioritizing workflow tasks over task-expansion tasks, enabling improved scalability and server stability.
- Flake8 examination slightly modified for more generous cyclomatic complexity.
- `walltime` can be specified in any of hours:minutes:seconds, minutes:seconds or seconds format and will be correctly translated for the right batch system syntax
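
As a rough illustration of the three `walltime` formats accepted above, the sketch below normalizes a value to seconds and renders it back as `HH:MM:SS`. The function names are hypothetical and this is not Merlin's implementation; which output format each batch system expects is not stated in this changelog, so the `HH:MM:SS` rendering is only an assumed example.

```python
def walltime_to_seconds(walltime):
    """Convert 'H:M:S', 'M:S', or plain seconds (str or int) to total seconds.

    Hypothetical helper for illustration only.
    """
    parts = [int(p) for p in str(walltime).split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"Unrecognized walltime: {walltime!r}")
    seconds = 0
    for part in parts:
        # Each additional field to the right is a 60x smaller unit.
        seconds = seconds * 60 + part
    return seconds


def seconds_to_hms(total_seconds):
    """Render a number of seconds as zero-padded HH:MM:SS (assumed target format)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    for value in ("1:30:00", "90:00", "5400", 5400):
        print(value, "->", seconds_to_hms(walltime_to_seconds(value)))
```

All four inputs in the usage loop describe the same duration, so a translation layer built this way can emit one canonical form per batch system.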
- For Celery calls, explicitly wrapped queue string in quotes for robustness. This fixes a bug that occurred on tcsh but not bash in which square brackets in the queue name were misinterpreted and caused the command to break.
- Bug that caused steps to raise a fatal error (instead of soft failing) after maxing out step retries. Occurred if steps were part of a chord.
- Bug that caused step restarts to lose alternate shell specification, and associated CLI `restart_shell` test.
- Bug that caused example workflows with a variable reference in their name to be listed by `merlin example list` with variable reference notation.
- Bug that caused `requirements.txt` files to be excluded from generated `merlin example` dirs.
- Bug that causes step restarts to lose alternate shell specification. Also added CLI test for this case.
- Default broker server password is now `jackalope-password`, since `rabbit` is currently accessed by developers only.
- The first version of an optimization workflow, which can be accessed with `merlin example optimization`.
- Dev requirement library for finding dependencies (and `make reqlist` target)
- Improved warning and help messages about `no_errors`
- Pinned to celery>5.0.3 so that `merlin purge` works again
- Now requiring Celery version 5.x.
- Further improvements to the `null_spec` example.
- Users will no longer see the message, "Cannot set the submission time of '' because it has already been set", when tasks are restarted.
- Bug causing `merlin restart` to break.
- Improved internal logic beyond the crude fixes of the prior 2 patches.
- Added a developer cli test for the minimum valid spec format.
- Improvements to the `null_spec` example, used for measuring overhead in merlin. Includes a new `null_chain` and removes the now-redundant `sim_spec`.
- Completed 1.7.2 fix for `merlin run-workers`.
- Fatal bug triggered by a spec missing the `env` or `global.parameters` sections.
- When using the `--samplesfile` flag, the samples file is now copied to `merlin_info` for provenance.
- Exceptions in connection check sub-process will now be caught.
- The ability to override any value of the celery configuration thru `app.yaml` in `celery.override`.
- Support and faq entry for `pgen` with `merlin run --pgen` and optional `--parg`.
- Documentation on `level_max_dirs`.
- Easier-to-read provenance specs.
- Documentation on the new 3 types of provenance spec.
- Flux test example data collection for new versions of flux.
- Fixed Docker ubuntu version.
- Removed expansion of env variables in shell sections (`cmd` and `restart`) of provenance specs. This allows the shell command itself to expand environment variables, and gives users greater flexibility.
- Allowed environment variables to be properly expanded in study `description.name`.
- Tilde (~) now properly expands as part of a path in non-shell sections.
- The rediss cert_reqs keyword was changed to ssl_cert_reqs.
- Updated tutorial redis version to 6.0.5.
- The sample generation command now logs `stdout`, `stderr`, and `cmd.sh` to `merlin_info/`.
- 12 hidden test specs and associated cli tests, for cli tests with specs that we do not want in `merlin examples`.
- Inside `merlin_info/`, added provenance specs `<name>.orig.yaml`, `<name>.expanded.yaml`, and `<name>.partial.yaml` (identical to the original spec, but with expanded user variables).
- Updated to new celery (4.4.5) syntax for signature return codes.
- Corrected prior visibility timeout bugfix.
- Fixed and reactivated 3 cli tests.
- Added the `bank` and `walltime` keywords to the batch slurm launch, these will not alter the lsf launch.
- Slightly improved logic by using regex to match variable tokens.
- Reduced instances of I/O, `MerlinStudy` logic is now in-memory to a greater extent.
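
For readers unfamiliar with the `$(NAME)` variable tokens that the regex entry above refers to, here is a minimal sketch of matching and expanding them with a regular expression. The pattern and helper names are hypothetical and are not taken from Merlin's source; only the `$(NAME)` notation itself appears in this changelog.

```python
import re

# Assumed token shape: "$(" + word characters + ")".
TOKEN_PATTERN = re.compile(r"\$\((\w+)\)")


def find_tokens(text):
    """Return the names of all $(NAME) tokens in text, in order of appearance."""
    return TOKEN_PATTERN.findall(text)


def expand_tokens(text, values):
    """Replace each known $(NAME) token with its value; leave unknown tokens untouched."""

    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return TOKEN_PATTERN.sub(substitute, text)


if __name__ == "__main__":
    command = "echo $(X2) > $(OUTPUT_PATH)/out.txt"
    print(find_tokens(command))
    print(expand_tokens(command, {"X2": 3}))
```

Leaving unknown tokens in place, rather than raising, is a design choice of this sketch so that tokens handled at a later stage survive an early expansion pass.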
- Error if app.yaml does not have visibility timeout seconds.
- The broker name can now be amqps (with ssl) or amqp (without ssl).
- The encryption key will now be created when running merlin config.
- The merlin info connection check will now enforce a minute timeout check for the server connections.
- Added a check for initial running workers when using merlin monitor to eliminate race condition.
- A bug that did not change the filename of the output workspace nor of the provenance spec when a user variable was included in the `description.name` field.
- Temporarily locked Celery version at 4.4.2 to avoid fatal bug.
- The default rabbitmq vhost is now instead of /.
- Changed default visibility timeout from 2 hours to 24 hours. Exposed this in the config file.
- The monitor function will now check the queues to determine when to exit.
- Temporarily locked maestro version to avoid fatal bug introduced by maestro v1.1.7.
- A faq entry for --mpibind when using slurm on LC.
- Version of the openfoam workflow that works without docker.
- In 'merlin examples', a version of the openfoam workflow that works without docker.
- The batch system will now check LSB_MCPU_HOSTS to determine the number of nodes on blueos systems in case LSB_HOSTS is not present.
- A few typos, partially finished material, and developer comments in the tutorials.
- PEP-8 violations like unused imports, bad formatting, broken code.
- A bug where the batch shell was not overriding the default.
- Removed mysql dependencies and added sqlalchemy to the celery module.
- Removed mysql install from travis.
- Improved the celery worker error messages.
- The slurm launch for celery workers no longer uses the --pty option, this can be added by setting launch_args in the batch section.
- Adjusted wording in openfoam_wf_no_docker example.
- merlin example `null_spec`, which may be used to generate overhead data for merlin.
- The task creation bottleneck.
- Bug that caused the `cmd` stdout and stderr files of a step to be overwritten by that same step's `restart` section.
- Updated tutorial docs.
- Relocated code that ran upon import from file body to functions. Added the respective function calls.
- `HelpParser`, which automatically prints help messages when command line commands return an error.
- Optional ssl files for the broker and results backend config.
- A url keyword in the app.yaml file to override the entire broker or results backend configuration.
- The `all` option to `batch.nodes`.
- Auto zero-padding of sample directories, e.g. 00/00, 00/01 .. 10/10
- `$(MERLIN_STOP_WORKERS)` exit code that shuts down all workers
- The `merlin monitor` command, which blocks while celery workers are running. This can be used at the end of a batch submission script to keep the allocation alive while workers are present.
- The ~/.merlin dir will be searched for the results password.
- A warning whenever an unrecognized key is found in a Merlin spec; this may help users find small mistakes due to spelling or indentation more quickly.
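
The zero-padding of sample directories listed above (00/00, 00/01 .. 10/10) can be pictured with the short sketch below. It is an assumption-laden illustration, not Merlin's code: the helper name is hypothetical, and it simply pads every path component to the width of the largest index so that lexicographic and numeric ordering agree.

```python
def padded_sample_path(indices, max_index):
    """Join directory indices into a path, zero-padded to the width of max_index.

    Hypothetical helper for illustration only.
    """
    width = len(str(max_index))
    return "/".join(str(index).zfill(width) for index in indices)


if __name__ == "__main__":
    paths = [padded_sample_path((i, j), 10) for i in (0, 10) for j in (0, 1, 10)]
    print(paths)
    # With padding, plain string sorting matches numeric order.
    print(sorted(paths) == paths)
```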
- Bug that prevented an empty username for results backend and broker when using redis.
- Bug that prevented `OUTPUT_PATH` from being an integer.
- Slow sample speed in `hello_samples.yaml` from the hello example.
- Bug that always had sample directory tree start with "0"
- "Error" message whenever a non-zero return code is given
- The explicit results password (when not a file) will be read if certs path is None and it will be stripped of any whitespace.
- Misleading log message for `merlin run-workers --echo`.
- A few seconds of lag that occurred in all merlin cli commands; merlin was always searching thru workflow examples, even when user's command had nothing to do with workflow examples.
- Updated docs from `pip3 install merlinwf` to `pip3 install merlin`.
- Script launching uses Merlin submission instead of subclassing maestro submit
- `$(MERLIN_HARD_FAIL)` now shuts down only workers connected to the bad step's queue
- Updated all tutorial modules
- Updated various aspects of tutorial documentation.
- The walltime keyword is now enabled for the slurm and flux batch types.
- LAUNCHER keywords, (slurm,flux,lsf) for specifying arguments specific to that parallel launcher in the run section.
- Exception messages to `merlin info`.
- Preliminary tutorial modules for early testers.
- The problematic step `stop_workers` in `feature_demo.yaml`.
- Syntax errors in web doc file `merlin_variables.rst`.
- The exclusive and signal keywords and bind for slurm in a step. The bind keyword is now lsf only.
- cli test flag `--local`, which can be used in place of listing out the id of each local cli test.
- A Merlin Dockerfile and some accompanying web documentation.
- Makefile target `release`.
- The merlin config now takes an optional --broker argument, the value can be None, default rabbitmq broker, or redis for a redis local broker.
- Missing doc options for run and run-workers.
- Check server access when `merlin info` is run.
- A port option to rabbitmq config options.
- Author and author_email to setup.py.
- Makefile targets `pull` and `update`.
- Unneeded variables from `simple_chain.yaml`.
- All `INFO`-level logger references to Celery.
- Updated the Merlin Sphinx web docs, including info about command line commands.
- Example workflows use python3 instead of python.
- Updated `merlin info` to look up python3 and pip3.
- Altered user in Dockerfile and removed build tools.
- MANIFEST.in now uses recursive-include.
- Updated docker docs.
- `make clean` is more comprehensive, now cleans docs, build files, and release files.
- The celery keyword is no longer required in `app.yaml`.
- Adjusted `maestrowf` requirement to `maestrowf>=1.1.7dev0`.
- Unused directory `workflows/` at the top level (not to be confused with `merlin/examples/workflows/`)
- Bug related to missing package `merlin/examples/workflows` in PYPI distribution.
- Bug related to a missing path in `MANIFEST.in`.
- Error message when trying to run merlin without the app config file.
- `version_tests.sh`, for CI checking that the merlin version is incremented before changes are merged into master.
- Allow for the maestro `$(LAUNCHER)` syntax in tasks, this requires the nodes and procs variables in the task just as in maestro. The LAUNCHER keyword is implemented for flux, lsf, slurm and local types. The lsf type will use the LLNL srun wrapper for jsrun when the lsf-srun batch type is used. The flux version will be checked to determine the proper format of the parallel launch call.
- Local CLI tests for the above `$(LAUNCHER)` feature.
- `machines` keyword, in the `merlin.resources.workers.<name>` section. This allows the user to assign workers (and thence, steps) to a given machine. All of the machines must have access to the `OUTPUT_PATH`. The steps list is mandatory for all workers. Once the machines are added, then only the workers for the given set of steps on the specific machine will start. The workers must be individually started on all of the listed machines separately by the user (`merlin run-workers`).
- New step field `restart`. This command runs when merlin receives a `$(MERLIN_RESTART)` exception. If no `restart` field is found, the `cmd` command re-runs instead.
- A bug in the `flux_test` example workflow.
- Improved the `fix-style` dev Makefile target.
- Improved the `version` dev Makefile target.
- Updated travis logic.
- `MERLIN_RESTART` (which re-ran the `cmd` of a step) has been renamed to `MERLIN_RETRY`.
- Makefile target `version` for devs to auto-increment the version.
- Development dependencies install via pip: `pip install "merlinwf[dev]"`.
- `merlin status <yaml spec>` that returns queues, number of connected workers and number of unused tasks in each of those queues.
- `merlin example` cli command, which allows users to start running the examples immediately (even after pip-installing).
- `MANIFEST.in` fixes as required by Spack.
- `requirements.txt` just has release components, not dev deps.
- A bug related to the deprecated word 'unicode' in `openfilelist.py`.
- Broken Merlin logo image on PyPI summary page.
- Documentation typos.
- Made `README.md` more concise and user-friendly.
- Dependencies outside the requirements directory.
- LLNL-specific material in the Makefile.
- `merlin-templates` cli command, in favor of `merlin example`.
- Change the form of the maestrowf git requirement.
- `requirements.txt` files and `scripts` directories for internal workflow examples.
- Added missing dependency `tabulate` to `release.txt`.
Added the requirements files to the MANIFEST.in file for source distributions.
Negligible changes related to PyPI.
Negligible changes related to PyPI.
First public release. See the docs and merlin -h for details. Here are some highlights.
- A changelog!
- Templated workflows generator. See `merlin-templates`
- Steps in any language. Set with study.step.run.shell in spec file. For instance:
```yaml
- name: python2_hello
  description: |
    do something in python2
  run:
    cmd: |
      print "OMG is this in python2?"
      print "Variable X2 is $(X2)"
    shell: /usr/bin/env python2
    task_queue: pyth2_hello
```
- Integration testing `make cli-tests`
- Style rules (isort and black). See `make check-style` and `make fix-style`
- Dry-run ability for workflows, which will cause workers to setup workspaces, but skip execution (all variables will be expanded). Eg: `merlin run --local --dry-run`.
- Command line override of variable names. `merlin run --vars OUTPUT_PATH=/run/here/instead`
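
To make the `--vars NAME=VALUE` override above more concrete, the sketch below shows one plausible way such pairs could be parsed and layered over variables defined in a spec. The helper names and the example spec values are hypothetical; this is not Merlin's actual parsing code.

```python
def parse_overrides(pairs):
    """Turn ['NAME=VALUE', ...] into a dict, splitting on the first '=' only."""
    overrides = {}
    for pair in pairs:
        name, separator, value = pair.partition("=")
        if not separator or not name:
            raise ValueError(f"Expected NAME=VALUE, got {pair!r}")
        overrides[name] = value
    return overrides


def apply_overrides(spec_vars, overrides):
    """Return a copy of spec_vars with command-line overrides taking precedence."""
    merged = dict(spec_vars)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    # Hypothetical variables as they might appear in a spec's env section.
    spec_vars = {"OUTPUT_PATH": "./studies", "N_SAMPLES": "10"}
    overrides = parse_overrides(["OUTPUT_PATH=/run/here/instead"])
    print(apply_overrides(spec_vars, overrides))
```

Splitting on only the first `=` keeps values that themselves contain `=` intact.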
- MerlinSpec class subclasses from Maestro
- `merlin kill-workers`. Use `merlin stop-workers`
- Dependencies on optional tools
- Unused fields in example workflows
- Multi-type samples (eg strings as well as floats)
- Single sample and single feature samples
- Auto-encryption of backend traffic