
Add Al-Zn-Cu-Mg metallurgy regression benchmark and metrics #569

Open
daniel-sintef wants to merge 11 commits into ddmms:main from DanielMarchand:add-alzncumg-metallurgy-tests


@daniel-sintef

This pull request introduces a new "Alloy Metallurgy" benchmark suite, focused on multi-property regression tests for Al-Cu-Mg-Zn alloys, into the documentation, analysis, and app layers. It provides a new user guide section, metrics definitions, test helpers, app configuration, and data for this benchmark, and integrates it into the existing benchmarking infrastructure.

Alloy Metallurgy Benchmark Integration

Documentation and User Guide

  • Added a new user guide section, alloy_metallurgy.rst, describing the scope, data provenance, and initial implementation of the Al-Zn-Cu-Mg regression benchmark for metallic alloys. Also registered this benchmark in the main benchmarks index.

Benchmark Metrics and Data

  • Introduced metrics.yml for the new benchmark, specifying error metrics (formation energy, volume, lattice constants, beta angle, solute-solute binding, and elastic properties) with thresholds, units, and tooltips.
  • Added a metrics results table in JSON format for the new benchmark, including metric definitions, tooltips, thresholds, and sample results.
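
As a rough illustration of what a thresholded error metric involves, here is a minimal sketch. It is not the code in this PR: the function names, the linear good/bad scoring, and the numbers are all assumptions made for the example.

```python
def mean_absolute_error(predicted, reference):
    """Return the mean absolute error between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def normalised_score(value, good, bad):
    """Map an error onto [0, 1]: 1 at or below `good`, 0 at or above `bad`.

    Hypothetical scoring rule; the thresholds in metrics.yml may be applied
    differently by the real analysis code.
    """
    if good == bad:
        raise ValueError("good and bad thresholds must differ")
    fraction = (bad - value) / (bad - good)
    return min(1.0, max(0.0, fraction))


# Made-up formation energies in eV/atom, for illustration only.
dft_reference = [-0.10, -0.05, 0.02]
model_prediction = [-0.08, -0.06, 0.05]

error = mean_absolute_error(model_prediction, dft_reference)
score = normalised_score(error, good=0.0, bad=0.05)
```

With these illustrative inputs the error is about 0.02 eV/atom, which the assumed linear rule maps to a score of about 0.6.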

App and Analysis Code

  • Implemented a new Dash app (app_alzncumg_regression.py) for visualizing and interacting with the Al-Zn-Cu-Mg regression benchmark, including plot and structure display, and dynamic callback registration based on available metrics.
  • Added a YAML configuration for the new benchmark in the app registry.

Testing and Infrastructure

  • Added comprehensive test helpers and unit tests for the analysis of the new benchmark, covering data loading, metric calculation, and optional metrics/plots.
  • Updated the model selection logic in build_table_wrapper to use the current models list, ensuring correct model display for new benchmarks.


* Saal, Kirklin, Aykol, Meredig, and Wolverton, "Materials Design and Discovery with High-Throughput Density Functional Theory: The Open Quantum Materials Database (OQMD)", JOM 65, 1501-1509 (2013). doi:10.1007/s11837-013-0755-4
* Kirklin, Saal, Meredig, Thompson, Doak, Aykol, Ruhl, and Wolverton, "The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies", npj Computational Materials 1, 15010 (2015). doi:10.1038/npjcompumats.2015.10

Collaborator

Please add the required docs sections; see e.g. https://ddmms.github.io/ml-peg/user_guide/benchmarks/surfaces.html

We're mainly missing metrics and computational cost.

Author

OK, that sounds good, thanks for taking a look! (Sorry, I forgot to put it into draft mode first.)

Collaborator

no worries!

@daniel-sintef marked this pull request as draft May 25, 2026 18:52
@joehart2001
Collaborator

Sorry, I forgot to press comment on this:
Thanks for the PR @daniel-sintef! Some initial things:

  • I've uploaded your data to the S3 bucket, so you'll be able to download it with:
    oqmd_path = (
        download_s3_data(
            key="inputs/alloy_metallurgy/alzncumg_regression/alzncumg_regression.zip",
            filename="alzncumg_regression.zip",
        )
        / "alzncumg_regression"
        / "structures"
        / "OQMD-Dumps"
    )
  • I think you may have missed running pre-commit; you'll need to make sure you've installed it, and then you can run it.
  • I don't think we need the changes to decorators.py, cli.py, models.yml, .gitignore and uv.lock. If you require additional packages, please update pyproject.toml; otherwise we can leave out the uv.lock changes.
  • Thank you for providing outputs/ and data/, but I will run some tests myself, so we can remove these from the PR. Do you have a recommendation for testing, e.g. parameters to alter so I can run it quickly?

@ElliottKasoar added the "new benchmark" label (Proposals and suggestions for new benchmarks) May 26, 2026
…bindings

- Changed atomic relaxation algorithm from FIRE to BFGS to match evalpot's `relax_atoms_only`.
- Adjusted `max_index` in shell slicing to exactly mimic evalpot pair shell distances.
- Adjusted vacancy generation order so that shell atom evaluates vacancies correctly.
- Enabled compatibility with sorted evalpot legacy DFT dictionary keys in analysis.
- Install pre-commit and run uv run pre-commit install
- Add numpydoc-validation test-file exclusion to .pre-commit-config.yaml
- Expand 32 one-liner docstrings in calc_alzncumg_regression.py to full
  numpydoc format with Parameters and Returns sections
- Fix _plot_values nested function docstring in analyse_alzncumg_regression.py
- Fix C408 dict() -> dict literal in analyse_alzncumg_regression.py
- Add D103 docstrings to write_custom_* functions in analyse module
- All 7 pre-commit hooks now pass: fix-end-of-files, mixed-line-ending,
  trim-trailing-whitespace, check-json, ruff check, ruff-format,
  numpydoc-validation
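
For readers unfamiliar with the two lint fixes listed above, the following is a minimal sketch of what a C408 rewrite and a numpydoc-style docstring look like. The function and variable names are hypothetical and do not come from analyse_alzncumg_regression.py or calc_alzncumg_regression.py.

```python
# Before: ruff rule C408 flags a dict() call that could be a literal.
layout_before = dict(title="Formation energy", showlegend=True)

# After: the equivalent dict literal, which C408 accepts.
layout_after = {"title": "Formation energy", "showlegend": True}


def summarise_errors(errors):
    """
    Summarise a list of absolute errors.

    Parameters
    ----------
    errors : list of float
        Absolute errors for one metric.

    Returns
    -------
    dict
        Mapping with the ``mean`` and ``max`` of the errors.
    """
    return {"mean": sum(errors) / len(errors), "max": max(errors)}
```

Both dictionaries compare equal, so the C408 change is purely stylistic.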
Reverted changes to ml_peg/cli/cli.py and ml_peg/analysis/utils/decorators.py
per PR review feedback. These touched pre-existing code outside the benchmark
scope (--models propagation fix and build_table model filtering) and the new
alzncumg_regression benchmark does not require them.
Side-effect downgrade of janus-core (0.9.3 -> 0.9.1) introduced by running
uv sync with the mace extra locally during validation. No new packages are
required by this benchmark; mace-torch is already an optional extra in
pyproject.toml on main.
- outputs/mace-mp-small/ and outputs/mock/ (all *.json and *.xyz)
- ml_peg/app/data/alloy_metallurgy/ (Plotly JSONs, metrics table, *.xyz)

Trims the corresponding .gitignore exceptions for outputs/**
and app/data/alloy_metallurgy/.

Kept in the repo (required static inputs):
- data/references/DFT.json: authoritative DFT baseline for all metrics
- data/structures/OQMD-Dumps/: 8 VASP POSCARs + 8 OQMD metadata JSONs (CC-BY 4.0)
- data/structures/special/AIIDA_339739 and AIIDA_481617: GSF base cell POSCARs
- .gitignore exceptions for data/references/*.json and data/structures/OQMD-Dumps/*.json
@daniel-sintef
Author

Hey, can you also add the "AIIDA_339739" and "AIIDA_481617" structures? I need them for the theta & theta'' structures. Thanks!

- Add _data_root() + download_s3_data() wiring in calc and analyse modules
  so OQMD structures and DFT.json are fetched from S3 on first use
- Remove all committed data files from git tracking (git rm --cached):
  data/references/DFT.json, data/structures/OQMD-Dumps/ (18 files),
  data/structures/special/AIIDA_339739 and AIIDA_481617
  (files remain on disk; AIIDA POSCARs to be added to S3 zip by reviewer)
- Remove .gitignore exceptions for data/references/*.json and
  data/structures/OQMD-Dumps/*.json (no longer needed)
- Remove module-level DATA_PATH and SPECIAL_STRUCTURE_PATH constants
- Read special VASP structures from _data_root() / structures / special /
  matching the S3 zip layout:
    alzncumg_regression/structures/special/AIIDA_339739
    alzncumg_regression/structures/special/AIIDA_481617
- Add inline comment flagging the expected S3 zip path so the
  reviewer knows where to place the files in the zip
…urgy.rst

Replace bespoke prose with the four-section structure used by all other
benchmark pages:
- Summary: overview of Al-Cu-Mg-Zn alloy regression suite
- Metrics: 12 numbered metrics grouped into Bulk properties, Surface and
  fault energies, Elastic constants (slow), Solute-solute binding (very_slow)
- Computational cost: per-test cost estimate with pytest marker noted
- Data availability: OQMD structures, non-OQMD precipitate cells, special
  GSF base cells, DFT.json reference, and OQMD citations
@daniel-sintef
Author

OK, other than the "AIIDA_339739" and "AIIDA_481617" structures, which should be accessible from the S3 zip once they are added to alzncumg_regression/structures/special/ in the zip (see my earlier comment), I think it's ready for you to take another look. Tests take about 5 minutes to run if you use mace-small (I think).

@daniel-sintef marked this pull request as ready for review June 1, 2026 07:16