
[WIP] ElementEmbeddings integration#442

Merged
KingaMas merged 10 commits into develop from feat/io_element_embeddings on Feb 24, 2026

Conversation

AntObi (Collaborator) commented Jul 31, 2025

ElementEmbeddings Integration

Description

  • Wrappers for the ElementEmbeddings library

Type of change

  • New feature (non-breaking change which adds functionality)

  • This change requires a documentation update

How Has This Been Tested?

(TODO)

  • Test A
  • Test B

Test Configuration:

  • Python version: 3.11
  • Operating System: macOS

Reviewers

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Summary by CodeRabbit

Release Notes

  • New Features

    • Added element embeddings integration tutorial demonstrating feature vector generation for chemical compositions.
    • Introduced composition featuriser functionality for converting chemical formulas to numerical feature vectors.
    • Added SmactFilterOutputs enum enabling flexible formatting of chemical formula screening results.
    • Enabled SmactStructure instances to be used in sets and as dictionary keys via hashing.
  • Documentation

    • Updated README with additional resource badge.
  • Chores

    • Updated build tool and linting dependencies to latest versions.

codecov bot commented Jul 31, 2025

Codecov Report

❌ Patch coverage is 93.33333% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.30%. Comparing base (6931fbf) to head (7929cc1).
⚠️ Report is 60 commits behind head on develop.

Files with missing lines Patch % Lines
smact/structure_prediction/structure.py 58.33% 5 Missing ⚠️
smact/io/elementembeddings.py 90.24% 4 Missing ⚠️
smact/structure_prediction/database.py 50.00% 3 Missing ⚠️
smact/utils/compat.py 87.50% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #442      +/-   ##
===========================================
+ Coverage    80.66%   81.30%   +0.64%     
===========================================
  Files           33       36       +3     
  Lines         2871     3055     +184     
===========================================
+ Hits          2316     2484     +168     
- Misses         555      571      +16     


coderabbitai bot (Contributor) commented Jul 31, 2025

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Introduces element embeddings integration via new wrapper module, adds Python 3.11 StrEnum compatibility backport, enhances smact_filter with flexible output formatting, adds composition dictionary conversion utility, expands test coverage, and updates project dependencies and configurations.

Changes

Cohort / File(s) Summary
Element Embeddings Integration
docs/tutorials/element_embeddings_integration.ipynb, smact/io/__init__.py, smact/io/elementembeddings.py
New tutorial notebook and wrapper module for ElementEmbeddings library, exposing composition_featuriser and species_composition_featuriser with configurable embeddings and pooling statistics.
Screening Output Enhancements
smact/screening.py, smact/utils/composition.py
Added SmactFilterOutputs enum to smact_filter enabling flexible return formats (default tuples, formula strings, composition dictionaries); new composition_dict_maker utility for dictionary conversion.
Compatibility & Utilities
smact/utils/compat.py, smact/utils/crystal_space/generate_composition_with_smact.py
Python 3.11 StrEnum backport for older environments; oxidation_states_set default changed to "icsd24" with improved type hints; DataFrame variable renames for clarity.
Test Expansion
smact/tests/test_elementembeddings.py, smact/tests/test_utils.py, smact/tests/test_core.py
Comprehensive tests for element embeddings enums and delegation; expanded smact_filter output format tests; minor test refinements and import consolidation.
Structure Enhancements
smact/structure_prediction/structure.py, smact/structure_prediction/probability_models.py, smact/data_loader.py
Added `__hash__()` method to SmactStructure with coordinate rounding; tightened equality comparison with rtol=0; variable renames (df → lambda_df, magpie_df, valence_df).
Configuration & Dependencies
.github/dependabot.yml, .pre-commit-config.yaml, pyproject.toml, requirements.txt, requirements/requirements-py310.txt, README.md
Dependabot schedule changes (daily for GitHub Actions, weekly for pip); pre-commit tool version updates; added tornado dependency; updated font tools; platform condition normalisation.
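The walkthrough above mentions a "Python 3.11 StrEnum compatibility backport" in smact/utils/compat.py. The actual module isn't shown in this thread, but such backports commonly follow one well-known pattern; the sketch below is an illustration of that pattern, not the smact source:

```python
# Minimal sketch of a StrEnum backport of the kind described in the
# walkthrough (smact/utils/compat.py); the real module may differ.
import sys
from enum import Enum

if sys.version_info >= (3, 11):
    from enum import StrEnum
else:
    class StrEnum(str, Enum):
        """Backport: members are also str instances (Python < 3.11)."""

        def __str__(self) -> str:
            return str(self.value)

        @staticmethod
        def _generate_next_value_(name, start, count, last_values):
            # Match CPython's StrEnum: auto() yields the lower-cased member name.
            return name.lower()
```

Because the members subclass `str`, downstream code can compare them directly against plain strings on any supported Python version.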

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

docs, feature, tests, python, dependencies

Poem

🐰 A rabbit's ode to embeddings fair,
Element features floating in the air,
Screening, filtering, vectors combined,
Composition secrets now aligned,
From magpie to xenonpy's glow,
Feature extraction steals the show!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 68.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The description is incomplete. While it mentions the main feature and type of change, the 'How Has This Been Tested?' section contains only placeholders marked as TODO, with no actual test details provided. The description lacks specifics about what was tested and how. Complete the testing section with actual test details, specify which tests were run, provide reproduction steps, and detail the complete test configuration used to validate the changes.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title '[WIP] ElementEmbeddings integration' accurately describes the main change: adding wrappers for ElementEmbeddings integration. It is clear, specific, and directly related to the changeset.


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
smact/utils/composition.py (1)

96-105: Fix type annotation inconsistency in docstring.

The function signature correctly shows tuple[str, int, int] | tuple[str, int] but the docstring only mentions tuple[str, int, int].

    Args:
-        smact_filter_output (tuple[str, int, int]): An item in the list returned from smact_filter
+        smact_filter_output (tuple[str, int, int] | tuple[str, int]): An item in the list returned from smact_filter

    Returns:
        composition_dict (dict[str, float]): An composition dictionary

The implementation correctly reuses comp_maker and follows the same pattern as formula_maker.

smact/screening.py (2)

337-338: Consider refactoring to reduce code duplication in match statements.

The match statements at lines 426-433 and 438-444 are nearly identical. Consider extracting the format conversion logic into a helper function.

+def _format_compositions(compositions, return_output):
+    """Helper function to format compositions based on return_output type."""
+    match return_output:
+        case SmactFilterOutputs.default:
+            return compositions
+        case SmactFilterOutputs.formula:
+            return [formula_maker(smact_filter_output=comp) for comp in compositions]
+        case SmactFilterOutputs.dict:
+            return [composition_dict_maker(smact_filter_output=comp) for comp in compositions]

    # Return list depending on whether we are interested in unique species combinations
    # or just unique element combinations.
    if species_unique:
-        match return_output:
-            case SmactFilterOutputs.default:
-                return compositions
-            case SmactFilterOutputs.formula:
-                return [formula_maker(smact_filter_output=comp) for comp in compositions]
-            case SmactFilterOutputs.dict:
-                return [composition_dict_maker(smact_filter_output=comp) for comp in compositions]
+        return _format_compositions(compositions, return_output)

    else:
        compositions = [(i[0], i[2]) for i in compositions]

        compositions = list(set(compositions))
-        match return_output:
-            case SmactFilterOutputs.default:
-                return compositions
-            case SmactFilterOutputs.formula:
-                return [formula_maker(smact_filter_output=comp) for comp in compositions]
-            case SmactFilterOutputs.dict:
-                return [composition_dict_maker(smact_filter_output=comp) for comp in compositions]
+        return _format_compositions(compositions, return_output)

354-354: Improve docstring formatting and add return type documentation.

The docstring should better document the different return types based on the return_output parameter.

-        return_output (SmactFilterOutputs): If set to 'default', the function will return a list of tuples containing the tuples of symbols, oxidation states and stoichiometry values. "Formula" returns a list of formulas and "dict" returns a list of dictionaries.
+        return_output (SmactFilterOutputs): Controls the output format:
+            - 'default': List of tuples containing symbols, oxidation states and stoichiometry values
+            - 'formula': List of chemical formula strings  
+            - 'dict': List of composition dictionaries

    Returns:
    -------
-        allowed_comps (list): Allowed compositions for that chemical system
-        in the form [(elements), (oxidation states), (ratios)] if species_unique=True and tuple=False
-        or in the form [(elements), (ratios)] if species_unique=False and tuple=False.
+        allowed_comps (list): Allowed compositions for that chemical system.
+        Return type depends on return_output parameter:
+        - SmactFilterOutputs.default: List[tuple] with format depending on species_unique
+        - SmactFilterOutputs.formula: List[str] of chemical formulas
+        - SmactFilterOutputs.dict: List[dict] of composition dictionaries
smact/io/elementembeddings.py (2)

17-31: Address the TODO comment about moving enums to ElementEmbeddings.

The comment suggests this enum should be moved to the ElementEmbeddings codebase. Consider whether this duplication is necessary or if the enums should be imported from ElementEmbeddings directly.

If these enums are temporary, consider adding a TODO comment with a timeline or GitHub issue reference. If they're permanent, remove the comment and ensure they stay in sync with ElementEmbeddings capabilities.


39-49: Consider the same enum consolidation for PoolingStats.

Similar to the previous enum, this also has a comment suggesting it should be moved to ElementEmbeddings codebase.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 515a369 and 403bcea.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • docs/tutorials/element_embeddings_integration.ipynb (1 hunks)
  • smact/io/__init__.py (1 hunks)
  • smact/io/elementembeddings.py (1 hunks)
  • smact/screening.py (5 hunks)
  • smact/utils/composition.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
smact/screening.py (1)
smact/structure_prediction/structure.py (1)
  • composition (575-595)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: test (3.12, macos-latest)
  • GitHub Check: test (3.13, ubuntu-latest)
  • GitHub Check: test (3.13, macos-latest)
🔇 Additional comments (6)
smact/io/__init__.py (1)

1-1: LGTM! Clean package initialisation.

The docstring appropriately describes the purpose of this package as an interface to external libraries.

docs/tutorials/element_embeddings_integration.ipynb (2)

732-753: Excellent tutorial demonstrating the integration workflow.

This cell effectively shows the complete workflow from element screening to feature vector generation. The imports are correct and the progression from smact_filter with the new SmactFilterOutputs.formula option to composition_featuriser demonstrates the practical value of the integration.


898-902: Good demonstration of the raw output format.

This second cell shows users what the actual feature vector arrays look like, which is valuable for understanding the output format.

smact/screening.py (1)

27-33: Well-designed enum for output format options.

The SmactFilterOutputs enum provides clear, descriptive options for the different return formats. The use of auto() ensures consistent string values.

smact/io/elementembeddings.py (2)

70-102: Well-documented wrapper function with clear parameter descriptions.

The species_composition_featuriser function provides good documentation and maintains consistency with the ElementEmbeddings interface. The parameter passing is straightforward and appropriate.


53-67: Verify ElementEmbeddings wrapper parameters

Please ensure that our composition_featuriser wrapper remains aligned with the upstream ElementEmbeddings API:

  • Confirm that ee_composition_featuriser in elementembeddings.composition actually expects parameters named data, formula_column, embedding, stats and inplace.
  • Verify that the allowed types (pd.DataFrame | pd.Series | CompositionalEmbedding | list for data; Embedding | AllowedElementEmbeddings for embedding; and PoolingStats | list[PoolingStats] for stats) match the upstream function’s signature.
  • Pin and document the minimum ElementEmbeddings version that our wrapper was tested against, so future releases don’t silently break this interface.
  • Add an integration test (or CI check) that exercises the wrapper end-to-end with a known ElementEmbeddings version to catch breaking changes early.

Comment thread smact/screening.py Outdated
KingaMas (Member) commented Jan 5, 2026

@AntObi do you need any help increasing patch coverage?

AntObi (Collaborator, Author) commented Jan 5, 2026

> @AntObi do you need any help increasing patch coverage?

I've forgotten about this PR from the hackathon. I think the code here pretty much handles the ElementEmbeddings integration. The only additional thing required would be writing up unit tests.
If you're able to, it'd be much appreciated if you could write up some unit tests @KingaMas

KingaMas and others added 3 commits February 23, 2026 15:52
…ocks

- Extract _format_output() helper in screening.py to deduplicate match blocks
- Add 11 unit tests for smact.io.elementembeddings (enums, import check, featurisers)
- Add smact_filter return_output tests in test_utils.py
- Consolidate StrEnum backport into smact/utils/compat.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
coderabbitai bot (Contributor) left a comment

🧹 Nitpick comments (5)
smact/io/elementembeddings.py (2)

15-23: _check_elementembeddings: the # noqa: F401 is unnecessary and the error message is clear.

Ruff reports the noqa directive on line 18 is unused (F401 rule is not enabled). The import is only for its side-effect (checking availability), so the noqa can be safely removed.

🧹 Suggested cleanup
-        import elementembeddings  # noqa: F401
+        import elementembeddings
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/io/elementembeddings.py` around lines 15 - 23, Remove the unnecessary
"# noqa: F401" in the _check_elementembeddings function: the import of
elementembeddings is intentionally only to test availability, so delete the noqa
directive from the import line inside _check_elementembeddings to satisfy Ruff
and keep the clear ImportError message unchanged.

26-28: Track the TODO: "Should be moved to element embeddings codebase".

Lines 26 and 48 contain comments indicating these enums should live in the upstream ElementEmbeddings package. Consider creating a tracking issue so this doesn't get lost, especially since the PR is already WIP.

Would you like me to open an issue to track migrating AllowedElementEmbeddings and PoolingStats upstream?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/io/elementembeddings.py` around lines 26 - 28, Create a tracking issue
in the repository to migrate the enums to the upstream ElementEmbeddings package
and reference it in the code: open an issue titled like "Migrate
AllowedElementEmbeddings and PoolingStats to ElementEmbeddings package"
describing both symbols (AllowedElementEmbeddings, PoolingStats), why they
should move, and assign labels/assignee; then add a TODO comment in the code
next to the existing comments for AllowedElementEmbeddings and PoolingStats that
includes the new issue number or URL so the migration task is discoverable and
not lost.
smact/screening.py (1)

449-452: Potential data loss when species_unique=False combined with return_output="dict".

When species_unique=False, compositions are de-duplicated into 2-tuples (symbols, ratio) — dropping oxidation states. The composition_dict_maker and formula_maker then receive these 2-tuples instead of 3-tuples.

This is fine as long as both utility functions handle 2-element tuples correctly (the tests in test_utils.py suggest they do), but users should be aware that return_output="dict" with species_unique=False will produce element-keyed dictionaries (e.g. {"Fe": 1, "O": 1}) rather than species-keyed ones (e.g. {"Fe2+": 1, "O2-": 1}). It might be worth noting this in the docstring for return_output.
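The information loss described above can be seen with pure Python on hypothetical filter output (the tuple shapes mirror the comment's description, not actual smact data): once compositions are collapsed to (symbols, ratios) 2-tuples, the oxidation states are gone, so any later dictionary conversion can only be element-keyed.

```python
# Hypothetical smact_filter-style output: (symbols, oxidation states, ratios).
compositions = [
    (("Fe", "O"), (2, -2), (1, 1)),
    (("Fe", "O"), (3, -2), (2, 3)),
]

# species_unique=False branch as described: keep only symbols and ratios.
deduped = list({(c[0], c[2]) for c in compositions})

# Only element-keyed dicts can be rebuilt from the 2-tuples;
# the oxidation-state labels are no longer available.
as_dicts = [dict(zip(symbols, ratios)) for symbols, ratios in deduped]
print(as_dicts)  # two element-keyed dicts; set order is not guaranteed
```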

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/screening.py` around lines 449 - 452, The code de-duplicates
compositions into 2-tuples when species_unique is False (compositions =
list({(i[0], i[2]) for i in compositions})), which strips oxidation states and
means downstream helpers (_format_output -> composition_dict_maker /
formula_maker) will produce element-keyed dictionaries when return_output="dict"
rather than species (oxidation-state)-keyed ones; update the docstring for the
function that exposes return_output to explicitly document this behavior
(mention species_unique=False yields element-keyed dicts like {"Fe":1,"O":1}
instead of {"Fe2+":1,"O2-":1}) so users are aware of the data loss of
oxidation-state labels.
smact/tests/test_elementembeddings.py (1)

78-84: Minor DRY opportunity: _setup_mocks is duplicated across two test classes.

TestCompositionFeaturiser._setup_mocks and TestSpeciesCompositionFeaturiser._setup_mocks are nearly identical, differing only in which attribute is set on mock_comp_mod and the return value string. Consider extracting a shared helper or base class to reduce duplication, though this is a minor nit for a test file.

Also applies to: 138-144

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/tests/test_elementembeddings.py` around lines 78 - 84, Extract the
duplicated mock setup into a shared helper or base class to DRY the tests:
factor the logic in _setup_mocks used by TestCompositionFeaturiser and
TestSpeciesCompositionFeaturiser into a single helper (e.g., a module-level
function create_mocks(mock_attr_name, return_value) or a TestBase class with a
_setup_mocks that accepts which attribute to set), then have each test class
call that helper or inherit and pass
"composition_featuriser"/"species_composition_featuriser" and the respective
return strings ("featurised_result"/"species_featurised_result") so only the
differing attribute name and return value remain configurable while reusing the
shared MagicMock creation.
smact/tests/test_utils.py (1)

172-176: Consider testing with a valid string that matches an enum value.

The invalid-input test is good. You might also want a test that passes a plain string like "formula" (not the enum member) to confirm the StrEnum equality semantics work end-to-end — this would validate that users can pass raw strings as a convenience.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/tests/test_utils.py` around lines 172 - 176, Add a positive test that
verifies smact_filter accepts a plain string matching the StrEnum value (e.g.,
"formula") and behaves the same as passing the enum member; specifically, create
a new test (alongside test_smact_filter_invalid_return_output) that calls
smact_filter(els, threshold=2, return_output="formula") and asserts the returned
output equals the result when calling smact_filter(els, threshold=2,
return_output=ReturnOutput.FORMULA) (or otherwise checks the same output
shape/value), referencing the smact_filter function and the ReturnOutput enum to
validate string-to-enum equality handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@smact/io/elementembeddings.py`:
- Around line 15-23: Remove the unnecessary "# noqa: F401" in the
_check_elementembeddings function: the import of elementembeddings is
intentionally only to test availability, so delete the noqa directive from the
import line inside _check_elementembeddings to satisfy Ruff and keep the clear
ImportError message unchanged.
- Around line 26-28: Create a tracking issue in the repository to migrate the
enums to the upstream ElementEmbeddings package and reference it in the code:
open an issue titled like "Migrate AllowedElementEmbeddings and PoolingStats to
ElementEmbeddings package" describing both symbols (AllowedElementEmbeddings,
PoolingStats), why they should move, and assign labels/assignee; then add a TODO
comment in the code next to the existing comments for AllowedElementEmbeddings
and PoolingStats that includes the new issue number or URL so the migration task
is discoverable and not lost.

In `@smact/screening.py`:
- Around line 449-452: The code de-duplicates compositions into 2-tuples when
species_unique is False (compositions = list({(i[0], i[2]) for i in
compositions})), which strips oxidation states and means downstream helpers
(_format_output -> composition_dict_maker / formula_maker) will produce
element-keyed dictionaries when return_output="dict" rather than species
(oxidation-state)-keyed ones; update the docstring for the function that exposes
return_output to explicitly document this behavior (mention species_unique=False
yields element-keyed dicts like {"Fe":1,"O":1} instead of {"Fe2+":1,"O2-":1}) so
users are aware of the data loss of oxidation-state labels.

In `@smact/tests/test_elementembeddings.py`:
- Around line 78-84: Extract the duplicated mock setup into a shared helper or
base class to DRY the tests: factor the logic in _setup_mocks used by
TestCompositionFeaturiser and TestSpeciesCompositionFeaturiser into a single
helper (e.g., a module-level function create_mocks(mock_attr_name, return_value)
or a TestBase class with a _setup_mocks that accepts which attribute to set),
then have each test class call that helper or inherit and pass
"composition_featuriser"/"species_composition_featuriser" and the respective
return strings ("featurised_result"/"species_featurised_result") so only the
differing attribute name and return value remain configurable while reusing the
shared MagicMock creation.

In `@smact/tests/test_utils.py`:
- Around line 172-176: Add a positive test that verifies smact_filter accepts a
plain string matching the StrEnum value (e.g., "formula") and behaves the same
as passing the enum member; specifically, create a new test (alongside
test_smact_filter_invalid_return_output) that calls smact_filter(els,
threshold=2, return_output="formula") and asserts the returned output equals the
result when calling smact_filter(els, threshold=2,
return_output=ReturnOutput.FORMULA) (or otherwise checks the same output
shape/value), referencing the smact_filter function and the ReturnOutput enum to
validate string-to-enum equality handling.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a34e4e8 and 2570fb5.

📒 Files selected for processing (6)
  • smact/io/elementembeddings.py
  • smact/screening.py
  • smact/tests/test_elementembeddings.py
  • smact/tests/test_utils.py
  • smact/utils/compat.py
  • smact/utils/composition.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • smact/utils/composition.py

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 6

> [!CAUTION]
> Some comments are outside the diff and can't be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
smact/utils/crystal_space/generate_composition_with_smact.py (1)

154-154: ⚠️ Potential issue | 🔴 Critical

generate_composition_with_smact_custom() will crash at runtime when called with default arguments.

The type annotation change to oxidation_states_set: str | None = None (line 154) is syntactically correct but breaks downstream calls:

  • Line 209: lookup_element_oxidation_states_custom("all", oxidation_states_set, copy=False) requires filepath to be a string pointing to an oxidation states file. When oxidation_states_set=None, this parameter receives no default value and is not guarded—_get_data_rows(None) will raise an error.
  • Line 226: smact_filter(..., oxidation_states_set=oxidation_states_set) expects oxidation_states_set: str (preset name like "icsd24" or file path). Passing None violates the type contract and will fail downstream when the function expects a string.

Either:

  1. Change the default back to a valid string (e.g. str = "icsd24"), or
  2. Update both downstream functions to handle None by using a sensible fallback preset, or
  3. Change line 154 type annotation to require the argument and remove the default.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/utils/crystal_space/generate_composition_with_smact.py` at line 154,
The parameter oxidation_states_set in generate_composition_with_smact_custom
currently defaults to None which causes downstream calls to
lookup_element_oxidation_states_custom and smact_filter to receive None and
crash; revert the default to a valid preset string and strict str type by
changing the signature to oxidation_states_set: str = "icsd24" (or another
appropriate preset), so lookup_element_oxidation_states_custom("all",
oxidation_states_set, copy=False) and smact_filter(...,
oxidation_states_set=oxidation_states_set) always receive a valid string.
smact/screening.py (1)

548-549: ⚠️ Potential issue | 🟡 Minor

Dead branch condition — or oxidation_states_set is None is unreachable.

By the time execution reaches line 548, oxidation_states_set is None has already been handled and returned from the if block starting at line 529. The or oxidation_states_set is None sub-expression can never be True here.

🐛 Proposed fix
-        elif oxidation_states_set == "icsd24" or oxidation_states_set is None:
+        elif oxidation_states_set == "icsd24":
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/screening.py` around lines 548 - 549, The condition "or
oxidation_states_set is None" in the elif is unreachable because None is already
handled earlier; remove the redundant sub-expression so the branch reads "elif
oxidation_states_set == 'icsd24':" and keep the body assigning ox_combos =
[el.oxidation_states_icsd24 for el in smact_elems] (no other logic changes to
oxidation_states_set, ox_combos, or smact_elems).
🧹 Nitpick comments (3)
.github/dependabot.yml (1)

15-19: Consider weekly instead of daily for pip updates.

daily pip polling is unusually aggressive for a research Python library whose dependencies don't change on that cadence. Combined with a per-ecosystem limit of 5 open PRs, this could generate up to 5 simultaneous dependency-bump PRs every day, adding meaningful review overhead.

weekly is the Dependabot default and suits most projects well. If staying with daily, adding a groups: block to batch compatible updates into a single PR would significantly reduce noise.

⚙️ Suggested configuration with grouping and a less aggressive schedule
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
-     interval: daily
+     interval: weekly
    open-pull-requests-limit: 5
    target-branch: develop
+   groups:
+     python-dependencies:
+       patterns:
+         - "*"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/dependabot.yml around lines 15 - 19, Update the Dependabot pip
config to reduce noise: change the schedule block under the package-ecosystem:
"pip" entry so interval: daily is replaced with interval: weekly (or, if you opt
to keep daily, add a groups: section to batch compatible updates into a single
PR); adjust or retain open-pull-requests-limit: 5 as needed after grouping.
Target the stanza that contains package-ecosystem: "pip", schedule: interval:
daily and open-pull-requests-limit: 5 and make the schedule change or add
grouping to limit daily PR churn.
.pre-commit-config.yaml (1)

7-16: Consider migrating the legacy ruff hook ID to ruff-check.

The upstream .pre-commit-hooks.yaml explicitly labels id: ruff as a "Legacy alias", with ruff-check being the canonical hook ID. The alias remains functional, but aligning with the canonical name future-proofs the config against eventual alias removal.

♻️ Proposed update
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.14.10
    hooks:
      # Run the linting tool
-     - id: ruff
+     - id: ruff-check
        types_or: [python, pyi]
        args: [--fix]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.pre-commit-config.yaml around lines 7 - 16, Replace the legacy hook ID
"ruff" with the canonical "ruff-check" in the hooks block so the linting hook
uses id: ruff-check (preserve existing types_or and args settings); leave the
ruff-format hook unchanged. Locate the hooks where "id: ruff" appears and update
it to "id: ruff-check" (ensuring the surrounding hook entries, e.g., types_or
and args, remain intact).
smact/screening.py (1)

28-33: SmactFilterOutputs.dict shadows the Python builtin dict.

The member name dict is scoped to the class namespace so it causes no runtime collision, but it makes the attribute name potentially confusing (e.g. type annotations, docs, IDE auto-complete). Consider renaming to composition_dict or as_dict.

♻️ Suggested rename
 class SmactFilterOutputs(StrEnum):
     """Allowed outputs of the `smact_filter` function."""

     default = "default"
     formula = "formula"
-    dict = "dict"
+    composition_dict = "dict"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/screening.py` around lines 28 - 33, The enum member
SmactFilterOutputs.dict shadows the builtin name dict which can confuse type
hints and IDEs; rename this member (e.g., to SmactFilterOutputs.composition_dict
or SmactFilterOutputs.as_dict) and update all usages of SmactFilterOutputs.dict
(calls, comparisons, docs, and type annotations) to the new name so references
to the enum remain correct and clear. Ensure the enum string value is updated if
desired (or left as "dict" only if external compatibility requires it) and run
tests/imports to confirm no broken references remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyproject.toml`:
- Line 78: Update the Tornado dependency lower bound from "tornado>=6.5" to
"tornado>=6.5.3" wherever it appears (the docs dependency entry and the
[dependency-groups] dev section) to ensure inclusion of the 6.5.3 security
fixes; locate the string "tornado>=6.5" in pyproject.toml and replace it with
"tornado>=6.5.3" in both places.

In `@smact/screening.py`:
- Line 379: The docstring for the parameter return_output refers to allowed
values with inconsistent capitalization ("Formula" vs the enum value "formula");
update the docstring so the listed values exactly match the SmactFilterOutputs
enum (e.g., use "default", "formula", "dict") to avoid confusion—locate the
return_output description in screening.py (the SmactFilterOutputs reference) and
change "Formula" to "formula".

In `@smact/structure_prediction/database.py`:
- Line 21: The module-level import of MPResterNew (from mp_api.client) must be
made optional to avoid hard failure when mp-api isn't installed; either wrap the
import in try/except (set MPResterNew = None on ImportError and raise a clear
error when the MP-API functionality is invoked) or move the import statement for
MPResterNew inside the function that actually uses it (the function that
currently calls the MP API client at the usage site), mirroring how pathos is
handled; update the runtime check to raise an informative message if MPResterNew
is unavailable when that function is called.

In `@smact/structure_prediction/structure.py`:
- Around line 131-159: The __hash__ and __eq__ mismatch is caused by np.allclose
using a nonzero rtol and by lax iteration using zip(..., strict=False); update
the equality logic in __eq__ to call np.allclose(..., atol=1e-7, rtol=0) so it
becomes an absolute-only tolerance matching the rounding in __hash__, and
tighten the site/spec iteration to enforce equal-length coordinate lists (e.g.,
use zip(..., strict=True) or explicitly compare lengths for each species key) so
that objects considered equal will produce the same hash computed in __hash__.
- Line 14: The top-level import of MPResterNew makes the except ImportError in
from_mp unreachable; move the import of MPResterNew (mp_api.client.MPRester)
into the from_mp function and mirror the optional-dependency pattern used for
pathos: attempt the import inside from_mp, catch ImportError, and then fall back
to the legacy MPRester path. Update references in from_mp to use the locally
imported MPResterNew and ensure the ImportError branch preserves the existing
fallback behavior.

In `@smact/utils/crystal_space/generate_composition_with_smact.py`:
- Line 138: The assignment results_df.loc[smact_allowed, "smact_allowed"] = True
can silently create new rows when smact_allowed contains labels not present in
results_df.index (derived from compounds); change it to only set rows that exist
by using results_df.loc[results_df.index.isin(smact_allowed), "smact_allowed"] =
True (i.e., use isin against results_df.index or compounds to filter
smact_allowed before assignment) to avoid inflating the DataFrame with
missing-index entries; apply the same guarded fix to the analogous assignment
later in the file (the other occurrence around the second mentioned location).

---

Outside diff comments:
In `@smact/screening.py`:
- Around line 548-549: The condition "or oxidation_states_set is None" in the
elif is unreachable because None is already handled earlier; remove the
redundant sub-expression so the branch reads "elif oxidation_states_set ==
'icsd24':" and keep the body assigning ox_combos = [el.oxidation_states_icsd24
for el in smact_elems] (no other logic changes to oxidation_states_set,
ox_combos, or smact_elems).

In `@smact/utils/crystal_space/generate_composition_with_smact.py`:
- Line 154: The parameter oxidation_states_set in
generate_composition_with_smact_custom currently defaults to None which causes
downstream calls to lookup_element_oxidation_states_custom and smact_filter to
receive None and crash; revert the default to a valid preset string and strict
str type by changing the signature to oxidation_states_set: str = "icsd24" (or
another appropriate preset), so lookup_element_oxidation_states_custom("all",
oxidation_states_set, copy=False) and smact_filter(...,
oxidation_states_set=oxidation_states_set) always receive a valid string.

---

Nitpick comments:
In @.github/dependabot.yml:
- Around line 15-19: Update the Dependabot pip config to reduce noise: change
the schedule block under the package-ecosystem: "pip" entry so interval: daily
is replaced with interval: weekly (or, if you opt to keep daily, add a groups:
section to batch compatible updates into a single PR); adjust or retain
open-pull-requests-limit: 5 as needed after grouping. Target the stanza that
contains package-ecosystem: "pip", schedule: interval: daily and
open-pull-requests-limit: 5 and make the schedule change or add grouping to
limit daily PR churn.

In @.pre-commit-config.yaml:
- Around line 7-16: Replace the legacy hook ID "ruff" with the canonical
"ruff-check" in the hooks block so the linting hook uses id: ruff-check
(preserve existing types_or and args settings); leave the ruff-format hook
unchanged. Locate the hooks where "id: ruff" appears and update it to "id:
ruff-check" (ensuring the surrounding hook entries, e.g., types_or and args,
remain intact).

In `@smact/screening.py`:
- Around line 28-33: The enum member SmactFilterOutputs.dict shadows the builtin
name dict which can confuse type hints and IDEs; rename this member (e.g., to
SmactFilterOutputs.composition_dict or SmactFilterOutputs.as_dict) and update
all usages of SmactFilterOutputs.dict (calls, comparisons, docs, and type
annotations) to the new name so references to the enum remain correct and clear.
Ensure the enum string value is updated if desired (or left as "dict" only if
external compatibility requires it) and run tests/imports to confirm no broken
references remain.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2570fb5 and 6630d7c.

📒 Files selected for processing (15)
  • .github/dependabot.yml
  • .pre-commit-config.yaml
  • README.md
  • pyproject.toml
  • requirements.txt
  • requirements/requirements-py310.txt
  • smact/data_loader.py
  • smact/screening.py
  • smact/structure_prediction/database.py
  • smact/structure_prediction/probability_models.py
  • smact/structure_prediction/structure.py
  • smact/tests/test_core.py
  • smact/tests/test_elementembeddings.py
  • smact/tests/test_utils.py
  • smact/utils/crystal_space/generate_composition_with_smact.py
✅ Files skipped from review due to trivial changes (2)
  • smact/data_loader.py
  • smact/structure_prediction/probability_models.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • smact/tests/test_elementembeddings.py

Comment thread pyproject.toml Outdated
Comment thread smact/screening.py Outdated
Comment thread smact/structure_prediction/database.py Outdated
Comment thread smact/structure_prediction/structure.py Outdated
Comment thread smact/structure_prediction/structure.py
df = pd.DataFrame(data=False, index=compounds, columns=["smact_allowed"])
df.loc[smact_allowed, "smact_allowed"] = True
results_df = pd.DataFrame(data=False, index=compounds, columns=["smact_allowed"])
results_df.loc[smact_allowed, "smact_allowed"] = True

⚠️ Potential issue | 🟡 Minor

Silent DataFrame row expansion if a formula in smact_allowed is absent from compounds.

pd.DataFrame.loc silently inserts a new row when a label is not present in the index. In the normal execution path smact_allowed is a strict subset of compounds, but there is no guard enforcing this. Any reduced-formula mismatch (e.g., from a Pauling-threshold element producing a compound whose GCD-reduced form was never in the stoichiometry sweep) would inflate the DataFrame with rows that only carry smact_allowed=True but no False baseline entry, corrupting the output.

Using isin is explicit and zero-cost for correctness:

🛡️ Proposed fix
-    results_df = pd.DataFrame(data=False, index=compounds, columns=["smact_allowed"])
-    results_df.loc[smact_allowed, "smact_allowed"] = True
+    smact_allowed_set = set(smact_allowed)
+    results_df = pd.DataFrame(
+        {"smact_allowed": [c in smact_allowed_set for c in compounds]},
+        index=compounds,
+    )

Apply the same fix at line 249–250.

Also applies to: 250-250
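The guarded forms can be demonstrated with a toy DataFrame. The formulas below are made up; only the mask-based and membership-based constructions from the proposed fix are shown:

```python
import pandas as pd

compounds = ["NaCl", "MgO", "KBr"]
# Deliberately include a formula that is absent from `compounds`.
smact_allowed = ["NaCl", "CsI"]

# Guarded assignment: a boolean mask over the existing index can only
# flip rows that are already present, so "CsI" cannot inflate the frame.
results_df = pd.DataFrame(data=False, index=compounds, columns=["smact_allowed"])
results_df.loc[results_df.index.isin(smact_allowed), "smact_allowed"] = True

# Equivalent construction from a membership test, as in the proposed fix.
smact_allowed_set = set(smact_allowed)
alt_df = pd.DataFrame(
    {"smact_allowed": [c in smact_allowed_set for c in compounds]},
    index=compounds,
)
```

Either form keeps the frame at exactly `len(compounds)` rows regardless of stray labels in `smact_allowed`.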

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/utils/crystal_space/generate_composition_with_smact.py` at line 138,
The assignment results_df.loc[smact_allowed, "smact_allowed"] = True can
silently create new rows when smact_allowed contains labels not present in
results_df.index (derived from compounds); change it to only set rows that exist
by using results_df.loc[results_df.index.isin(smact_allowed), "smact_allowed"] =
True (i.e., use isin against results_df.index or compounds to filter
smact_allowed before assignment) to avoid inflating the DataFrame with
missing-index entries; apply the same guarded fix to the analogous assignment
later in the file (the other occurrence around the second mentioned location).

Replace _check_elementembeddings() lazy imports with top-level try/except
and HAS_ELEMENTEMBEDDINGS flag to satisfy ruff (no noqa suppressions).
Update test mocking strategy to patch the flag and featuriser functions
directly. Use pytest.raises instead of self.assertRaises (PT027), keyword
arg new= for patch() booleans (FBT003), and combined with statements (SIM117).
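The flag-based pattern this commit describes can be sketched generically. The module name `hypothetical_embeddings_lib` and the `featurise` helper below are placeholders, not the actual ElementEmbeddings API:

```python
# Top-level try/except with a module flag, instead of lazy in-function
# imports, so linters such as ruff (PLC0415) see imports at module scope.
try:
    import hypothetical_embeddings_lib  # placeholder optional dependency
    HAS_ELEMENTEMBEDDINGS = True
except ImportError:
    hypothetical_embeddings_lib = None
    HAS_ELEMENTEMBEDDINGS = False


def featurise(formula: str):
    """Fail with an actionable message when the extra is not installed."""
    if not HAS_ELEMENTEMBEDDINGS:
        raise ImportError(
            "featurise() requires the optional embeddings dependency; "
            "install it to enable this feature."
        )
    return hypothetical_embeddings_lib.featurise(formula)
```

Tests can then patch `HAS_ELEMENTEMBEDDINGS` directly, which is the mocking strategy this commit adopts.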
- Rename SmactFilterOutputs.dict -> composition_dict to avoid shadowing Python's built-in dict
- Move mp_api lazy imports inside the branches that use them in structure_prediction/database.py and structure_prediction/structure.py
- Add rtol=0 to np.allclose in SmactStructure.__eq__ for stricter coordinate comparison
- Change oxidation_states_set default from None to "icsd24" in generate_composition_with_smact_custom
- Update test_utils.py to reflect composition_dict rename
@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
smact/utils/crystal_space/generate_composition_with_smact.py (1)

154-166: ⚠️ Potential issue | 🔴 Critical

"icsd24" default will cause FileNotFoundError in lookup_element_oxidation_states_custom.

The function at line 209 calls lookup_element_oxidation_states_custom("all", oxidation_states_set, copy=False), which expects a file path as the second argument. The _get_data_rows() function it uses internally calls open(filename) directly. Passing "icsd24" (the current default) will raise a FileNotFoundError because it is not a valid file path.

Additionally, the docstring at line 166 claims the parameter is "the path to the oxidation states file" whilst the default is "icsd24" — a named set identifier, not a path — creating a contradictory contract.

Either provide a valid file path as the default or None with proper fallback logic, and ensure the docstring accurately reflects what the parameter accepts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/utils/crystal_space/generate_composition_with_smact.py` around lines
154 - 166, The default "icsd24" for oxidation_states_set will cause
FileNotFoundError because lookup_element_oxidation_states_custom expects a file
path (its helper _get_data_rows calls open(filename)); change the function
generate_composition_with_smact signature to use oxidation_states_set:
Optional[str] = None, update the docstring to state the parameter accepts either
a file path or None (which will use the built‑in "icsd24" dataset), and add
fallback logic before calling lookup_element_oxidation_states_custom (e.g. if
oxidation_states_set is None resolve the actual file path for the "icsd24"
resource or call the appropriate loader to obtain a real filename) so that
lookup_element_oxidation_states_custom("all", resolved_path, copy=False) always
receives a valid filesystem path; reference functions:
generate_composition_with_smact, lookup_element_oxidation_states_custom, and
_get_data_rows.
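One way to implement the fallback the prompt describes, sketched with a hypothetical preset table (`PRESET_FILES`) and resolver name — SMACT's actual data layout and loaders may differ:

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from preset names to bundled data files.
PRESET_FILES = {"icsd24": Path("smact") / "data" / "oxidation_states_icsd24.txt"}


def resolve_oxidation_states_file(oxidation_states_set: Optional[str]) -> Path:
    """Accept a filesystem path or None; None falls back to the icsd24 preset."""
    if oxidation_states_set is None:
        return PRESET_FILES["icsd24"]
    path = Path(oxidation_states_set)
    if path.exists():
        return path  # caller passed a real file path
    if oxidation_states_set in PRESET_FILES:
        return PRESET_FILES[oxidation_states_set]  # named preset, not a path
    raise FileNotFoundError(
        f"{oxidation_states_set!r} is neither an existing file nor a known preset"
    )
```

With a resolver like this in front of it, `lookup_element_oxidation_states_custom("all", resolved_path, copy=False)` always receives a path that exists or a clear error explaining why not.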
♻️ Duplicate comments (3)
smact/structure_prediction/structure.py (2)

113-117: ⚠️ Potential issue | 🟠 Major

__eq__ can still treat length-mismatched sites as equal, breaking hashing guarantees.

zip(..., strict=False) truncates, so extra coordinates in one structure won’t be checked. That can yield self == other while hash(self) != hash(other) because __hash__ includes all coordinates. Please enforce equal-length site lists (and species count) during comparison.

🔧 Proposed fix (enforce equal lengths)
-        for si, sj in zip(self.sites.values(), other.sites.values(), strict=False):
-            for ci, cj in zip(si, sj, strict=False):
+        for si, sj in zip(self.sites.values(), other.sites.values(), strict=True):
+            if len(si) != len(sj):
+                sites_equal = False
+                break
+            for ci, cj in zip(si, sj, strict=True):
                 if not np.allclose(ci, cj, atol=1e-7, rtol=0):
                     sites_equal = False
                     break
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/structure_prediction/structure.py` around lines 113 - 117, The equality
method (__eq__) currently uses zip(..., strict=False) which truncates
unequal-length iterables and can mark structures equal even when site lists or
per-site coordinate lists differ; update __eq__ to first check that the number
of sites (len(self.sites) vs len(other.sites)) is equal and then for each
corresponding site (iterate by index or by ordered keys from self.sites) verify
that the length of the coordinate/species list for that site matches before
comparing coordinates with np.allclose; if any length differs return False
immediately so __eq__ is consistent with __hash__ which assumes all coordinates
are accounted for.

130-158: ⚠️ Potential issue | 🟠 Major

Hash rounding still allows __eq__-equal coords to hash differently.

__eq__ allows differences up to 1e-7, but __hash__ rounds to 7 decimals. Two coords that differ by, say, 9e-8 can be == yet round to different values, violating the hash contract. Canonicalise coordinates the same way in both __eq__ and __hash__ (e.g., compare rounded coords in __eq__, or drop tolerance and use exact equality).

🔧 One consistent approach (compare rounded coords in __eq__)
-            for ci, cj in zip(si, sj, strict=True):
-                if not np.allclose(ci, cj, atol=1e-7, rtol=0):
+            for ci, cj in zip(si, sj, strict=True):
+                if tuple(round(c, 7) for c in ci) != tuple(round(c, 7) for c in cj):
                     sites_equal = False
                     break
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/structure_prediction/structure.py` around lines 130 - 158, The current
__hash__ rounds coordinates to 7 decimals which can still map __eq__-equal
coordinates to different hashes; fix by quantizing coordinates the same way the
equality tolerance is applied (use integer quantization at the 1e-7 resolution)
so values within atol=1e-7 collapse to the same representation. Update the
__hash__ implementation (function __hash__) to convert each coordinate component
c to an integer key like int(round(c * 1e7)) (apply same transformation for
lattice_mat rows and any site coordinates), then build species_t, lattice_t
(from quantized lattice), and sites_t from these integer tuples before hashing
(keep lattice_param unchanged); this ensures hash consistency with __eq__ that
uses np.allclose(atol=1e-7).
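The quantisation idea can be illustrated with a stripped-down class. `Site` and its fields are illustrative, not SMACT's `SmactStructure`, but the `_key` helper shows the shared integer quantisation at 1e-7 resolution that keeps `__eq__` and `__hash__` consistent:

```python
class Site:
    """Toy site holding fractional coordinates, for demonstration only."""

    def __init__(self, coords):
        self.coords = tuple(coords)

    @staticmethod
    def _key(coords):
        # Quantise each component to an integer at 1e-7 resolution so
        # equality and hashing use the same canonical representation.
        return tuple(int(round(c * 1e7)) for c in coords)

    def __eq__(self, other):
        if not isinstance(other, Site):
            return NotImplemented
        if len(self.coords) != len(other.coords):
            return False  # enforce equal lengths instead of zip truncation
        return self._key(self.coords) == self._key(other.coords)

    def __hash__(self):
        return hash(self._key(self.coords))
```

Because both methods go through `_key`, any two sites that compare equal necessarily hash equal, which is the invariant the review asks for.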
smact/utils/crystal_space/generate_composition_with_smact.py (1)

137-138: Unresolved: silent DataFrame row expansion via loc assignment.

The results_df.loc[smact_allowed, "smact_allowed"] = True assignment at both lines 138 and 250 still silently inserts rows when any formula in smact_allowed is absent from results_df.index. The issue and proposed fix were raised in a prior review and remain unaddressed in this revision.

Also applies to: 249-250

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smact/utils/crystal_space/generate_composition_with_smact.py` around lines
137 - 138, The assignment results_df.loc[smact_allowed, "smact_allowed"] = True
can silently add rows if any formulas in the smact_allowed list are not present
in results_df.index; instead restrict the assignment to the intersection of
existing index values (or use an index boolean mask) so only existing rows are
set. Locate the two occurrences that perform this write (the results_df variable
and the smact_allowed identifier around the assignment) and change them to
compute the valid targets first (e.g., valid =
results_df.index.intersection(smact_allowed) or mask =
results_df.index.isin(smact_allowed)) and then assign True only to those valid
rows to avoid row expansion. Ensure both occurrences (the one near the top and
the one near line ~250) are updated consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@smact/utils/crystal_space/generate_composition_with_smact.py`:
- Around line 154-166: The default "icsd24" for oxidation_states_set will cause
FileNotFoundError because lookup_element_oxidation_states_custom expects a file
path (its helper _get_data_rows calls open(filename)); change the function
generate_composition_with_smact signature to use oxidation_states_set:
Optional[str] = None, update the docstring to state the parameter accepts either
a file path or None (which will use the built‑in "icsd24" dataset), and add
fallback logic before calling lookup_element_oxidation_states_custom (e.g. if
oxidation_states_set is None resolve the actual file path for the "icsd24"
resource or call the appropriate loader to obtain a real filename) so that
lookup_element_oxidation_states_custom("all", resolved_path, copy=False) always
receives a valid filesystem path; reference functions:
generate_composition_with_smact, lookup_element_oxidation_states_custom, and
_get_data_rows.

---

Duplicate comments:
In `@smact/structure_prediction/structure.py`:
- Around line 113-117: The equality method (__eq__) currently uses zip(...,
strict=False) which truncates unequal-length iterables and can mark structures
equal even when site lists or per-site coordinate lists differ; update __eq__ to
first check that the number of sites (len(self.sites) vs len(other.sites)) is
equal and then for each corresponding site (iterate by index or by ordered keys
from self.sites) verify that the length of the coordinate/species list for that
site matches before comparing coordinates with np.allclose; if any length
differs return False immediately so __eq__ is consistent with __hash__ which
assumes all coordinates are accounted for.
- Around line 130-158: The current __hash__ rounds coordinates to 7 decimals
which can still map __eq__-equal coordinates to different hashes; fix by
quantizing coordinates the same way the equality tolerance is applied (use
integer quantization at the 1e-7 resolution) so values within atol=1e-7 collapse
to the same representation. Update the __hash__ implementation (function
__hash__) to convert each coordinate component c to an integer key like
int(round(c * 1e7)) (apply same transformation for lattice_mat rows and any site
coordinates), then build species_t, lattice_t (from quantized lattice), and
sites_t from these integer tuples before hashing (keep lattice_param unchanged);
this ensures hash consistency with __eq__ that uses np.allclose(atol=1e-7).

In `@smact/utils/crystal_space/generate_composition_with_smact.py`:
- Around line 137-138: The assignment results_df.loc[smact_allowed,
"smact_allowed"] = True can silently add rows if any formulas in the
smact_allowed list are not present in results_df.index; instead restrict the
assignment to the intersection of existing index values (or use an index boolean
mask) so only existing rows are set. Locate the two occurrences that perform
this write (the results_df variable and the smact_allowed identifier around the
assignment) and change them to compute the valid targets first (e.g., valid =
results_df.index.intersection(smact_allowed) or mask =
results_df.index.isin(smact_allowed)) and then assign True only to those valid
rows to avoid row expansion. Ensure both occurrences (the one near the top and
the one near line ~250) are updated consistently.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6630d7c and 97f2f93.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • .github/dependabot.yml
  • .pre-commit-config.yaml
  • pyproject.toml
  • smact/io/elementembeddings.py
  • smact/screening.py
  • smact/structure_prediction/structure.py
  • smact/tests/test_elementembeddings.py
  • smact/tests/test_utils.py
  • smact/utils/crystal_space/generate_composition_with_smact.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • .github/dependabot.yml
  • .pre-commit-config.yaml
  • pyproject.toml

KingaMas and others added 2 commits February 24, 2026 08:18
- Add top-level try/except HAS_MP_API flag in structure_prediction/database.py and structure_prediction/structure.py to satisfy ruff PLC0415 (imports must be at module level); replace inline `from mp_api.client import` with conditional HAS_MP_API checks
- Refactor else/if to elif in SmactStructure.from_mp (PLR5501)
- Remove PD901 from ruff ignore list in pyproject.toml — rule was removed upstream and was emitting a noisy warning on every hook run
@KingaMas KingaMas marked this pull request as ready for review February 24, 2026 08:46
@KingaMas KingaMas merged commit 539a6db into develop Feb 24, 2026
17 checks passed
@KingaMas KingaMas deleted the feat/io_element_embeddings branch February 24, 2026 08:47