@acquayefrank
Contributor

Describe your changes

Added functionality for converting from InChI to formula and from SMILES to formula, based on issue #161. I also added support for treating tabular data as TSV files, based on issue #160.

Checklist before requesting a review

  • I have performed a self-review of my code

Note:

I bumped the version to 0.5.0 instead of 0.4.2 because recent changes, like adding inchi_to_formula and smiles_to_formula, introduced new features rather than bug fixes.

Per [Semantic Versioning](https://semver.org/):

  • Patch version (e.g., 0.4.2) is for backwards-compatible bug fixes.
  • Minor version (e.g., 0.5.0) is for adding new, backwards-compatible features.

Since this PR includes a new feature, increasing the minor version is appropriate.
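
To make the rule concrete, here is a minimal sketch of the two candidate bumps. The helper `bump_version` is hypothetical and not part of this PR, and the starting version 0.4.1 is an assumption inferred from the 0.4.2 / 0.5.0 alternatives mentioned above.

```python
def bump_version(version: str, part: str) -> str:
    """Return `version` (MAJOR.MINOR.PATCH) with the given part incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        # A minor bump resets the patch number to zero.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


# Assumed starting point; the PR only names 0.4.2 and 0.5.0 as the candidates.
current = "0.4.1"
print(bump_version(current, "patch"))  # 0.4.2, for a bug-fix-only release
print(bump_version(current, "minor"))  # 0.5.0, for a release with new features
```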

@acquayefrank acquayefrank requested a review from Copilot June 9, 2025 12:49

Copilot AI left a comment

Pull Request Overview

This PR introduces functionality for converting from InChI and SMILES to molecular formulas using RDKit and adds support for handling tabular files (TSV) in addition to CSV and XLSX.

  • Added new conversion methods (smiles_to_formula and inchi_to_formula) in the RDKit converter.
  • Updated tests to include the new conversion methods.
  • Modified file format handling in DataFrame, Spectra, and app modules, and updated dependencies/version information.
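
For readers unfamiliar with RDKit, the conversion described above might look roughly like the sketch below. It is not the code from this PR: `make_formula_converter` is a hypothetical helper, and returning a plain string (empty on parse failure) is an assumption based on the "empty formula when mol is None" behaviour mentioned in the review. The RDKit calls `MolFromSmiles`, `MolFromInchi`, and `CalcMolFormula` are real; the import is wrapped so the sketch still loads where RDKit is not installed.

```python
from typing import Callable, Optional


def make_formula_converter(
    parse: Callable[[str], Optional[object]],
    calc_formula: Callable[[object], str],
) -> Callable[[str], str]:
    """Build a converter that parses an identifier and returns its formula.

    The returned function yields an empty string when parsing fails
    (RDKit parsers return None for invalid input).
    """

    def convert(identifier: str) -> str:
        mol = parse(identifier)
        if mol is None:
            return ""
        return calc_formula(mol)

    return convert


try:
    from rdkit.Chem import MolFromInchi, MolFromSmiles
    from rdkit.Chem.rdMolDescriptors import CalcMolFormula

    smiles_to_formula = make_formula_converter(MolFromSmiles, CalcMolFormula)
    inchi_to_formula = make_formula_converter(MolFromInchi, CalcMolFormula)
except ImportError:
    # RDKit is optional for this sketch; without it the converters are unset.
    smiles_to_formula = inchi_to_formula = None
```

With RDKit installed, `smiles_to_formula("CCO")` would give `C2H6O` for ethanol, and an unparseable SMILES string would give an empty string.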

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_rdkit.py | Added tests for the new smiles_to_formula and inchi_to_formula methods. |
| pyproject.toml | Updated version and dependencies, and switched to Ruff linting. |
| MSMetaEnhancer/libs/utils/Errors.py | Renamed UnknownSpectraFormat to UnknownFileFormat. |
| MSMetaEnhancer/libs/data/Spectra.py | Updated exception usage to UnknownFileFormat. |
| MSMetaEnhancer/libs/data/DataFrame.py | Added support for 'tabular' file format alongside tsv. |
| MSMetaEnhancer/libs/converters/compute/RDKit.py | Added new RDKit functions for formula conversion from SMILES/InChI. |
| MSMetaEnhancer/app.py | Updated data loading file format list and exception usage. |
| CHANGELOG.md | Documented new features and dependency updates. |
| .pre-commit-config.yaml | Added new pre-commit hooks and configured Python 3.12 environment. |

Comments suppressed due to low confidence (3)

MSMetaEnhancer/libs/converters/compute/RDKit.py:80

  • Consider adding unit tests for invalid SMILES input to verify that the conversion correctly returns an empty formula when mol is None.
mol = MolFromSmiles(smiles)

MSMetaEnhancer/libs/converters/compute/RDKit.py:95

  • Consider adding unit tests for invalid InChI input to ensure consistent behavior when the molecule conversion fails, returning an empty formula.
mol = MolFromInchi(inchi)

MSMetaEnhancer/libs/data/DataFrame.py:15

  • Update the docstring to explicitly mention that 'tabular' is accepted as an alias for 'tsv' to ensure clarity for future maintainers.
Supported formats: csv, tsv/tabular, xlsx
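
One way to treat 'tabular' as an alias for 'tsv' is a small lookup from format name to separator, sketched below. This is an illustration under assumptions, not the DataFrame.py code from this PR: `separator_for` and `_SEPARATORS` are hypothetical names, and `UnknownFileFormat` here is a local stand-in for the exception of the same name in Errors.py.

```python
import csv
import io


class UnknownFileFormat(Exception):
    """Local stand-in for the project's exception of the same name."""


# Hypothetical lookup: 'tabular' shares the tab separator with 'tsv'.
_SEPARATORS = {"csv": ",", "tsv": "\t", "tabular": "\t"}


def separator_for(file_format: str) -> str:
    """Return the column separator for a delimited text format."""
    try:
        return _SEPARATORS[file_format.lower()]
    except KeyError:
        raise UnknownFileFormat(f"Format {file_format} not supported.") from None


# Usage: the same tab-separated text parses identically under both names.
text = "name\tformula\nethanol\tC2H6O\n"
for name in ("tsv", "tabular"):
    rows = list(csv.reader(io.StringIO(text), delimiter=separator_for(name)))
    print(name, rows)
```

Keeping the alias in a single mapping means loading and writing can share it, so the two names cannot drift apart.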

Member

@hechth hechth left a comment

Looks good, just a few comments :)

 [tool.poetry.dependencies]
 python = ">=3.10,<3.13"
-matchms = ">=0.28.2"
+matchms = ">=0.30.0"
Member

Does it not work with older versions of matchms? I'd be cautious of the matchms 0.30 pin as this means forcing numpy 2 - or does it not work with 0.28.2 anymore?

Contributor Author

Well, as far as tests are concerned, they work with 0.30.0.

Member

Do they also work on 0.28.2?

@hechth hechth merged commit 5eac8c9 into RECETOX:main Jun 25, 2025
2 checks passed

Development

Successfully merging this pull request may close these issues.

  • Add RDKit conversion functions
  • Add tabular as alias for tsv to supported datatypes for loading and writing.

2 participants