Correct digly handling with SMILES#386

Merged
GeorgWa merged 10 commits into main from add-GlyGly-N-term
Jan 5, 2026

Conversation

GeorgWa (Collaborator) commented Jan 4, 2026

Summary

  • Add SMILES for all GG@X modifications
  • Remove redundancies

Add support for N-terminal ubiquitination (diglycine remnant on protein
N-terminus), a non-canonical modification catalyzed by UBE2W.

- Add GlyGly@Any_N-term to modification.tsv with SMILES
- Add msfragger modification mapping for mass-based recognition
- Add to mass_mapped_mods list for msfragger_psm_tsv reader

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

jalew188 (Collaborator) commented Jan 4, 2026

How many decimal places does MSFragger actually write for a modification mass (e.g. N-term(114.0429))?

GeorgWa (Collaborator, Author) commented Jan 4, 2026

As far as I know, MSFragger uses 4 decimal places without rounding (just cutoff).
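
Truncation and rounding only disagree when the fifth decimal is 5 or higher, which is easy to miss. The following is a minimal sketch, assuming the 4-decimal cutoff described above; `truncate_mass` and `round_mass` are hypothetical helper names for illustration and are not part of alphabase or MSFragger.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

# Assumption: masses are cut off after 4 decimal places, as stated in the comment above.
FOUR_PLACES = Decimal("0.0001")


def truncate_mass(mass: str) -> str:
    """Cut a mass off after 4 decimals without rounding."""
    return str(Decimal(mass).quantize(FOUR_PLACES, rounding=ROUND_DOWN))


def round_mass(mass: str) -> str:
    """Round a mass to 4 decimals (half up), for comparison only."""
    return str(Decimal(mass).quantize(FOUR_PLACES, rounding=ROUND_HALF_UP))


for mass in ("114.042927", "34.068961"):
    print(mass, "truncated:", truncate_mass(mass), "rounded:", round_mass(mass))
```

For the diglycine mass 114.042927 both approaches give 114.0429, so the choice is invisible there. For a mass such as 34.068961 they differ (34.0689 versus 34.0690), which is why the convention matters for any mapping keyed on the printed mass.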

@GeorgWa GeorgWa requested review from Copilot, jalew188, lucas-diedrich and mschwoer and removed request for lucas-diedrich January 4, 2026 10:28

Copilot AI left a comment

Pull request overview

This PR adds support for N-terminal ubiquitination by introducing the GlyGly@Any_N-term modification. This represents a non-canonical modification catalyzed by UBE2W, which differs from the more common K-linked ubiquitination.

  • Adds the GlyGly@Any_N-term modification entry to the modification database
  • Configures MSFragger mass-based recognition via N-term(114.0429) mapping
  • Enables the modification for MSFragger PSM TSV reader
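
To illustrate what mass-based recognition implies, here is a hedged sketch of an exact-string lookup from an MSFragger-style label to a modification name. The dictionary, the regular expression, and the function name `resolve_modification` are assumptions made for this example; the actual layout of `psm_reader.yaml` and the reader code are not shown in this conversation.

```python
import re
from typing import Optional

# Hypothetical mapping, keyed on the label text exactly as it would appear in the PSM table.
MASS_LABEL_TO_MOD = {
    "N-term(114.0429)": "GlyGly@Any_N-term",
}

# A label is assumed to be "<site>(<mass>)", e.g. "N-term(114.0429)".
LABEL_PATTERN = re.compile(r"^(?P<site>[^()]+)\((?P<mass>-?\d+\.\d+)\)$")


def resolve_modification(label: str) -> Optional[str]:
    """Return the mapped modification name, or None if the label is unknown."""
    if LABEL_PATTERN.match(label) is None:
        raise ValueError(f"not a site(mass) label: {label!r}")
    return MASS_LABEL_TO_MOD.get(label)


print(resolve_modification("N-term(114.0429)"))
print(resolve_modification("N-term(114.0430)"))
```

Because the lookup in this sketch is by exact text, a label whose last digit differs is simply not recognized, so the mapped mass has to follow the same decimal convention as the search engine output.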

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| alphabase/constants/const_files/modification.tsv | Adds the new GlyGly@Any_N-term modification entry with SMILES structure and chemical composition |
| alphabase/constants/const_files/psm_reader.yaml | Adds MSFragger mapping and includes the modification in the mass_mapped_mods list |

Comment thread alphabase/constants/const_files/modification.tsv Outdated
GeorgWa and others added 6 commits January 4, 2026 11:42
Mass 34.068961 should be truncated to 34.0689, not rounded to 34.0690.

- Add GG@Any_N-term to modification.tsv
- Remove duplicate GlyGly@K and GlyGly@Any_N-term entries
- Update psm_reader.yaml to use GG@ instead of GlyGly@
- Update sage_reader notebook test

Use same SMILES pattern as GG@Any_N-term with [Ts] placeholder.

GeorgWa changed the title from "feat: add GlyGly@Any_N-term for N-terminal ubiquitination" to "Correct digly handling with SMILES" on Jan 4, 2026

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


Comment on lines +268 to +269
GG@Protein_N-term 114.042927 114.1026 H(6)C(4)N(2)O(2) 0.0 Post-translational 121 NCC(=O)NCC(=O)[Ts] 0.0
GG@Any_N-term 114.042927 114.1026 H(6)C(4)N(2)O(2) 0.0 Multiple 121 NCC(=O)NCC(=O)[Ts] 0.0

Copilot AI Jan 4, 2026

The SMILES strings for GG@Protein_N-term and GG@Any_N-term are identical (NCC(=O)NCC(=O)[Ts]), but they have different classification categories (Post-translational vs Multiple). Consider documenting why these modifications have the same chemical structure but different categories, or if they should have distinct SMILES to reflect their different biological contexts.
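
Independently of the classification question, the mass shared by both rows can be checked against their shared composition. The sketch below is an assumption-laden illustration: the helper name `monoisotopic_mass` is hypothetical, and the element masses are standard monoisotopic values typed in by hand rather than read from alphabase.

```python
import re

# Standard monoisotopic masses for the elements in the composition (hand-entered for this sketch).
MONOISOTOPIC = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
}

# Matches element/count pairs such as "H(6)" in "H(6)C(4)N(2)O(2)".
ELEMENT_COUNT = re.compile(r"([A-Z][a-z]?)\((-?\d+)\)")


def monoisotopic_mass(composition: str) -> float:
    """Sum the element masses of a composition written as El(n)El(n)..."""
    return sum(
        MONOISOTOPIC[element] * int(count)
        for element, count in ELEMENT_COUNT.findall(composition)
    )


mass = monoisotopic_mass("H(6)C(4)N(2)O(2)")
print(f"{mass:.6f}")
```

The computed value agrees with the 114.042927 listed for both GG@Protein_N-term and GG@Any_N-term, which is consistent with the two entries describing the same added chemical group at different positions.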

Update references to use consistent GG@K naming after
removing GlyGly@K duplicate from modification.tsv.

Collaborator

Could you rerun the notebook? Otherwise its outputs will not be shown in the Read the Docs documentation.

Collaborator

Could you rerun this notebook as well?

Base automatically changed from add-NEM to main January 5, 2026 22:39
GeorgWa and others added 2 commits January 5, 2026 23:40
@GeorgWa GeorgWa merged commit 27e4829 into main Jan 5, 2026
3 checks passed
@GeorgWa GeorgWa deleted the add-GlyGly-N-term branch January 5, 2026 23:10

mschwoer (Contributor) commented Jan 7, 2026

> As far as I know, MSFragger uses 4 decimal places without rounding (just cutoff).

Could this be documented explicitly somewhere, e.g. in the yaml file, @GeorgWa?

GeorgWa (Collaborator, Author) commented Jan 7, 2026

> Could this be documented explicitly somewhere, e.g. in the yaml file, @GeorgWa?

Done! #391
