@avivko
Collaborator

@avivko avivko commented Nov 7, 2025

When HMMER outputs alignment blocks with all gaps (no sequence aligned),
it uses '-' for sequence positions instead of numbers. This caused a
ValueError when trying to convert '-' to int.

Changes:
- Handle '-' values in ali_left and ali_right by converting to None
- None is semantically clearer than -1 for missing/invalid values
- Gap-only blocks are preserved to maintain alignment structure
- Downstream code correctly handles gaps (they're skipped in position tracking)

Fixes parsing errors like:
  ValueError: invalid literal for int() with base 10: '-'
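
For illustration, the conversion described above could look roughly like the sketch below. This is not the actual code in extract_hmmsearch.py: the helper name `parse_ali_coord` and the sample coordinate values are hypothetical, and only the '-' to None behaviour is taken from the description.

```python
from typing import Optional


def parse_ali_coord(token: str) -> Optional[int]:
    """Convert an alignment coordinate field to int; return None for '-'.

    Hypothetical helper: the real parser may structure this differently.
    """
    token = token.strip()
    if token == "-":
        # Gap-only block: there is no sequence position to report.
        return None
    return int(token)


# Made-up (ali_left, ali_right) fields for three alignment blocks;
# the middle block is gap-only. These are not real HMMER output.
blocks = [("12", "45"), ("-", "-"), ("46", "80")]

# Gap-only blocks are kept so the block structure is preserved.
parsed = [(parse_ali_coord(left), parse_ali_coord(right)) for left, right in blocks]

# Position tracking skips blocks without coordinates.
covered = [(left, right) for left, right in parsed
           if left is not None and right is not None]

print(parsed)
print(covered)
```

Returning None rather than a sentinel such as -1 means that accidental arithmetic on a missing coordinate raises a TypeError instead of silently producing a wrong position.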
@avivko avivko mentioned this pull request Nov 7, 2025
@alephreish
Member

That's not a nice piece of code I have there in extract_hmmsearch.py...

Can you please give an example sequence that demonstrates this behavior?
