Re-arranged commits paired extension alignment PR#584

Draft
marcelm wants to merge 6 commits into main from pairedext-mm

marcelm (Collaborator) commented May 5, 2026

This is a re-worked version of PR #576 that is, I hope, a little easier to review.

PR #576 consists of four commits, but two of them don’t compile, so they cannot be reviewed separately either. Overall, the diff has over 1000 added and 1000 deleted lines, which is not manageable. I have tried to reverse-engineer some of the individual changes. The diff is still large, but it’s slightly better than before. I’ve also done this to show how I think big changes like this should be split up.

You can see below which commits I split out. In addition, I identified the following changes so far:

  • Rename NamPair to PairedNams, nam_pairs to paired_nams, get_nam_pairs to get_paired_nams. I have omitted this change because it 1) just adds noise to the diff and 2) I don’t agree with it. The reason is that NamPair accurately describes that the structure holds two NAMs whereas PairedNams doesn’t do so. It could be misinterpreted as a list/vector of NAM pairs.
  • Move paired-end related code into pairing.rs. This is fine to do (and we talked about this before), but must be done in a separate refactoring commit. To have a nicer diff view here in GitHub, it should be done in a separate PR. It is fine to temporarily have functions placed in the "wrong" module if necessary.
  • Add Details::best_rescued statistic
  • Rename ScoredAlignmentPair to PairedAlignments. I have kept this rename, but it has the same issue as PairedNams: it does not make clear exactly how many alignments there are.
  • Do not use Option for alignments in PairedAlignments
  • Add more tests for split_nams_by_orientation(_checked)
  • Add rescued attribute to Alignment
  • Factor out compute_combined_score function. Especially important to have this as a separate commit in order to be able to check whether the function is the same as before.
  • Change make_unmapped_pair signature (single [Details; 2] to two details1, details2 parameters). I would have liked to discuss this separately from the other changes because I currently do not agree with it. There may be a reason for changing it, but I find it nicer to work with a two-element array because that makes it easier to iterate over the two reads in a pair.
  • I’ve left out the commit "New statistic: counter of best alignments obtained from rescued alignments" for now.
  • Move paired-end related code into pairing.rs.
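
To illustrate the point about the make_unmapped_pair signature, here is a minimal sketch of why I find a two-element array nicer to work with. Everything in it is hypothetical: the `Details` stand-in, its single `tried_alignment` field, and the three helper functions are assumptions made up for the example and are not taken from the code base.

```rust
// Hypothetical stand-in for the real `Details` struct; the field is an assumption.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Details {
    pub tried_alignment: usize,
}

// Array form: one loop handles both reads of the pair.
pub fn total_tried_array(details: &[Details; 2]) -> usize {
    details.iter().map(|d| d.tried_alignment).sum()
}

// Two-parameter form: each read has to be spelled out separately.
pub fn total_tried_params(details1: &Details, details2: &Details) -> usize {
    details1.tried_alignment + details2.tried_alignment
}

// With an array, per-read values can be zipped with other per-read data.
pub fn describe(names: [&str; 2], details: &[Details; 2]) -> Vec<String> {
    names
        .iter()
        .zip(details.iter())
        .map(|(name, d)| format!("{name}: {}", d.tried_alignment))
        .collect()
}
```

Both forms compute the same thing, but with the array form any per-read logic is written once and applied to both reads, whereas the two-parameter form tends to duplicate it.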

There may be more changes that should become individual commits, but that’s as far as I’ve gotten.

It is not visible here on GitHub, but I ran cargo test on each individual commit.

To Do

  • Fix commit messages
  • Ensure commit author is Nicolas for all commits

@NicolasBuchin

marcelm force-pushed the pairedext-mm branch 3 times, most recently from ec7e101 to 12bf6ca on May 8, 2026 10:42
marcelm mentioned this pull request May 11, 2026
NicolasBuchin and others added 6 commits May 12, 2026 10:12
No more need for hash tables to keep alignments:
Pairs are always made of unique chains, so there is no more need to "cache"
alignments.

New statistic: counter of best alignments obtained from rescued alignments

Is-new-baseline: yes