Re-arranged commits paired extension alignment PR#584
Draft
marcelm wants to merge 6 commits into
Draft
Conversation
ec7e101 to
12bf6ca
Compare
Merged
No more need of hash tables to keep alignments: Pairs are always made of unique chains, no more need to "chache" alignments. New statistic: counter of best alignments obtained from rescued alignments Is-new-baseline: yes
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is a re-worked version of PR #576 that is I hope a little bit easier to review.
PR #576 consist of four commits, but two of those commits don’t compile, so they can also not be reviewed separately. Overall, the diff has over 1000 added and 1000 deleted lines, which is not manageable. I have tried to reverse-engineer some of the individual changes. The diff is still large, but it’s slightly better than before. I’ve done this also to show how I think big changes like this should be split up.
You can see below which commits I split out. In addition, I identified the following changes so far:
NamPairtoPairedNams,nam_pairstopaired_nams,get_nam_pairstoget_paired_nams. I have omitted this change because it 1) just adds noise to the diff and 2) I don’t agree with it. The reason is thatNamPairaccurately describes that the structure holds two NAMs whereasPairedNamsdoesn’t do so. It could be misinterpreted as a list/vector of NAM pairs.pairing.rs. This is fine to do (and I we talked about this before), but must be done in a separate refactoring commit. To have a nicer diff view here in GitHub, it should be done in a separate PR. It is fine to temporarily have functions placed in the "wrong" module if necessary.Details::best_rescuedstatisticScoredAlignmentPairtoPairedAlignments. I have left this, but it has the same issue asPairedNams. It does not make it clear how many alignments there are exactly.Optionfor alignments inPairedAlignmentssplit_nams_by_orientation(_checked)rescuedattribute toAlignmentcompute_combined_scorefunction. Especially important to have this as a separate commit in order to be able to check whether the function is the same as before.make_unmapped_pairsignature (single[Details; 2]to twodetails1,details2parameters). I would have like to discuss this separately to the other changes because I currently do not agree with it. There may be a reason to changing it, but I find it nicer to work with a two-element array because that makes it easier to iterate over the two reads in a pair.New statistic: counter of best alignments obtained from rescued alignmentsfor now.pairing.rs.There are maybe more changes that should become individual commits, but that’s how far I’ve gotten.
It is not visible here on GitHub, but I ran
cargo teston each individual commit.To Do
@NicolasBuchin