Add likelihood ratio (LR) calculation endpoints for striation and impression marks #149

Merged
laurensWe merged 27 commits into main from feature/dummy_lr_system_read_in
Mar 10, 2026

Conversation

laurensWe (Member) commented Feb 24, 2026

This PR implements the LR calculation pipeline for both striation (CCF-based) and impression (CMC-based) mark comparisons, replacing placeholder stubs with real logic.

Core changes:

  • conversion/likelihood_ratio.py (new): Loads LR systems from pickle, wraps lir library calls for striation (calculate_lr_striation) and impression (calculate_lr_impression), and provides a ReferenceData
    model holding KM/KNM scores and LLRs. Reference data is currently hardcoded dummy data pending lr_module_scratch integration.
  • conversion/utils.py: Adds ccf_score_to_logodds to transform CCF scores before feeding them to the striation LR system.
  • processors/controller.py: Implements process_lr_striation and process_lr_impression — full pipeline functions that load marks, compute LR, build result metadata, and save overview plots (CCF/CMC comparison
    overviews with histograms and LLR transformation curves).
  • processors/schemas.py: Tightens request schemas — renames mark_ref/mark_comp to mark_dir_ref/mark_dir_comp, flattens CalculateLR to include MetadataParameters directly, adds ImpressionLRParameters
    (JSON-serialisable mirror of ImpressionComparisonMetrics), adds validation (score ≤ n_cells, score range [-1, 1]).
  • processors/router.py: Wires up the LR endpoints to the new controller functions; removes TODO stubs; saves aligned marks to disk after score calculation.
  • conversion/plots/utils.py (new): build_results_metadata_striation and build_results_metadata_impression helpers that format result metadata dicts for plot display.
  • Tests: Replaces monolithic test_schemas.py with split conftest.py + test_schemas.py + test_router.py; adds integration-level coverage for LR endpoints.
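
The two validation rules named in the schemas bullet (score ≤ n_cells for impression marks, score within [-1, 1] for striation marks) can be illustrated with a minimal sketch. It uses plain dataclasses rather than the project's actual request schemas; the class names `StriationScoreSketch` and `ImpressionScoreSketch`, the field layout, and the lower bound of 0 on the impression score are assumptions made for illustration and are not taken from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StriationScoreSketch:
    """Hypothetical stand-in for a striation LR request holding one CCF score."""

    score: float

    def __post_init__(self) -> None:
        # A correlation-based score must lie in [-1, 1]; NaN fails this check too.
        if not -1.0 <= self.score <= 1.0:
            raise ValueError(f"score must be in [-1, 1], got {self.score}")


@dataclass(frozen=True)
class ImpressionScoreSketch:
    """Hypothetical stand-in for an impression LR request: a CMC count out of n_cells."""

    score: int
    n_cells: int

    def __post_init__(self) -> None:
        # Assumption: negative counts are rejected (the PR only states score <= n_cells).
        if self.score < 0:
            raise ValueError(f"score must be non-negative, got {self.score}")
        if self.score > self.n_cells:
            raise ValueError(
                f"score ({self.score}) cannot exceed n_cells ({self.n_cells})"
            )


# Valid requests construct without error.
striation = StriationScoreSketch(score=0.8)
impression = ImpressionScoreSketch(score=12, n_cells=30)
```

Rejecting out-of-range scores at the schema boundary means the LR code behind the endpoint never has to handle them.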

SimoneAriens self-requested a review February 24, 2026 11:42

SimoneAriens (Collaborator) left a comment

See comments: some parts can be imported from lir or lr_module_scratch, and some should be moved to conversion.

laurensWe changed the title from "Implement get_lr_system and calculate_lr in processors" to "Add likelihood ratio (LR) calculation endpoints for striation and impression marks" Mar 10, 2026

SimoneAriens (Collaborator) left a comment

Almost there; mostly cosmetic changes.

vergep (Collaborator) left a comment

  • We need to check carefully whether LR_system.apply expects score or transformed_score.
    Otherwise, only minor comments.
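
The concern about score versus transformed_score can be made concrete with a toy example. Everything in it is hypothetical: `ccf_to_logodds_sketch` is one plausible log-odds transform (rescale the CCF score from [-1, 1] to (0, 1), then take the logit) and is not claimed to match the PR's `ccf_score_to_logodds`, and `ToyLRSystem` is a stand-in linear model, not the lir API. The point is only that a system fitted on transformed scores accepts a raw score without complaint and returns a different log-LR.

```python
import math


def ccf_to_logodds_sketch(score: float, eps: float = 1e-6) -> float:
    """Hypothetical transform: rescale a CCF score to (0, 1), then take log-odds."""
    p = (score + 1.0) / 2.0
    p = min(max(p, eps), 1.0 - eps)  # clip so the logarithm stays finite at +/-1
    return math.log(p / (1.0 - p))


class ToyLRSystem:
    """Toy model, not the lir API: the log-LR is linear in the transformed score."""

    def __init__(self, slope: float, intercept: float) -> None:
        self.slope = slope
        self.intercept = intercept

    def apply(self, transformed_score: float) -> float:
        return self.slope * transformed_score + self.intercept


system = ToyLRSystem(slope=1.0, intercept=-1.0)
score = 0.8
transformed_score = ccf_to_logodds_sketch(score)

# Input on the scale the toy model was defined on.
llr_transformed_input = system.apply(transformed_score)
# Raw CCF score passed by mistake: no error is raised, the result is just different.
llr_raw_input = system.apply(score)

print(f"transformed input: {llr_transformed_input:.3f}, raw input: {llr_raw_input:.3f}")
```

Because both calls succeed, a mix-up of this kind would only show up in a test that pins the expected LLR for a known score.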

@github-actions

Diff Coverage

Diff: origin/main..HEAD, staged and unstaged changes

  • packages/scratch-core/src/conversion/likelihood_ratio.py (92.9%): Missing lines 41,43,60
  • packages/scratch-core/src/conversion/plots/data_formats.py (100%)
  • packages/scratch-core/src/conversion/plots/utils.py (83.3%): Missing lines 398,403-404
  • packages/scratch-core/src/conversion/utils.py (100%)
  • src/extractors/schemas.py (100%)
  • src/processors/controller.py (100%)
  • src/processors/router.py (94.7%): Missing lines 67
  • src/processors/schemas.py (100%)

Summary

  • Total: 160 lines
  • Missing: 7 lines
  • Coverage: 95%

packages/scratch-core/src/conversion/likelihood_ratio.py

Lines 37-47

  37 
  38     @model_validator(mode="after")
  39     def _validate_matching_lengths(self) -> Self:
  40         if len(self.km_scores) != len(self.km_llrs):
! 41             raise ValueError("km_scores and km_lrs must have the same length")
  42         if len(self.knm_scores) != len(self.knm_llrs):
! 43             raise ValueError("knm_scores and knm_lrs must have the same length")
  44         return self
  45 
  46     @property
  47     def scores(self) -> np.ndarray:

Lines 56-64

  56     @property
  57     def llr_intervals(self) -> np.ndarray:
  58         """Concatenated KM and KNM LLR intervals, shape (n, 2)."""
  59         if self.km_llr_intervals is None or self.knm_llr_intervals is None:
! 60             raise ValueError("Only models with llr_intervals can be used")
  61         return np.concatenate([self.km_llr_intervals, self.knm_llr_intervals], axis=0)
  62 
  63     @property
  64     def labels(self) -> np.ndarray:
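
The three uncovered lines in this file (41, 43, 60) are all error branches, so one test per branch would likely close the gap. The sketch below shows the shape of such checks against a simplified stand-in, `ReferenceDataSketch`, which mirrors only the logic visible in the snippets above. It is an assumption-laden approximation: it uses a plain dataclass and lists instead of the project's model and NumPy arrays, and the helper `raises_value_error` is introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReferenceDataSketch:
    """Simplified stand-in for the ReferenceData model shown in the report."""

    km_scores: list[float]
    km_llrs: list[float]
    knm_scores: list[float]
    knm_llrs: list[float]
    km_llr_intervals: Optional[list[tuple[float, float]]] = None
    knm_llr_intervals: Optional[list[tuple[float, float]]] = None

    def __post_init__(self) -> None:
        # Mirrors the length checks on lines 40-43 of the report.
        if len(self.km_scores) != len(self.km_llrs):
            raise ValueError("km_scores and km_llrs must have the same length")
        if len(self.knm_scores) != len(self.knm_llrs):
            raise ValueError("knm_scores and knm_llrs must have the same length")

    @property
    def llr_intervals(self) -> list[tuple[float, float]]:
        # Mirrors the guard on lines 59-60 of the report.
        if self.km_llr_intervals is None or self.knm_llr_intervals is None:
            raise ValueError("Only models with llr_intervals can be used")
        return [*self.km_llr_intervals, *self.knm_llr_intervals]


def raises_value_error(make: Callable[[], object]) -> bool:
    """Return True if calling `make` raises ValueError."""
    try:
        make()
    except ValueError:
        return True
    return False


# One check per uncovered branch.
km_mismatch = raises_value_error(
    lambda: ReferenceDataSketch([0.9, 0.8], [2.0], [0.1], [-1.0])
)
knm_mismatch = raises_value_error(
    lambda: ReferenceDataSketch([0.9], [2.0], [0.1, 0.2], [-1.0])
)
missing_intervals = raises_value_error(
    lambda: ReferenceDataSketch([0.9], [2.0], [0.1], [-1.0]).llr_intervals
)
```

In the real test suite the same three cases would presumably be written with `pytest.raises` against the actual model.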

packages/scratch-core/src/conversion/plots/utils.py

Lines 394-408

  394 
  395 def _format_lr(llr_data: LLRData) -> str:
  396     """Format a single log-LR value with optional confidence interval."""
  397     if len(llr_data.llrs) > 1:
! 398         raise ValueError(f"expected single LR value, got {len(llr_data.llrs)}")
  399 
  400     log_lr = llr_data.llrs[0]
  401 
  402     if llr_data.llr_intervals is not None:
! 403         lower, upper = llr_data.llr_intervals[0, 0], llr_data.llr_intervals[0, 1]
! 404         return f"{log_lr:.2f} ({lower:.2f}, {upper:.2f})"
  405     return f"{log_lr:.2f}"
  406 
  407 
  408 def _common_results_metadata(
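
The uncovered lines in `_format_lr` (398, 403-404) are the multi-value error and the confidence-interval branch. A minimal sketch of the two output formats follows; `LLRDataSketch` and `format_lr_sketch` are hypothetical stand-ins that use lists instead of the NumPy arrays in the real `LLRData`, so only the formatting logic is carried over from the snippet above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLRDataSketch:
    """Hypothetical stand-in for LLRData, using lists instead of arrays."""

    llrs: list[float]
    llr_intervals: Optional[list[tuple[float, float]]] = None


def format_lr_sketch(llr_data: LLRDataSketch) -> str:
    """Format a single log-LR value with an optional confidence interval."""
    if len(llr_data.llrs) > 1:
        raise ValueError(f"expected single LR value, got {len(llr_data.llrs)}")

    log_lr = llr_data.llrs[0]

    if llr_data.llr_intervals is not None:
        lower, upper = llr_data.llr_intervals[0]
        return f"{log_lr:.2f} ({lower:.2f}, {upper:.2f})"
    return f"{log_lr:.2f}"


without_interval = format_lr_sketch(LLRDataSketch(llrs=[1.234]))
with_interval = format_lr_sketch(
    LLRDataSketch(llrs=[1.234], llr_intervals=[(0.5, 2.0)])
)
print(without_interval)  # 1.23
print(with_interval)     # 1.23 (0.50, 2.00)
```

Note that the guard only rejects more than one value; an empty `llrs` would pass it and fail on the `llrs[0]` lookup with an IndexError instead of the ValueError.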

src/processors/router.py

Lines 63-71

  63     include_in_schema=False,
  64 )
  65 async def calculate_score_impression(impression: CalculateScore) -> ComparisonResponseImpression:
  66     """Compare two impression profiles."""
! 67     vault = create_vault(impression.tag)
  68     return ComparisonResponseImpression.generate_urls(vault.access_url)
  69 
  70 
  71 @processors.post(

@github-actions

Code Coverage

Package                            Line Rate            Branch Rate
.                                  96%                  88%
comparators                        100%                 100%
computations                       100%                 100%
container_models                   99%                  100%
conversion                         97%                  82%
conversion.export                  100%                 100%
conversion.filter                  97%                  89%
conversion.leveling                100%                 100%
conversion.leveling.solver         100%                 75%
conversion.plots                   98%                  88%
conversion.preprocess_impression   99%                  91%
conversion.preprocess_striation    89%                  58%
conversion.profile_correlator      96%                  82%
conversion.surface_comparison      93%                  85%
extractors                         99%                  75%
mutations                          100%                 100%
parsers                            98%                  80%
parsers.patches                    89%                  60%
preprocessors                      99%                  89%
processors                         99%                  100%
renders                            98%                  67%
utils                              92%                  75%
Summary                            97% (3058 / 3145)    84% (330 / 394)

Minimum allowed line rate is 50%

laurensWe merged commit 96e0e4b into main Mar 10, 2026
4 checks passed
laurensWe deleted the feature/dummy_lr_system_read_in branch March 10, 2026 15:35
3 participants