
Return reference distribution data from LR endpoints#174

Open
laurensWe wants to merge 27 commits into main from feature/extend_result_lr_endpoint

@laurensWe (Member) commented Mar 10, 2026

Extends the LR endpoints to return the full reference distribution data alongside the log-LR value. Previously both endpoints returned only lr; they now also include KM/KNM scores, LLR values, and confidence intervals.

  • Adds LRResult and LRStriationResult frozen dataclasses in controller.py to carry all output from process_lr_impression / process_lr_striation (previously returned float)
  • Adds _extract_intervals helper to safely unpack llr_intervals (handles None)
  • Extends LRResponse with km_scores, knm_scores, km_llr, knm_llr, and per-group CI fields; flat serializer updated to include the new fields
  • Adds LRStriationResponse(LRResponse) with km_scores_transformed / knm_scores_transformed for the log-odds–transformed striation scores; calculate_lr_striation now declares this as its return type
  • Updates assert_lr_response_valid to assert all new list fields are present in integration test responses
  • Adjusts TestLRResponse fixtures and the _CONCRETE_CLASSES exclusion list to account for the schema change
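The result types described in the list above can be sketched with the standard library alone. This is a minimal illustration, not the code in controller.py: the confidence-interval field names are assumptions modelled on the `knm_llr_lower_ci` / `knm_llr_upper_ci` variables visible in the diff, and the sample values are made up.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LRResult:
    """Hypothetical stand-in for the result of process_lr_impression."""

    log_lr: float
    km_scores: list[float]
    knm_scores: list[float]
    km_llr: list[float]
    knm_llr: list[float]
    # CI field names are assumed; they mirror variable names seen in the diff.
    km_llr_lower_ci: list[float]
    km_llr_upper_ci: list[float]
    knm_llr_lower_ci: list[float]
    knm_llr_upper_ci: list[float]


@dataclass(frozen=True)
class LRStriationResult(LRResult):
    """Adds the log-odds-transformed striation scores."""

    km_scores_transformed: list[float]
    knm_scores_transformed: list[float]


# Made-up values, only to show the shape of the data.
result = LRStriationResult(
    log_lr=1.7,
    km_scores=[0.9, 0.8],
    knm_scores=[0.2],
    km_llr=[2.1, 1.5],
    knm_llr=[-1.9],
    km_llr_lower_ci=[],  # empty lists when the model has no intervals
    km_llr_upper_ci=[],
    knm_llr_lower_ci=[],
    knm_llr_upper_ci=[],
    km_scores_transformed=[2.2, 1.4],
    knm_scores_transformed=[-1.4],
)

# A flat dict, comparable to what a flat serializer would emit.
payload = asdict(result)
```

Freezing the dataclasses means a result cannot be mutated between the controller and the response layer, which keeps the mapping to the response schema a pure copy.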

@laurensWe laurensWe changed the title Feature/extend result lr endpoint Return reference distribution data from LR endpoints Mar 10, 2026
@laurensWe laurensWe marked this pull request as ready for review March 10, 2026 12:55
@github-actions

Diff Coverage

Diff: origin/main..HEAD, staged and unstaged changes

  • packages/scratch-core/src/conversion/likelihood_ratio.py (93.2%): Missing lines 24,26,43
  • packages/scratch-core/src/conversion/plots/data_formats.py (100%)
  • packages/scratch-core/src/conversion/plots/utils.py (83.3%): Missing lines 398,403-404
  • packages/scratch-core/src/conversion/utils.py (100%)
  • src/extractors/schemas.py (100%)
  • src/processors/controller.py (98.2%): Missing lines 51
  • src/processors/router.py (94.7%): Missing lines 68
  • src/processors/schemas.py (100%)

Summary

  • Total: 190 lines
  • Missing: 8 lines
  • Coverage: 95%

packages/scratch-core/src/conversion/likelihood_ratio.py

  20 
  21     @model_validator(mode="after")
  22     def _validate_matching_lengths(self) -> Self:
  23         if len(self.km_scores) != len(self.km_llr_data.llrs):
! 24             raise ValueError("km_scores and km_lrs must have the same length")
  25         if len(self.knm_scores) != len(self.knm_llr_data.llrs):
! 26             raise ValueError("knm_scores and knm_lrs must have the same length")
  27         return self
  28 
  29     @property
  30     def scores(self) -> np.ndarray:

  39         """Concatenated KM and KNM LLR intervals, shape (n, 2)."""
  40         km = self.km_llr_data.llr_intervals
  41         knm = self.knm_llr_data.llr_intervals
  42         if km is None or knm is None:
! 43             raise ValueError("Only models with llr_intervals can be used")
  44         return np.concatenate([km, knm], axis=0)
  45 
  46     @property
  47     def labels(self) -> np.ndarray:
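The three uncovered lines in likelihood_ratio.py (24, 26, 43) are all error branches. A pure-Python analogue of those checks, with tests that exercise each branch, might look like the sketch below. It uses `__post_init__` in place of Pydantic's `model_validator` and plain lists in place of NumPy arrays; the class shapes and the `llr_intervals` property name are assumptions, and only the conditions and error messages are taken from the excerpt.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Interval = tuple[float, float]


@dataclass(frozen=True)
class LLRData:
    """Hypothetical stand-in: LLR values plus optional (lower, upper) intervals."""

    llrs: list[float]
    llr_intervals: Optional[list[Interval]] = None


@dataclass(frozen=True)
class ReferenceData:
    """Hypothetical stand-in for the model in the excerpt."""

    km_scores: list[float]
    knm_scores: list[float]
    km_llr_data: LLRData
    knm_llr_data: LLRData

    def __post_init__(self) -> None:
        # Mirrors _validate_matching_lengths (report lines 24 and 26).
        if len(self.km_scores) != len(self.km_llr_data.llrs):
            raise ValueError("km_scores and km_lrs must have the same length")
        if len(self.knm_scores) != len(self.knm_llr_data.llrs):
            raise ValueError("knm_scores and knm_lrs must have the same length")

    @property
    def llr_intervals(self) -> list[Interval]:
        # Mirrors the None guard on report line 43; the property name is assumed.
        km = self.km_llr_data.llr_intervals
        knm = self.knm_llr_data.llr_intervals
        if km is None or knm is None:
            raise ValueError("Only models with llr_intervals can be used")
        return km + knm


def raises_value_error(action: Callable[[], object]) -> bool:
    """Return True if calling `action` raises ValueError."""
    try:
        action()
    except ValueError:
        return True
    return False


with_ci = LLRData(llrs=[1.0], llr_intervals=[(0.5, 1.5)])
without_ci = LLRData(llrs=[1.0])

# One check per uncovered branch.
km_mismatch = raises_value_error(
    lambda: ReferenceData([0.9, 0.8], [0.1], with_ci, with_ci)
)
knm_mismatch = raises_value_error(
    lambda: ReferenceData([0.9], [0.1, 0.2], with_ci, with_ci)
)
missing_ci = raises_value_error(
    lambda: ReferenceData([0.9], [0.1], with_ci, without_ci).llr_intervals
)
```

In the real test suite the same three cases would presumably be written against the Pydantic model with `pytest.raises`.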

packages/scratch-core/src/conversion/plots/utils.py

  394 
  395 def _format_lr(llr_data: LLRData) -> str:
  396     """Format a single log-LR value with optional confidence interval."""
  397     if len(llr_data.llrs) != 1:
! 398         raise ValueError(f"expected single LR value, got {len(llr_data.llrs)}")
  399 
  400     log_lr = llr_data.llrs[0]
  401 
  402     if llr_data.llr_intervals is not None:
! 403         lower, upper = llr_data.llr_intervals[0][0], llr_data.llr_intervals[0][1]
! 404         return f"{log_lr:.2f} ({lower:.2f}, {upper:.2f})"
  405     return f"{log_lr:.2f}"
  406 
  407 
  408 def _common_results_metadata(
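The missing lines 398 and 403-404 correspond to the "more than one value" error and the "with interval" branch of `_format_lr`. The sketch below copies the function body from the excerpt and runs it against a hypothetical `LLRData` stand-in (plain sequences instead of NumPy arrays) so that all three paths are exercised.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LLRData:
    """Hypothetical stand-in exposing only the attributes _format_lr reads."""

    llrs: Sequence[float]
    llr_intervals: Optional[Sequence[Sequence[float]]] = None


def _format_lr(llr_data: LLRData) -> str:
    """Format a single log-LR value with optional confidence interval."""
    if len(llr_data.llrs) != 1:
        raise ValueError(f"expected single LR value, got {len(llr_data.llrs)}")

    log_lr = llr_data.llrs[0]

    if llr_data.llr_intervals is not None:
        lower, upper = llr_data.llr_intervals[0][0], llr_data.llr_intervals[0][1]
        return f"{log_lr:.2f} ({lower:.2f}, {upper:.2f})"
    return f"{log_lr:.2f}"


# Branch with an interval (report lines 403-404).
with_interval = _format_lr(LLRData(llrs=[1.234], llr_intervals=[[0.5, 2.0]]))
# → "1.23 (0.50, 2.00)"

# Branch without an interval (already covered).
without_interval = _format_lr(LLRData(llrs=[1.234]))
# → "1.23"

# Error branch (report line 398).
try:
    _format_lr(LLRData(llrs=[1.0, 2.0]))
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```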

src/processors/controller.py

  47 
  48 
  49 def _extract_intervals(llr_data: LLRData) -> tuple[list[float], list[float]]:
  50     if llr_data.llr_intervals is None:
! 51         return [], []
  52     return llr_data.llr_intervals[:, 0].tolist(), llr_data.llr_intervals[:, 1].tolist()
  53 
  54 
  55 def compare_striation_marks(
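Line 51, the `None` branch of `_extract_intervals`, is the only uncovered line in controller.py. The sketch below is a pure-Python analogue of the helper that takes the intervals directly as a sequence of (lower, upper) pairs rather than an `LLRData` object holding a NumPy array; the name `extract_intervals` and that signature are assumptions made to keep the example dependency-free.

```python
from typing import Optional, Sequence


def extract_intervals(
    intervals: Optional[Sequence[Sequence[float]]],
) -> tuple[list[float], list[float]]:
    """Split (lower, upper) pairs into two parallel lists.

    Returns two empty lists when no intervals are available, which is the
    branch the coverage report flags as untested.
    """
    if intervals is None:
        return [], []
    lower = [float(row[0]) for row in intervals]
    upper = [float(row[1]) for row in intervals]
    return lower, upper


# The untested branch: a model without intervals.
no_intervals = extract_intervals(None)
# → ([], [])

# The covered branch: column 0 becomes lower bounds, column 1 upper bounds.
lower_ci, upper_ci = extract_intervals([(0.1, 0.9), (0.2, 0.8)])
# → [0.1, 0.2] and [0.9, 0.8]
```

Returning empty lists rather than `None` keeps the response fields list-typed in both cases, at the cost of not distinguishing "no intervals" from "no reference points".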

src/processors/router.py

  64     include_in_schema=False,
  65 )
  66 async def calculate_score_impression(impression: CalculateScore) -> ComparisonResponseImpression:
  67     """Compare two impression profiles."""
! 68     vault = create_vault(impression.tag)
  69     return ComparisonResponseImpression.generate_urls(vault.access_url)
  70 
  71 
  72 @processors.post(

@github-actions

Code Coverage

| Package | Line Rate | Branch Rate |
| --- | --- | --- |
| . | 96% | 88% |
| comparators | 100% | 100% |
| computations | 100% | 100% |
| container_models | 99% | 100% |
| conversion | 97% | 82% |
| conversion.export | 100% | 100% |
| conversion.filter | 97% | 89% |
| conversion.leveling | 100% | 100% |
| conversion.leveling.solver | 100% | 75% |
| conversion.plots | 98% | 88% |
| conversion.preprocess_impression | 99% | 91% |
| conversion.preprocess_striation | 89% | 58% |
| conversion.profile_correlator | 96% | 82% |
| conversion.surface_comparison | 93% | 85% |
| extractors | 99% | 75% |
| mutations | 100% | 100% |
| parsers | 98% | 80% |
| parsers.patches | 89% | 60% |
| preprocessors | 99% | 89% |
| processors | 98% | 83% |
| renders | 98% | 67% |
| utils | 92% | 75% |
| Summary | 97% (3086 / 3174) | 84% (331 / 396) |

Minimum allowed line rate is 50%

    knm_llr_lower_ci, knm_llr_upper_ci = _extract_intervals(reference_data.knm_llr_data)
    return LRStriationResult(
        log_lr=float(log_lr),
        km_scores=reference_data.km_scores.tolist(),
Collaborator

If this is what we want to return, why not just return the whole reference_data? I also think we want to return the interval for the new point, i.e. for log_lr.
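One possible reading of this suggestion, sketched with hypothetical types: the result carries the reference data as a single nested object instead of unpacking it field by field, and adds an optional interval for the new point's log-LR. The names `log_lr_interval` and `reference_data`, and the reduced `ReferenceData` stand-in, are assumptions for illustration; the PR does not define them.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceData:
    """Hypothetical, reduced stand-in for the reference distribution."""

    km_scores: list[float]
    knm_scores: list[float]
    km_llr: list[float]
    knm_llr: list[float]


@dataclass(frozen=True)
class LRResult:
    """Hypothetical result shape following the review suggestion."""

    log_lr: float
    # (lower, upper) for the new point; None if the model has no intervals.
    log_lr_interval: Optional[tuple[float, float]]
    reference_data: ReferenceData


# Made-up values, only to show the nesting.
result = LRResult(
    log_lr=1.7,
    log_lr_interval=(1.2, 2.2),
    reference_data=ReferenceData(
        km_scores=[0.9], knm_scores=[0.2], km_llr=[2.1], knm_llr=[-1.9]
    ),
)

# asdict recurses into the nested dataclass, giving a nested payload.
payload = asdict(result)
```

The trade-off is a nested response body instead of the flat one the PR currently serializes, so existing consumers of the flat fields would need to change.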

@SimoneAriens (Collaborator) left a comment

We need to discuss what we want to return, and why.
