feat: expose convergence diagnostics from VBPCA.fit() and select_n_components() #100

Merged
jc-macdonald merged 3 commits into main from feat/convergence-diagnostics
Apr 19, 2026

@jc-macdonald
Collaborator

Closes #99

Summary

Expose convergence diagnostics that pca_full() already computes but VBPCA.fit() previously discarded. This enables downstream analysis of convergence behavior — particularly important for optimizing convergence parameters where wall_seconds is an unreliable proxy.

Changes

_converge.py

  • New _cost_criteria_tagged() returns (reason_tag, message) tuples instead of bare strings
  • convergence_check() stores a structured _convergence_reason tag in the learning curve dict when a criterion fires
  • Reason tags: angle, earlystop, rms_plateau, cost_plateau, cfstop_rel, cfstop_curv, composite, slowing_down
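The tagging pattern described above can be sketched roughly as follows. This is a toy illustration, not the code in _converge.py: the function names `toy_cost_criteria_tagged` and `toy_convergence_check`, the `rel_tol` parameter, and the choice to attach the `cfstop_rel` tag to a relative cost-change test are all assumptions made for the example.

```python
from typing import Optional, Tuple


def toy_cost_criteria_tagged(
    cost_history, rel_tol: float = 1e-6
) -> Optional[Tuple[str, str]]:
    """Hypothetical criterion: return (reason_tag, message) or None."""
    if len(cost_history) < 2:
        return None
    prev, curr = cost_history[-2], cost_history[-1]
    # Relative change of the cost between the last two iterations.
    rel_change = abs(curr - prev) / max(abs(prev), 1e-12)
    if rel_change < rel_tol:
        message = f"relative cost change {rel_change:.2e} < {rel_tol:.2e}"
        return ("cfstop_rel", message)
    return None


def toy_convergence_check(lc: dict, rel_tol: float = 1e-6) -> Optional[str]:
    """Store the structured tag in the learning-curve dict when a criterion fires."""
    hit = toy_cost_criteria_tagged(lc["cost"], rel_tol)
    if hit is None:
        return None
    tag, message = hit
    lc["_convergence_reason"] = tag  # machine-readable tag
    return message  # human-readable message, as before


converged_lc = {"cost": [10.0, 10.0000001]}
running_lc = {"cost": [10.0, 9.0]}
converged_msg = toy_convergence_check(converged_lc)
running_msg = toy_convergence_check(running_lc)
```

Returning the tag alongside the message keeps the human-readable text unchanged while giving callers a stable value to branch on.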

_pca_full.py

  • After the training loop, promotes _convergence_reason → convergence_reason in lc, defaulting to "maxiters"
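A minimal sketch of that promotion step, assuming lc is a plain dict. The helper name `promote_convergence_reason` is hypothetical, and whether the real code removes the private key or leaves it in place is not stated in this PR; the sketch removes it.

```python
def promote_convergence_reason(lc: dict, default: str = "maxiters") -> dict:
    """Move the private tag to its public key, falling back to `default`.

    If no criterion fired, the private key is absent and the loop is
    assumed to have ended by exhausting its iteration budget.
    """
    lc["convergence_reason"] = lc.pop("_convergence_reason", default)
    return lc


stopped_early = promote_convergence_reason({"_convergence_reason": "angle"})
ran_out = promote_convergence_reason({"rms": [0.5, 0.4]})
```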

estimators.py

Three new fitted attributes on VBPCA:

  • n_iter_: int | None — iterations completed
  • convergence_reason_: str | None — structured reason tag or "maxiters"
  • learning_curve_: dict | None — full per-iteration history (rms, prms, cost, angle, phase timings)
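One way downstream code might read these attributes is sketched below. The helper `summarize_fit` is hypothetical, the stand-in objects are built with `SimpleNamespace` rather than a real VBPCA instance, the numbers are made up, and the assumption that `learning_curve_["rms"]` holds one value per iteration is inferred from the description above.

```python
from types import SimpleNamespace


def summarize_fit(model):
    """Return a small summary dict, or None if the model is not fitted."""
    n_iter = getattr(model, "n_iter_", None)
    if n_iter is None:  # attributes are None before fit()
        return None
    reason = getattr(model, "convergence_reason_", None)
    curve = getattr(model, "learning_curve_", None) or {}
    rms = curve.get("rms")
    # Avoid truth-testing `rms` directly so array-like values also work.
    final_rms = rms[-1] if rms is not None and len(rms) > 0 else None
    return {
        "n_iter": n_iter,
        "reason": reason,
        "hit_maxiters": reason == "maxiters",
        "final_rms": final_rms,
    }


# Stand-ins for a fitted and an unfitted estimator (illustrative values only).
fitted = SimpleNamespace(
    n_iter_=3,
    convergence_reason_="angle",
    learning_curve_={"rms": [0.9, 0.4, 0.31]},
)
unfitted = SimpleNamespace(
    n_iter_=None, convergence_reason_=None, learning_curve_=None
)

fitted_summary = summarize_fit(fitted)
unfitted_summary = summarize_fit(unfitted)
```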

model_selection.py

  • _fit_candidate() trace entries now include n_iter and convergence_reason
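With those two fields in each trace entry, a sweep can be summarized by stop reason. The sketch below assumes the trace is an iterable of dicts carrying `n_iter` and `convergence_reason` keys; the helper name `tally_reasons` and the sample entries are invented for illustration and do not come from the library.

```python
from collections import Counter


def tally_reasons(trace):
    """Count how often each convergence reason appears across candidates."""
    return Counter(entry.get("convergence_reason") for entry in trace)


# Made-up trace entries; real entries likely contain additional fields.
sample_trace = [
    {"n_iter": 68, "convergence_reason": "angle"},
    {"n_iter": 200, "convergence_reason": "maxiters"},
    {"n_iter": 112, "convergence_reason": "angle"},
]

reason_counts = tally_reasons(sample_trace)
n_hit_maxiters = reason_counts["maxiters"]
```

A high share of "maxiters" entries would suggest the iteration cap, rather than a convergence criterion, is ending most fits.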

Tests

  • 4 new tests in test_estimators.py: diagnostics exposed after fit, maxiters reason, angle reason, None before fit
  • 1 new test in test_model_selection.py: trace entries contain convergence diagnostics

Motivation

Investigation revealed that niter_broadprior=100 (the default) suppresses convergence checks, causing 29–93% iteration waste with negligible quality improvement:

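| Case | Iterations (bp=100) | Iterations (bp=0) | Waste | Quality Δ |
|---|---|---|---|---|
| small (10×20) | 110 | 68 | +62% | negligible |
| medium (50×100) | 175 | 112 | +56% | 0.002 RMS |
| medium+missing | 220 | 114 | +93% | 0.001 RMSE |
| large (100×200) | 147 | 114 | +29% | |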
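The waste figures above are consistent with measuring the extra iterations relative to the bp=0 run. The PR does not state the formula, so the one below is inferred from the numbers; the function name `waste_percent` is hypothetical.

```python
def waste_percent(iters_with_bp: int, iters_without_bp: int) -> float:
    """Extra iterations under bp=100, as a percentage of the bp=0 count."""
    return 100.0 * (iters_with_bp - iters_without_bp) / iters_without_bp


# (iterations with bp=100, iterations with bp=0), taken from the table.
iterations = {
    "small (10×20)": (110, 68),
    "medium (50×100)": (175, 112),
    "medium+missing": (220, 114),
    "large (100×200)": (147, 114),
}

waste = {case: round(waste_percent(a, b)) for case, (a, b) in iterations.items()}
# waste == {"small (10×20)": 62, "medium (50×100)": 56,
#           "medium+missing": 93, "large (100×200)": 29}
```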

These diagnostics enable data-driven optimization of niter_broadprior and convergence thresholds.

…mponents()

Closes #99

- _converge.py: convergence_check() now stores a structured reason tag
  (_convergence_reason) in the learning curve dict via _cost_criteria_tagged()
- _pca_full.py: promotes _convergence_reason to convergence_reason in lc
  after the training loop, defaulting to 'maxiters'
- estimators.py: VBPCA gains n_iter_, convergence_reason_, learning_curve_
  fitted attributes populated after fit()
- model_selection.py: trace entries include n_iter and convergence_reason
- Tests: 5 new tests covering all three layers
@jc-macdonald jc-macdonald merged commit 852a6ba into main Apr 19, 2026
7 checks passed
@jc-macdonald jc-macdonald deleted the feat/convergence-diagnostics branch April 19, 2026 13:22