[ENH] Add Probabilistic boosting and stacking compositors#993
arnavk23 wants to merge 5 commits into sktime:main
…e all imports are at the top of the test file to resolve UnboundLocalError. Clean up test file for robust import handling.
JiwaniZakir
left a comment
In _predict_proba of ProbabilisticStackingRegressor, when meta_learner_ is present, the code constructs Mixture(distributions=[("meta", meta_pred)], weights=[1.0]), where meta_pred is either a raw numpy array from predict_proba or a 1D array from predict. Neither is a skpro distribution object. Mixture expects its distributions argument to contain actual distribution instances, so this path will almost certainly raise a runtime error. It also appears to be untested: the get_test_params scenarios don't exercise the meta_learner path at all.
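To illustrate the point about component types, here is a minimal pure-Python sketch. `ToyNormal`, `mixture_moments`, and `wrap_point_prediction` are hypothetical names invented for this example; they are not skpro's `Normal` or `Mixture`, and the Gaussian-residual wrapping is an assumption, not something the PR does. The sketch only shows that a mixture needs per-component distribution parameters, which a raw prediction array does not carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNormal:
    """Hypothetical stand-in for a distribution object (not skpro's Normal)."""

    mu: float
    sigma: float


def mixture_moments(components, weights):
    """Return (mean, variance) of a weighted mixture of ToyNormal components."""
    for name, dist in components:
        if not isinstance(dist, ToyNormal):
            # A raw prediction array has no distributional parameters to mix.
            raise TypeError(
                f"component {name!r} is {type(dist).__name__}, not a distribution"
            )
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise weights to sum to 1
    mean = sum(wi * d.mu for wi, (_, d) in zip(w, components))
    # Second moment of each component is sigma^2 + mu^2.
    second = sum(wi * (d.sigma**2 + d.mu**2) for wi, (_, d) in zip(w, components))
    return mean, second - mean**2


def wrap_point_prediction(y_hat, residual_std):
    """Turn a point prediction into a distribution, assuming Gaussian residuals."""
    return ToyNormal(mu=y_hat, sigma=residual_std)
```

With this guard, passing `[("meta", [1.0, 2.0])]` fails immediately with a clear `TypeError`, while a wrapped prediction such as `wrap_point_prediction(2.0, 0.5)` is accepted.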
Additionally, _fit builds meta-features via est_fitted.predict(X).values.flatten() (point predictions), which discards all distributional information when training the meta-learner. For a "probabilistic stacking" regressor, you'd typically want to include at minimum both the predicted mean and variance (or quantiles) as features, otherwise the meta-learner has no basis for producing calibrated uncertainty estimates.
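A rough sketch of what richer meta-features could look like, using plain lists so it runs without any dependencies. `build_meta_features` is a hypothetical helper, not part of the PR; in skpro the means and variances would presumably come from each base estimator's `predict` and `predict_var`, but how the PR wires that up is an assumption here.

```python
def build_meta_features(base_predictions):
    """Stack per-estimator (mean, variance) pairs into one feature row per sample.

    base_predictions: list of (means, variances) tuples, one tuple per base
    estimator; each element is a list of floats with one entry per sample.
    """
    n_samples = len(base_predictions[0][0])
    rows = []
    for i in range(n_samples):
        row = []
        for means, variances in base_predictions:
            # Keep both moments so the meta-learner sees each model's uncertainty.
            row.extend([means[i], variances[i]])
        rows.append(row)
    return rows
```

For two base estimators and two samples this yields rows of four features each (mean and variance per estimator), instead of the two point predictions the current `_fit` would produce.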
The add_base_estimator method mutates self.estimators in-place post-construction, which conflicts with skpro/sklearn's convention that constructor parameters are not mutated — cloning the estimator after calling this method would not preserve the added estimator, breaking pipeline serialization and cross-validation workflows.
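One possible alternative, sketched with a hypothetical `ToyStacker` class (not the PR's code and not skpro's API): return a new object with the extended list instead of appending to `self.estimators`. This leaves both the original estimator's parameters and the caller's list untouched.

```python
class ToyStacker:
    """Hypothetical ensemble used only to illustrate the convention."""

    def __init__(self, estimators):
        # Store the argument as passed; never modify it afterwards.
        self.estimators = estimators

    def get_params(self):
        """Return constructor parameters, mirroring the sklearn-style convention."""
        return {"estimators": self.estimators}

    def with_base_estimator(self, name, estimator):
        """Return a new ToyStacker with one more (name, estimator) pair."""
        return type(self)(estimators=[*self.estimators, (name, estimator)])
```

Usage would look like `bigger = stacker.with_base_estimator("b", est_b)`, after which `stacker.get_params()` still reports exactly what was passed to its constructor.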
I saw the same after my recent commit and am trying to fix it. Thanks for the detailed review.
…ying to fix them. I think the issue is that the test is not properly setting up the data or the model, which is causing the predictions to be all zeros. I will try to debug this by adding some print statements to see what is going on with the data and the model. I will also check if there are any issues with the way the ensemble is being created or used in the test.
Reference Issues/PRs
Towards #7
What does this implement/fix? Explain your changes.
ProbabilisticStackingRegressor and ProbabilisticBoostingRegressor as composable pipeline elements for probabilistic ensembling.
Does your contribution introduce a new dependency? If yes, which one?
No new dependencies are introduced.
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Yes.
Any other comments?
PR checklist
For all contributions
How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges:
code - fixing a bug, or adding code logic.
doc - writing or improving documentation or docstrings.
bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR).
maintenance - CI, test framework, release.
See here for full badge reference
For new estimators
docs/source/api_reference/taskname.rst, follow the pattern.
Examples section.
python_dependencies tag and ensured dependency isolation, see the estimator dependencies guide.