[ENH] add truncated_mean interface to BaseDistribution by patelchaitany · Pull Request #995 · sktime/skpro

patelchaitany · 2026-03-25T18:36:57Z

Reference Issues/PRs

Addresses #221.

What does this implement/fix? Explain your changes.

Adds truncated_mean(lower, upper) to BaseDistribution. Returns E[X | lower < X < upper]. Closed-form implementations for Normal, Exponential, Laplace, and Uniform. Other distributions fall back to a ppf-based numerical approximation.

The base class _energy_x now uses truncated means when available: if a distribution has exact truncated_mean and cdf, it computes E[|X - c|] from the energy identity instead of Monte Carlo sampling.

TruncatedDistribution gets a _mean() that calls the inner distribution's truncated_mean, so TruncatedDistribution(Normal(...), lower=0).mean() is exact now.

Changes

base/_base.py - new truncated_mean / _truncated_mean with ppf/MC default, updated _energy_x
normal.py - exact _truncated_mean
exponential.py - exact _truncated_mean
laplace.py - exact _truncated_mean
uniform.py - exact _truncated_mean
truncated.py - _mean() via inner distribution's truncated_mean, updated tag logi

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

No

Any other comments?

PR checklist

For all contributions

I've added myself to the list of contributors with any new badges I've earned :-)
How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR).maintenance - CI, test framework, release.
See here for full badge reference
The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.

For new estimators

I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
dependency isolation, see the estimator dependencies guide.

JiwaniZakir

The sampling-based fallback in _truncated_mean calls self.sample(1) inside a loop of approx_spl_size (default 1000) iterations, which is orders of magnitude slower than calling self.sample(approx_spl_size) once and filtering. The existing _mean fallback at the end of the file does exactly this with _sample_mean, so the pattern is already established — the loop approach here is inconsistent with the rest of the codebase and will be a significant performance bottleneck.

In _energy_x, the added docstring block (around line 1235) describes the truncated-mean identity used in the new code path, but it's inserted into the existing _energy_x docstring rather than being clearly separated from the original parameter/return docs. This makes the docstring structurally confusing — the formula section appears before the "Parameters" block, which is fine, but it references _truncated_mean and cdf without explaining that this only triggers when both are exact capabilities, which is important context for subclass implementors.

The zero_mass guard in the ppf-based approximation (np.abs(cdf_u - cdf_l) < 1e-15) correctly handles degenerate intervals, but the same guard is absent in the _energy_x shortcut path — if a point x lies exactly at a probability-0 region boundary, right_mean or left_mean could return nan and the nan_to_num(..., nan=0.0) silently masks what might be a meaningful divergence rather than a true zero-contribution interval.

fkiraly

Very nice!

Can you please add a test to TestAllDistributions that checks that the method works? I think the method just has to be added to the right places.

fkiraly · 2026-03-26T22:45:28Z

+        if spl_arr.ndim == 1:
+            spl_arr = spl_arr.reshape(-1, 1, 1)
+        elif spl_arr.ndim == 2:
+            spl_arr = spl_arr.reshape(spl_arr.shape[0], 1, -1)


why is this correct? or is it a mistake?

Yeah, those reshape branches are dead code on my end. sample(1) always returns a DataFrame, so .values gives 2D, stacking gives 3D neither ndim == 1 nor ndim == 2 ever fires. I added them as defensive guards but they're unnecessary. Happy to remove them.

Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>

patelchaitany · 2026-04-14T14:44:04Z

@fkiraly, I have added truncated_mean to METHODS_SCALAR plus two new tests one checks that unbounded truncated_mean() matches mean(), the other verifies output format with actual bounds.

patelchaitany requested review from felipeangelimvieira and fkiraly as code owners March 25, 2026 18:36

JiwaniZakir reviewed Mar 26, 2026

View reviewed changes

fkiraly requested changes Mar 26, 2026

View reviewed changes

fkiraly reviewed Mar 26, 2026

View reviewed changes

patelchaitany force-pushed the feature/truncated-mean-interface branch from dde2b4d to da87e95 Compare March 28, 2026 07:54

[ENH] add truncated_mean interface to BaseDistribution

4c21cf3

Signed-off-by: Chaitany patel <patelchaitany93@gmail.com>

patelchaitany force-pushed the feature/truncated-mean-interface branch from da87e95 to 4c21cf3 Compare March 28, 2026 07:59

patelchaitany requested review from JiwaniZakir and fkiraly March 28, 2026 07:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] add truncated_mean interface to BaseDistribution#995

[ENH] add truncated_mean interface to BaseDistribution#995
patelchaitany wants to merge 1 commit intosktime:mainfrom
patelchaitany:feature/truncated-mean-interface

patelchaitany commented Mar 25, 2026

Uh oh!

JiwaniZakir left a comment

Uh oh!

fkiraly left a comment

Uh oh!

fkiraly Mar 26, 2026

Uh oh!

patelchaitany Mar 27, 2026

Uh oh!

patelchaitany commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

patelchaitany commented Mar 25, 2026

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Changes

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

Any other comments?

PR checklist

For all contributions

For new estimators

Uh oh!

JiwaniZakir left a comment

Choose a reason for hiding this comment

Uh oh!

fkiraly left a comment

Choose a reason for hiding this comment

Uh oh!

fkiraly Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

patelchaitany Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

patelchaitany commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants