[BUG] Incorrect variance computation and misleading capability tags in Hurdle

**Describe the bug**
The `Hurdle` distribution contains two separate but related issues:

1. **Incorrect variance formula in `_var`.** The current implementation computes `p * var_positive + p * (1 - p) * mean_positive`. However, the correct variance for a hurdle distribution is:

$$
\mathrm{Var}(Y) = p\sigma^2 + p(1-p)\mu^2
$$

where $\mu$ and $\sigma^2$ are the mean and variance of the positive part (`mean_positive` and `var_positive`). The mean term is not squared in the implementation, which violates the law of total variance. The variance is underestimated whenever `mean_positive > 1`, and severely so for large means.

2. **Incorrect capability tagging.** `Hurdle` determines whether `mean` and `var` are exact by inspecting the base distribution (e.g., `Normal`). Internally, however, it wraps this distribution in `LeftTruncated`, which:
   - does not implement exact `mean`/`var`
   - instead relies on numerical integration (PPF-based approximation)

   As a result, `Hurdle`:
   - advertises exact capabilities
   - but actually produces approximate results and triggers runtime warnings

   This leads to misleading API behavior and inconsistent capability metadata.
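
The formula in item 1 can be derived from the law of total variance. As a sketch, assume (notation introduced here for illustration, not taken from the skpro source) that $Z \sim \mathrm{Bernoulli}(p)$ indicates whether the hurdle is crossed, that $X$ is the positive part with mean $\mu$ and variance $\sigma^2$, independent of $Z$, and that $Y = ZX$. Since $Z^2 = Z$:

$$
\begin{aligned}
\mathbb{E}[Y \mid Z] &= Z\mu, \qquad \mathrm{Var}(Y \mid Z) = Z\sigma^2 \\
\mathrm{Var}(Y) &= \mathbb{E}[\mathrm{Var}(Y \mid Z)] + \mathrm{Var}(\mathbb{E}[Y \mid Z]) \\
&= p\sigma^2 + \mu^2\,\mathrm{Var}(Z) \\
&= p\sigma^2 + p(1-p)\mu^2
\end{aligned}
$$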

**To Reproduce**
```python
import numpy as np
from skpro.distributions.normal import Normal
from skpro.distributions.hurdle import Hurdle

base = Normal(mu=10.0, sigma=1.0)
hurdle = Hurdle(p=0.5, distribution=base)

np.random.seed(42)
samples = hurdle.sample(100_000)
empirical_var = float(samples.var().iloc[0])

skpro_var = float(hurdle.var().iloc[0, 0])

print("Empirical variance:", empirical_var)
print("skpro variance:", skpro_var)
```

A screenshot of the issue is attached below:

<img width="1429" height="887" alt="Image" src="https://github.com/user-attachments/assets/62416d5b-4abd-417c-888b-51d69c5ffce7" />





**Expected behavior**
Variance should follow:

$$
\mathrm{Var}(Y) = p\sigma^2 + p(1-p)\mu^2
$$

- Empirical and analytical variance should match within numerical tolerance.
- `Hurdle` should correctly advertise `mean` and `var` as approximate, not exact.
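
The expected variance can also be checked without skpro. The following is a minimal sketch using only the Python standard library; the helpers `hurdle_var` and `sample_hurdle` are hypothetical names introduced for this illustration, and the positive part is assumed to be the base normal truncated to values above 0, sampled by rejection. For `Normal(mu=10, sigma=1)` the truncation is negligible, so the positive part has mean of about 10 and variance of about 1, which gives an expected variance of about 25.5 (the unsquared formula would give about 3.0).

```python
import random
import statistics


def hurdle_var(p, mean_positive, var_positive):
    """Variance of a hurdle variable: 0 w.p. 1 - p, positive part w.p. p."""
    return p * var_positive + p * (1.0 - p) * mean_positive**2


def sample_hurdle(p, mu, sigma, n, rng):
    """Draw n hurdle samples; positive part is Normal(mu, sigma) truncated to (0, inf)."""
    samples = []
    for _ in range(n):
        if rng.random() >= p:
            samples.append(0.0)  # hurdle not crossed
            continue
        x = rng.gauss(mu, sigma)
        while x <= 0.0:  # rejection step enforcing the left truncation at 0
            x = rng.gauss(mu, sigma)
        samples.append(x)
    return samples


rng = random.Random(42)
p, mu, sigma = 0.5, 10.0, 1.0
samples = sample_hurdle(p, mu, sigma, 200_000, rng)

sample_mean = statistics.fmean(samples)
empirical_var = statistics.fmean((x - sample_mean) ** 2 for x in samples)

# Truncation at 0 is negligible for mu=10, sigma=1, so the positive-part
# moments are approximated here by the base moments (an assumption).
analytical_var = hurdle_var(p, mean_positive=mu, var_positive=sigma**2)

print("Empirical variance:", empirical_var)
print("Analytical variance:", analytical_var)
```

A regression test for `Hurdle.var()` could compare against a simulation like this one with a loose tolerance, rather than against a hard-coded value.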


**Environment**
OS: macOS
Python: 3.x
skpro: latest (main branch)
NumPy / Pandas: standard versions


**Additional context**
- The discrepancy is structural, not due to numerical approximation.
- The related `ZeroInflated` distribution correctly implements this variance formula, so the two distributions are inconsistent.
- The issue impacts uncertainty estimation, probabilistic metrics, and downstream model evaluation.


**Proposed fix**
1. Correct the variance formula in `_var`: `return self.p * var_positive + (mean_positive ** 2) * self.p * (1.0 - self.p)`
2. Fix the capability tagging:
   - Inspect `LeftTruncated(distribution)` instead of the raw base distribution.
   - Downgrade `mean` and `var` to approximate capabilities.
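
The tag downgrade could look roughly like the sketch below. It operates on a plain dictionary rather than on a skpro object; the keys `capabilities:exact` and `capabilities:approx` are assumed to match skpro's tag names, and `downgrade_to_approx` is a hypothetical helper, not an existing skpro function. The actual fix would need to hook into however `Hurdle` sets its tags at construction.

```python
def downgrade_to_approx(tags, methods=("mean", "var")):
    """Return a copy of a tag dict with `methods` listed as approximate only.

    Assumes capabilities are stored as lists of method names under the
    keys "capabilities:exact" and "capabilities:approx".
    """
    # Drop the methods from the exact list, preserving the original order.
    exact = [m for m in tags.get("capabilities:exact", []) if m not in methods]
    # Append the methods to the approx list, avoiding duplicates.
    approx = list(tags.get("capabilities:approx", []))
    approx += [m for m in methods if m not in approx]
    return {**tags, "capabilities:exact": exact, "capabilities:approx": approx}


# Illustrative tag values only; not copied from any skpro distribution.
base_tags = {
    "capabilities:exact": ["mean", "var", "pdf", "cdf"],
    "capabilities:approx": ["energy"],
}
fixed = downgrade_to_approx(base_tags)
print(fixed["capabilities:exact"])
print(fixed["capabilities:approx"])
```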

