Releases · TomCardeLo/boa-forecaster

22 Apr 22:45

v2.4.0

3b46e78

v2.4.0 — Probabilistic, Regulatory & Deep-Learning Horizons Latest

Track H release. It closes the post-v2.3 backlog and addresses the 2026-04-20 feedback
from the CAR PM2.5 hourly pipeline (tasks/feedback_aire.md).
No breaking API changes — additive and behaviour-tightening only.

Highlights

New model families

ProphetSpec (H1) — Meta's Prophet for trend + seasonality + holidays. Behind the prophet extra.
QuantileMLSpec (H2) — probabilistic forecasts via LightGBM/XGBoost quantile objectives + new metrics_probabilistic.py (pinball_loss, interval_coverage).
LSTMSpec (H3) — PyTorch LSTM baseline behind the deep extra (deliberately not in [all]).
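
For readers new to the two probabilistic metrics named above, here is a minimal NumPy sketch of what a pinball loss and an interval coverage usually compute. The signatures below are assumptions for illustration; the release notes name the functions but do not show how metrics_probabilistic.py defines them.

```python
import numpy as np


def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for a single quantile level in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # Under-prediction (diff > 0) is weighted by quantile,
    # over-prediction (diff < 0) by 1 - quantile.
    return float(np.mean(np.maximum(quantile * diff, (quantile - 1.0) * diff)))


def interval_coverage(y_true, lower, upper):
    """Fraction of observations inside the closed interval [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(np.mean(inside))


y = [10.0, 12.0, 9.0, 15.0]
print(pinball_loss(y, [11.0, 11.0, 11.0, 11.0], quantile=0.9))
print(interval_coverage(y, lower=[8.0] * 4, upper=[13.0] * 4))
```

A well-calibrated 80% interval should give a coverage close to 0.8 on held-out data; values far below that suggest intervals that are too narrow.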

Regulatory metrics & presets (feedback_aire §2)

hit_rate_weighted + f1_by_bucket in core metrics.py (H7-core).
New presets/air_quality.py — ICA_EDGES_PM25_CO2017, ICA_EDGES_PM25_USAQI, hit_rate_ica, hit_rate_ica_weighted (H7-presets).
First preset pack — opens the door for presets/demand.py, presets/energy.py, presets/finance.py in v2.5+.
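
The idea behind a bucket hit rate can be sketched in a few lines: map observations and forecasts to categories via a list of edges, then count how often the categories agree. The edges below are hypothetical placeholders, not the values of ICA_EDGES_PM25_CO2017 or ICA_EDGES_PM25_USAQI, and bucket_hit_rate is an illustrative name rather than the library's function.

```python
import numpy as np

# Hypothetical concentration edges for illustration only.
EXAMPLE_EDGES = [0.0, 12.0, 37.0, 55.0, 150.0]


def bucket_hit_rate(y_true, y_pred, edges):
    """Share of forecasts that fall in the same bucket as the observation."""
    # np.digitize returns i such that edges[i-1] <= x < edges[i].
    true_bucket = np.digitize(y_true, edges)
    pred_bucket = np.digitize(y_pred, edges)
    return float(np.mean(true_bucket == pred_bucket))


observed = [8.0, 30.0, 40.0, 60.0]
forecast = [10.0, 40.0, 50.0, 58.0]
print(bucket_hit_rate(observed, forecast, EXAMPLE_EDGES))
```

A weighted variant would presumably multiply each match by a per-bucket weight before averaging, so that misses in the high-concentration buckets cost more.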

Hourly SARIMA (feedback_aire §1 residual)

SARIMASpec.for_frequency(freq) classmethod with frequency-aware seasonal_period defaults; tuneable [24, 168] on hourly data (H8).
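
A frequency-aware default can be pictured as a lookup from pandas frequency alias to candidate seasonal periods. This is a sketch under assumptions: only the hourly pair [24, 168] is stated in these notes, the other entries follow the m values used in the v1.2.0 usage examples, and the names SEASONAL_PERIOD_CANDIDATES and seasonal_periods_for are hypothetical.

```python
# Assumed candidates per frequency alias; not taken from SARIMASpec itself.
SEASONAL_PERIOD_CANDIDATES = {
    "MS": [12],      # monthly data, yearly cycle
    "W": [52],       # weekly data, yearly cycle
    "D": [7],        # daily data, weekly cycle
    "h": [24, 168],  # hourly data, daily and weekly cycles
}


def seasonal_periods_for(freq):
    """Return the candidate seasonal periods for a frequency alias."""
    try:
        return list(SEASONAL_PERIOD_CANDIDATES[freq])
    except KeyError:
        raise ValueError(f"unsupported frequency alias: {freq!r}") from None


print(seasonal_periods_for("h"))
```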

Ensemble safety & high-volatility WMA (feedback_aire §3 + §4)

EnsembleSpec warns when inverse_cv_loss mixes early-stopping and full-fold members (H9a). New uses_early_stopping flag on ModelSpec.
WMA_THRESHOLD_HIGH_VOLATILITY = 3.5 named constant for peaky series (H9b).

Post-training seasonal bias correction (feedback_aire §5)

postprocess.py — compute_seasonal_bias + apply_seasonal_bias mirroring CAR's production sesgo_mensual_para_ajuste.csv pattern (H5).
optimize_model(..., apply_bias_correction=True) opt-in kwarg + CLI --bias-correction flag.
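
A monthly bias correction of this kind typically has two steps: estimate the mean forecast error per calendar month on past data, then subtract it from new forecasts. The sketch below assumes exactly that and uses hypothetical function names with a _sketch suffix; the actual signatures of compute_seasonal_bias and apply_seasonal_bias are not shown in these notes.

```python
import numpy as np


def compute_seasonal_bias_sketch(months, y_true, y_pred):
    """Mean forecast error (pred - true) per calendar month, as a dict."""
    months = np.asarray(months)
    error = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return {int(m): float(error[months == m].mean()) for m in np.unique(months)}


def apply_seasonal_bias_sketch(months, y_pred, bias):
    """Subtract each month's stored bias; months without a bias stay unchanged."""
    offsets = np.array([bias.get(int(m), 0.0) for m in months])
    return np.asarray(y_pred, dtype=float) - offsets


bias = compute_seasonal_bias_sketch(
    months=[1, 1, 2, 2],
    y_true=[10, 12, 20, 22],
    y_pred=[13, 13, 19, 21],
)
print(bias)
print(apply_seasonal_bias_sketch([1, 2, 3], [15.0, 25.0, 30.0], bias))
```

The bias should be estimated on out-of-sample residuals (for example walk-forward folds), otherwise the correction mostly reflects in-sample fit.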

Pydantic finishing touches

BoaConfig.from_dict, Literal validators on sub-models, --strict CLI flag to flip extra="allow" → extra="forbid" (H4).

Release gate

814 tests passed, 91% coverage
ruff / black / bandit (no high/medium): clean
boa-forecaster run --strict smoke-validates config.example.yaml

Contributors

Track H executed by parallel Sonnet implementer subagents under Opus orchestration; reviews by Opus code-reviewer.

Special thanks to Daniel Méndez and the CAR / Cundinamarca PM2.5 hourly pipeline team (Bogotá + Cundinamarca, 34 monitoring stations, 2016–2026) for the 2026-04-20 feedback that drove H5, H7, H8, and H9.

Full diff: v2.3.0...v2.4.0

20 Apr 19:53

TomCardeLo

v2.3.0

b2c2ff8

v2.3.0 — Correctness & Ecosystem

Correctness & ecosystem release bundling Tracks E/F/G of the post-v2.2 plan.

Four silent-correctness bugs fixed, quality-hardening touch-ups on validation/metric/preprocessor, and small ecosystem primitives surfaced by a real-world consumer. No breaking API changes.

Highlights

Fixed — Correctness (Track E)

EnsembleSpec.needs_features now a @property reflecting members (#17)
BaseMLSpec auto-injects forecast_horizon into default FeatureConfig.lag_periods (#17)
Optuna MedianPruner(n_startup_trials=5, n_warmup_steps=1) wired into optimize_model — 20–40% faster TPE (#17)

Performance (Track E)

build_ensemble parallelised via joblib.Parallel with new n_jobs kwarg — ~75% faster on 4-member ensembles (#17)

Quality hardening (Track F)

New hit_rate(y_true, y_pred, edges) metric for bucket-accuracy reporting (#16)
New flag_intermittent(df, group_cols, value_col, threshold=0.7) preprocessor helper (#16)
walk_forward_validation now accepts n_folds >= 1 and forecast_horizon= default for test_size (#16)
combined_metric delegates to build_combined_metric, so register_metric affects both paths (#16)
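
To make the n_folds and test_size parameters concrete, here is a sketch of how expanding-window folds are commonly laid out: every fold trains on all data up to a cut-off and tests on the next test_size points. The helper name and the exact layout are assumptions; walk_forward_validation may place its folds differently.

```python
def expanding_window_folds(n_obs, n_folds, test_size, min_train_size=1):
    """Return (train_end, test_end) index pairs, oldest fold first."""
    first_train_end = n_obs - n_folds * test_size
    if n_folds < 1 or first_train_end < min_train_size:
        raise ValueError("not enough observations for the requested folds")
    folds = []
    for k in range(n_folds):
        # The training window always starts at 0 and grows by test_size per fold.
        train_end = first_train_end + k * test_size
        folds.append((train_end, train_end + test_size))
    return folds


for train_end, test_end in expanding_window_folds(n_obs=24, n_folds=3, test_size=4):
    print(f"train [0, {train_end})  test [{train_end}, {test_end})")
```

With n_folds=1 this degenerates to a single hold-out split, which is the case the relaxed n_folds >= 1 bound now allows.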

Ecosystem (Track G)

FeatureConfig.for_frequency(freq, **overrides) classmethod — MS / W / D / h defaults (#15)
EnsembleSpec docstring: inverse_cv_loss weighting caveats (#17)

See CHANGELOG.md for the full entry.

20 Apr 18:41

TomCardeLo

v2.2.0

df3580b

v2.2.0 — Tracks A/B/C/D: release hygiene, coverage, perf, extensibility

Additive release on the v2.x line bundling Tracks A / B / C / D of the post-v2.1.0 plan plus the A2 vectorised batch API. No breaking changes — sarima_bayes shim still emits DeprecationWarning and remains importable.

Highlights

Extensibility (Track D, #13)

Click CLI — boa-forecaster run | compare | validate (also python -m boa_forecaster). See docs/cli.md.
Pydantic v2 config schema — strongly-typed validation of config.yaml at load time.
EnsembleSpec — weighted or stacked ensemble over any registered ModelSpecs. See docs/ensemble.md.

Performance (Track C, #14)

Deterministic feature cache — calendar/trend features computed once per series, reused across walk-forward folds. ~30% speedup on 60-month series × 10 folds.
Parallel walk-forward CV — walk_forward_validation(..., n_jobs=1) via joblib.Parallel(backend="loky"); default preserves sequential behaviour.
np.isinf inf-check (optimizer._validate_series) — short-circuits on first inf, ~10–20× faster than the prior series.isin([np.inf, -np.inf]).any().
pytest-benchmark regression suite (tests/perf/) — weekly CI job compares against committed baseline.
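
The inf-check change above amounts to swapping a membership test for a direct floating-point test. A minimal NumPy sketch of both styles follows; the function names are hypothetical, and the membership variant uses np.isin as a stand-in for the pandas isin call mentioned in the notes.

```python
import numpy as np


def has_infinite(values):
    """Direct check: True if any element is +inf or -inf."""
    return bool(np.isinf(np.asarray(values, dtype=float)).any())


def has_infinite_via_membership(values):
    """Membership-style check, shown only for comparison."""
    arr = np.asarray(values, dtype=float)
    return bool(np.isin(arr, [np.inf, -np.inf]).any())


samples = [[1.0, 2.0], [1.0, float("inf")], [float("-inf")], [float("nan"), 3.0]]
for s in samples:
    # Both checks should agree; NaN is not treated as infinite by either.
    print(s, has_infinite(s), has_infinite_via_membership(s))
```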

Coverage (Track B, #12)

data_loader.py → 100% (new test_data_loader_errors.py)
validation.py → 98% (expanded test_validation.py, includes n_jobs=2)
benchmarks.py → 95% (new test_benchmarks_v2.py)

Release hygiene (Track A, #11)

Security scan CI step on push/PR.
test_optional_deps.py — asserts XGBoost/LightGBM specs degrade cleanly when extras are missing.
Internal cleanup of duplicate files inside the sarima_bayes/ shim (public shim surface preserved).

Performance (A2, #9)

weighted_moving_stats_batch — vectorised multi-series clipping.

Fixed

mypy errors on Python 3.11 CI.

Public API additions

boa_forecaster.EnsembleSpec
boa_forecaster.cli + boa-forecaster console entry point
boa_forecaster.config_schema (Pydantic models)
weighted_moving_stats_batch
walk_forward_validation(..., n_jobs=1)

Full changelog

See CHANGELOG.md and the compare view.

17 Apr 22:30

TomCardeLo

v2.1.0

561eef1

v2.1.0 — Phase A–E improvements on the v2 framework

Feature release on the v2.x line. Ships the full Phase A–E improvement plan (perf, tests, code quality, CI, docs) on top of the v2.0.0 framework foundation. No breaking API changes since v2.0.0 — additions and deprecations only.

Migration note. import sarima_bayes continues to work via a compatibility shim that re-exports the entire boa_forecaster API and emits a DeprecationWarning. pred_arima, forecast_arima, and optimize_arima also keep working but warn — they will be removed in v3.0.

Highlights

Reliability & observability

OptimizationResult.is_fallback: bool distinguishes a genuine optimum from a warm-start returned after a study-level crash; crash now logs at WARNING with exc_info=True instead of being silently swallowed. See ADR-002.
Thread-safe METRIC_REGISTRY via threading.Lock.
SARIMASpec.MAX_NON_SEASONAL_ORDER / MAX_SEASONAL_ORDER named constants replacing magic 4 / 3 thresholds.

Performance

weighted_moving_stats vectorised — new weighted_moving_stats_series helper using sliding_window_view. 18–130× faster, mathematically identical output.
fill_blanks vectorised — MultiIndex.from_product + reindex instead of cross-join + merge. ~1.2–1.5× faster, lower peak memory. (Behaviour change: duplicate (date, group) rows are now summed; pipelines running clean_zeros first are unaffected.)
recursive_forecast pre-allocated — 5–20× speedup on long horizons.
_validate_series early-exit via series.isin([np.inf, -np.inf]).any().
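
The fill_blanks item describes a common pandas pattern: build the full (date, group) grid with MultiIndex.from_product and reindex onto it. The sketch below assumes gaps are filled with 0 and uses a hypothetical function name and column names; it also shows why duplicate (date, group) rows end up summed.

```python
import pandas as pd


def fill_blanks_sketch(df, date_col, group_col, value_col, freq="MS"):
    """Return one row per (date, group) pair, with missing pairs set to 0."""
    # Aggregating first gives a unique index; duplicates are summed here.
    totals = df.groupby([date_col, group_col])[value_col].sum()
    dates = pd.date_range(df[date_col].min(), df[date_col].max(), freq=freq)
    full_index = pd.MultiIndex.from_product(
        [dates, df[group_col].unique()], names=[date_col, group_col]
    )
    return totals.reindex(full_index, fill_value=0).reset_index()


raw = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-03-01"]),
        "SKU": ["A", "A", "B"],
        "Sales": [5, 7, 2],
    }
)
filled = fill_blanks_sketch(raw, "Date", "SKU", "Sales")
print(filled)
```

The two January rows for SKU A collapse into a single row, which is the behaviour change the note warns about for inputs that still contain duplicates.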

Code quality

BaseMLSpec shared base for tree-based ML specs. Removes ~329 lines of duplication across RandomForestSpec / XGBoostSpec / LightGBMSpec. Subclasses override only _fit_final, search_space, warm_starts.
Type-annotation completeness pass across models/base.py, validation.py, features.py, data_loader.py.

Tests

SARIMA constraint enforcement (test_sarima_constraints.py).
Feature-leakage regression tests (test_features.py).
Benchmark silent-failure tests.
Full-pipeline integration test (tests/integration/test_full_pipeline.py).
19 Hypothesis property-based metric tests (test_metrics_property.py).
Optimizer 500-pt stress test (test_optimizer_stress.py, @pytest.mark.slow, < 30 s budget).

CI & tooling

mypy static type checking on Python 3.11 matrix entry.
Weekly slow-test job (Mondays 06:00 UTC, [dev,ml] extras, 20-min timeout).
Coverage threshold --cov-fail-under=80 on core + ML jobs.
hypothesis>=6.0 added to [dev] extras.

Documentation

Architecture Decision Records (docs/adr/):
- ADR-001 — ModelSpec as Protocol, not ABC
- ADR-002 — Optimizer soft-failure (is_fallback)
- ADR-003 — Combined objective 0.7·sMAPE + 0.3·RMSLE
Extension guide (docs/extending_models.md) — end-to-end walkthrough with a worked Prophet example, BaseMLSpec shortcut for tree models, test checklist, and pitfalls table.
Documented rationale for decaying weights [0.3, 0.2, 0.1] in standardization.py.

Deprecations

pred_arima, forecast_arima, optimize_arima — emit DeprecationWarning; removal in v3.0.
sarima_bayes package — emits DeprecationWarning on import; re-exports everything from boa_forecaster.

Full changelog: v2.0.0...v2.1.0

26 Mar 21:17

TomCardeLo

v2.0.0

5da9fc9

v2.0.0 — Multi-model forecasting framework

What's new

v2.0 turns the library from a SARIMA-only tool into a pluggable multi-model forecasting framework.

New models

Random Forest (RandomForestSpec) — scikit-learn, always available
XGBoost (XGBoostSpec) — optional extra: pip install -e ".[xgboost]"
LightGBM (LightGBMSpec) — optional extra: pip install -e ".[lightgbm]"

New API

optimize_model(series, model_spec, n_trials) — unified entry point for any model
ModelSpec protocol — add a new model in ~50 lines
FeatureEngineer — lags, rolling stats, calendar, trend features for ML models
run_model_comparison() — multi-model head-to-head comparison

Infrastructure

Primary package renamed to boa_forecaster; sarima_bayes is a deprecated compatibility shim (fully backward-compatible, emits DeprecationWarning)
CI split into test-core-only (Python 3.9/3.10/3.11) and test-ml-extras (Python 3.11 + ML libs)
368 unit tests + integration tests

Backward compatibility

All v1.x code continues to work:

from sarima_bayes import optimize_arima, forecast_arima  # emits DeprecationWarning

Recommended migration:

from boa_forecaster import optimize_model
from boa_forecaster.models import SARIMASpec

result = optimize_model(series, SARIMASpec(), n_trials=30)

Installation

pip install -e "."          # core (SARIMA + Random Forest)
pip install -e ".[ml]"      # + XGBoost + LightGBM
pip install -e ".[dev,ml]"  # + dev tools

24 Mar 20:23

TomCardeLo

v1.4.0

f9f2982

v1.4.0 — Optional Country/SKU columns

What's new

Optional Country and SKU columns — the data loader now accepts input files with or without these columns. When absent, the pipeline treats the entire dataset as a single group and skips per-group filtering.
Removed merge_representatives — the preprocessor no longer exposes this helper; grouping logic is handled transparently by the loader.

Breaking changes

None. Existing inputs with Country/SKU columns continue to work unchanged.

Upgrade

pip install --upgrade boa-sarima-forecaster

24 Mar 15:32

TomCardeLo

v1.3.0

2f6320e

v1.3.0 — Configurable Metric Composition

What's new

The Bayesian optimiser objective is now fully configurable. Instead of being locked to 0.7 × sMAPE + 0.3 × RMSLE, any weighted combination of built-in metrics can be used — making the library applicable beyond demand forecasting.

New metrics

| Name | Formula | Best suited for |
| --- | --- | --- |
| `mae` | `mean(\|y − ŷ\|)` | Revenue, price — absolute scale matters |
| `rmse` | `√mean((y − ŷ)²)` | Penalises large deviations |
| `mape` | `100 × mean(\|y − ŷ\| / (\|y\| + ε))` | Clean series without zeros |

New API

METRIC_REGISTRY — dict mapping metric names to callables
build_combined_metric(components) — factory that builds any weighted objective
optimize_arima(..., metric_components=[...]) — pass a custom objective directly
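
A registry plus a factory of this kind can be sketched as follows. The component shape (a list of metric/weight pairs) mirrors the metrics.components configuration keys, but everything else is an assumption: the _sketch names are hypothetical, and the sMAPE shown is one common variant that may differ from the library's definition.

```python
import numpy as np


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (one common variant, assumed here)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Points where both values are zero contribute zero error.
    ratio = np.divide(np.abs(y_true - y_pred), denom,
                      out=np.zeros_like(denom), where=denom > 0)
    return float(100.0 * ratio.mean())


def rmsle(y_true, y_pred):
    """Root mean squared log error for non-negative values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


REGISTRY_SKETCH = {"smape": smape, "rmsle": rmsle}


def build_combined_metric_sketch(components):
    """components: list of {"metric": name, "weight": w} dicts."""
    parts = [(REGISTRY_SKETCH[c["metric"]], float(c["weight"])) for c in components]

    def combined(y_true, y_pred):
        return sum(w * fn(y_true, y_pred) for fn, w in parts)

    return combined


objective = build_combined_metric_sketch(
    [{"metric": "smape", "weight": 0.7}, {"metric": "rmsle", "weight": 0.3}]
)
print(objective([100.0, 200.0], [110.0, 190.0]))
```

Because the metrics live on different scales (percent versus log units), the weights control more than relative importance; rescaling a component changes the effective mix.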

Configuration

metrics:
  components:
    - metric: smape
      weight: 0.7
    - metric: rmsle
      weight: 0.3

Backward compatibility

Default behaviour (0.7 × sMAPE + 0.3 × RMSLE) is unchanged. All existing call sites continue to work without modification.

Changes

src/sarima_bayes/metrics.py — mae, rmse, mape, METRIC_REGISTRY, build_combined_metric
src/sarima_bayes/config.py — DEFAULT_METRIC_COMPONENTS
config.example.yaml — metrics.components section
src/sarima_bayes/optimizer.py — metric_components kwarg
src/sarima_bayes/__init__.py — new public exports
tests/unit/test_metrics.py — 15 new tests (96 total, 100% metrics coverage)
README.md — new Configurable Metric section

24 Mar 00:36

TomCardeLo

v1.2.0

43c60d8

v1.2.0 — Configurable Time-Series Frequency

What's Changed

Added

Configurable frequency — the pipeline now works with any pandas DateOffset alias, not only monthly "MS". Pass freq to set the sampling rate and m for the seasonal period:
- pred_arima, forecast_arima, forecast_arima_with_group — new freq: str = "MS" parameter
- validate_by_group — new freq: str = "MS" parameter
- ets_model — new m: int = 12 parameter
- auto_arima_nixtla — new m: int = 12 and freq: str = "MS" parameters
- run_benchmark_comparison — new m: int = 12 and freq: str = "MS" parameters, forwarded to all baselines
_freq_to_period_alias helper in preprocessor.py — maps DateOffset aliases ("MS", "W", "D", "H") to Period aliases required by pd.Series.dt.to_period()
data.freq key in config.example.yaml with alias/seasonal_period coupling table
27 new tests — test_preprocessor.py (19), new benchmark and validation coverage (8)

Changed

preprocessor.fill_blanks — date normalisation is now freq-aware; weekly ("W") uses end-of-period convention to align with pd.date_range Sunday anchoring
config.example.yaml — model.sarima.seasonal_period comment shows recommended m per frequency

Backward Compatibility

All new parameters default to freq="MS" / m=12. Zero existing call sites require changes.

Usage Examples

# Weekly data, annual seasonality
fill_blanks(df, freq="W")
pred_arima(df, "Date", "Sales", order=(1,1,1), freq="W")
run_benchmark_comparison(df, ..., freq="W", m=52)

# Daily data, weekly seasonality
ets_model(train, forecast_horizon=7, m=7)
validate_by_group(df, ..., freq="D", n_folds=3, test_size=7, min_train_size=28)

Full Changelog

v1.1.0...v1.2.0

24 Mar 00:03

TomCardeLo

v1.1.0

c7b5585

v1.1.0 — Configurable Outlier Clipping Threshold

What's Changed

Added

Configurable outlier-clipping threshold — clip_outliers and weighted_moving_stats now accept a threshold parameter (default 2.5). Previously the σ multiplier was hard-coded; it can now be set per-call or globally via config.yaml under standardization.threshold.

Changed

config.example.yaml — added standardization.threshold: 2.5 key so users can tune sensitivity without touching source code.
docs/methodology.md — updated standardisation section to document the new parameter.

Fixed

Renamed internal parameter sigma_threshold → threshold in clip_outliers to match the public API expected by the test suite.
Resolved ruff lint errors and applied black auto-formatting to config.py that were blocking CI.

Full Changelog

v1.0.0...v1.1.0

18 Mar 04:41

TomCardeLo

v1.0.0

825fcef

v1.0.0 — Initial public release

1.0.0 — 2026-03-17

Added

SARIMA + Bayesian Optimisation pipeline — end-to-end demand forecasting using Optuna TPE to search ARIMA orders (p, d, q) and seasonal orders (P, D, Q, m).
Walk-forward (expanding-window) cross-validation — prevents look-ahead bias by evaluating each fold on true out-of-sample periods.
Benchmark comparison — walk-forward results compared against Seasonal Naïve, ETS (Holt-Winters), and AutoARIMA (statsforecast) baselines.
Weighted moving-average outlier standardisation — clips demand observations to ±1σ of their neighbourhood; both raw and adjusted series are modelled and the better one is selected automatically.
sMAPE and RMSLE metrics — combined cost function 0.7 × sMAPE + 0.3 × RMSLE used as the Optuna objective; both metrics available individually via sarima_bayes.metrics.
Demo notebook (notebooks/demo.ipynb) — end-to-end walkthrough using synthetic data; no real data required.
pytest test suite (tests/) — unit and integration tests with coverage reporting.
GitHub Actions CI (.github/workflows/ci.yml) — runs linting (ruff, black) and the full test suite on every push and pull request.
Full type hints and Google-style docstrings — all 19 public functions across src/sarima_bayes/ annotated with Python 3.10+ X | Y union syntax, Args, Returns, Raises, and Example sections.
config.yaml / config.example.yaml — YAML-driven configuration for data paths, optimisation budget, forecast horizon, and output location.
docs/methodology.md — detailed technical description of the five-stage pipeline.
Forecast plot (docs/img/forecast_example.png) — example output image showing training history, last-24-months actuals, point forecast, and 80%/95% CI bands; generated reproducibly via scripts/generate_plots.py.
