Releases: TomCardeLo/boa-forecaster

v2.4.0 — Probabilistic, Regulatory & Deep-Learning Horizons

22 Apr 22:45

Track H release closing the post-v2.3 backlog plus 2026-04-20 feedback
from the CAR PM2.5 hourly pipeline (tasks/feedback_aire.md).
No breaking API changes — additive and behaviour-tightening only.

Highlights

New model families

  • ProphetSpec (H1) — Meta's Prophet for trend + seasonality + holidays. Behind the prophet extra.
  • QuantileMLSpec (H2) — probabilistic forecasts via LightGBM/XGBoost quantile objectives + new metrics_probabilistic.py (pinball_loss, interval_coverage).
  • LSTMSpec (H3) — PyTorch LSTM baseline behind the deep extra (deliberately not in [all]).
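
For readers new to the two probabilistic metrics named above, here is a minimal sketch of how pinball loss and interval coverage are commonly defined. The signatures are assumptions for illustration; the actual functions in metrics_probabilistic.py may take different arguments.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], quantile: float) -> float:
    """Mean pinball (quantile) loss for a single quantile level in (0, 1)."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    total = 0.0
    for y, q_hat in zip(y_true, y_pred):
        diff = y - q_hat
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


def interval_coverage(
    y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


# Illustrative numbers only (not taken from any real pipeline).
actuals = [10.0, 12.0, 9.0, 15.0]
q10 = [8.0, 10.0, 9.5, 11.0]
q90 = [12.0, 13.0, 11.0, 14.0]
print(pinball_loss(actuals, q90, 0.9))
print(interval_coverage(actuals, q10, q90))
```

A well-calibrated 10%–90% interval should cover roughly 80% of observations, so coverage far from that figure suggests the quantile models need retuning.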

Regulatory metrics & presets (feedback_aire §2)

  • hit_rate_weighted + f1_by_bucket in core metrics.py (H7-core).
  • New presets/air_quality.py: ICA_EDGES_PM25_CO2017, ICA_EDGES_PM25_USAQI, hit_rate_ica, hit_rate_ica_weighted (H7-presets).
  • First preset pack — opens the door for presets/demand.py, presets/energy.py, presets/finance.py in v2.5+.

Hourly SARIMA (feedback_aire §1 residual)

  • SARIMASpec.for_frequency(freq) classmethod with frequency-aware seasonal_period defaults; tuneable [24, 168] on hourly data (H8).
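
The idea of frequency-aware seasonal defaults can be sketched as a simple lookup. The table and function below are hypothetical and are not the library's implementation; the values only mirror pairings quoted elsewhere in these notes (m=12 for "MS", m=52 for "W", m=7 for "D", and [24, 168] for hourly data).

```python
from __future__ import annotations

# Hypothetical lookup table; the real defaults inside SARIMASpec may differ.
SEASONAL_PERIOD_CANDIDATES: dict[str, list[int]] = {
    "MS": [12],      # monthly data, annual cycle
    "W": [52],       # weekly data, annual cycle
    "D": [7],        # daily data, weekly cycle
    "h": [24, 168],  # hourly data, daily and weekly cycles
}


def seasonal_period_candidates(freq: str) -> list[int]:
    """Return the seasonal periods worth searching for a given frequency alias."""
    try:
        return list(SEASONAL_PERIOD_CANDIDATES[freq])
    except KeyError:
        raise ValueError(f"no seasonal-period default for freq={freq!r}") from None


print(seasonal_period_candidates("h"))
```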

Ensemble safety & high-volatility WMA (feedback_aire §3 + §4)

  • EnsembleSpec warns when inverse_cv_loss mixes early-stopping and full-fold members (H9a). New uses_early_stopping flag on ModelSpec.
  • WMA_THRESHOLD_HIGH_VOLATILITY = 3.5 named constant for peaky series (H9b).
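
As background for the H9a warning, inverse-loss weighting is usually computed as sketched below. This is a generic illustration under the assumption that each member reports one positive CV loss; the function name is hypothetical and the library's exact formula is not shown in these notes.

```python
from __future__ import annotations


def inverse_loss_weights(cv_losses: dict[str, float]) -> dict[str, float]:
    """Weight each member proportionally to 1 / loss, normalised to sum to 1."""
    if any(loss <= 0 for loss in cv_losses.values()):
        raise ValueError("losses must be strictly positive")
    inverse = {name: 1.0 / loss for name, loss in cv_losses.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Illustrative losses: the member with half the loss gets twice the weight.
weights = inverse_loss_weights({"sarima": 0.20, "lightgbm": 0.10})
print(weights)
```

Because the weights depend only on the reported losses, members whose losses were measured under different conditions are not directly comparable, which is the situation the new warning flags.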

Post-training seasonal bias correction (feedback_aire §5)

  • postprocess.py: compute_seasonal_bias + apply_seasonal_bias, mirroring CAR's production sesgo_mensual_para_ajuste.csv pattern (H5).
  • optimize_model(..., apply_bias_correction=True) opt-in kwarg + CLI --bias-correction flag.
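
A minimal sketch of the compute-then-apply pattern follows, assuming an additive correction keyed by calendar month. The function names are hypothetical stand-ins; the real helpers in postprocess.py may group by a different season or use a different correction.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import date
from typing import Sequence


def monthly_bias(
    dates: Sequence[date], y_true: Sequence[float], y_pred: Sequence[float]
) -> dict[int, float]:
    """Mean residual (actual minus forecast) for each calendar month."""
    sums: dict[int, float] = defaultdict(float)
    counts: dict[int, int] = defaultdict(int)
    for d, y, y_hat in zip(dates, y_true, y_pred):
        sums[d.month] += y - y_hat
        counts[d.month] += 1
    return {month: sums[month] / counts[month] for month in sums}


def apply_monthly_bias(
    dates: Sequence[date], y_pred: Sequence[float], bias: dict[int, float]
) -> list[float]:
    """Shift each forecast by its month's bias; months never seen get no shift."""
    return [y_hat + bias.get(d.month, 0.0) for d, y_hat in zip(dates, y_pred)]


# Illustrative history: the model under-forecasts January by 3 on average.
history = [date(2024, 1, 1), date(2025, 1, 1), date(2024, 2, 1)]
bias = monthly_bias(history, [10.0, 14.0, 20.0], [8.0, 10.0, 21.0])
print(apply_monthly_bias([date(2026, 1, 1), date(2026, 3, 1)], [9.0, 5.0], bias))
```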

Pydantic finishing touches

  • BoaConfig.from_dict, Literal validators on sub-models, --strict CLI flag to flip extra="allow" to extra="forbid" (H4).

Release gate

  • 814 tests passed, 91% coverage
  • ruff / black / bandit (no high/medium): clean
  • boa-forecaster run --strict smoke-validates config.example.yaml

Contributors

Track H executed by parallel Sonnet implementer subagents under Opus orchestration; reviews by Opus code-reviewer.

Special thanks to Daniel Méndez and the CAR / Cundinamarca PM2.5 hourly pipeline team (Bogotá + Cundinamarca, 34 monitoring stations, 2016–2026) for the 2026-04-20 feedback that drove H5, H7, H8, and H9.

Full diff: v2.3.0...v2.4.0

v2.3.0 — Correctness & Ecosystem

20 Apr 19:53

Correctness & ecosystem release bundling Tracks E/F/G of the post-v2.2 plan.

Four silent-correctness bugs fixed, quality-hardening touch-ups on validation/metric/preprocessor, and small ecosystem primitives surfaced by a real-world consumer. No breaking API changes.

Highlights

Fixed — Correctness (Track E)

  • EnsembleSpec.needs_features now a @property reflecting members (#17)
  • BaseMLSpec auto-injects forecast_horizon into default FeatureConfig.lag_periods (#17)
  • Optuna MedianPruner(n_startup_trials=5, n_warmup_steps=1) wired into optimize_model — 20–40% faster TPE (#17)

Performance (Track E)

  • build_ensemble parallelised via joblib.Parallel with new n_jobs kwarg — ~75% faster on 4-member ensembles (#17)

Quality hardening (Track F)

  • New hit_rate(y_true, y_pred, edges) metric for bucket-accuracy reporting (#16)
  • New flag_intermittent(df, group_cols, value_col, threshold=0.7) preprocessor helper (#16)
  • walk_forward_validation now accepts n_folds >= 1 and forecast_horizon= default for test_size (#16)
  • combined_metric delegates to build_combined_metric, so register_metric affects both paths (#16)
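
Bucket accuracy, as reported by a metric like hit_rate, can be sketched as follows. This is an illustrative reimplementation with a hypothetical name, not the library's code; in particular, placing a value that equals an edge into the upper bucket is an assumption.

```python
from bisect import bisect_right
from typing import Sequence


def bucket_hit_rate(
    y_true: Sequence[float], y_pred: Sequence[float], edges: Sequence[float]
) -> float:
    """Share of forecasts that land in the same bucket as the actual value.

    `edges` must be sorted ascending; n edges define n + 1 buckets.
    """
    hits = sum(
        1
        for actual, forecast in zip(y_true, y_pred)
        if bisect_right(edges, actual) == bisect_right(edges, forecast)
    )
    return hits / len(y_true)


# Illustrative edges only; these are not regulatory breakpoints.
edges = [12.0, 37.0, 55.0]
print(bucket_hit_rate([5.0, 20.0, 40.0, 60.0], [10.0, 40.0, 38.0, 70.0], edges))
```

A metric of this shape rewards forecasts for landing in the right category even when the point error is large, which matters when decisions are driven by category rather than by the exact value.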

Ecosystem (Track G)

  • FeatureConfig.for_frequency(freq, **overrides) classmethod — MS / W / D / h defaults (#15)
  • EnsembleSpec docstring: inverse_cv_loss weighting caveats (#17)

See CHANGELOG.md for the full entry.

v2.2.0 — Tracks A/B/C/D: release hygiene, coverage, perf, extensibility

20 Apr 18:41

Additive release on the v2.x line bundling Tracks A / B / C / D of the post-v2.1.0 plan plus the A2 vectorised batch API. No breaking changes — sarima_bayes shim still emits DeprecationWarning and remains importable.

Highlights

Extensibility (Track D, #13)

  • Click CLI: boa-forecaster run | compare | validate (also python -m boa_forecaster). See docs/cli.md.
  • Pydantic v2 config schema — strongly-typed validation of config.yaml at load time.
  • EnsembleSpec — weighted or stacked ensemble over any registered ModelSpecs. See docs/ensemble.md.

Performance (Track C, #14)

  • Deterministic feature cache — calendar/trend features computed once per series, reused across walk-forward folds. ~30% speedup on 60-month series × 10 folds.
  • Parallel walk-forward CV: walk_forward_validation(..., n_jobs=1) via joblib.Parallel(backend="loky"); default preserves sequential behaviour.
  • np.isinf inf-check (optimizer._validate_series) — short-circuits on first inf, ~10–20× faster than the prior series.isin([np.inf, -np.inf]).any().
  • pytest-benchmark regression suite (tests/perf/) — weekly CI job compares against committed baseline.

Coverage (Track B, #12)

  • data_loader.py: 100% (new test_data_loader_errors.py)
  • validation.py: 98% (expanded test_validation.py, includes n_jobs=2)
  • benchmarks.py: 95% (new test_benchmarks_v2.py)

Release hygiene (Track A, #11)

  • Security scan CI step on push/PR.
  • test_optional_deps.py — asserts XGBoost/LightGBM specs degrade cleanly when extras are missing.
  • Internal cleanup of duplicate files inside the sarima_bayes/ shim (public shim surface preserved).

Performance (A2, #9)

  • weighted_moving_stats_batch — vectorised multi-series clipping.

Fixed

  • mypy errors on Python 3.11 CI.

Public API additions

  • boa_forecaster.EnsembleSpec
  • boa_forecaster.cli + boa-forecaster console entry point
  • boa_forecaster.config_schema (Pydantic models)
  • weighted_moving_stats_batch
  • walk_forward_validation(..., n_jobs=1)

Full changelog

See CHANGELOG.md and the compare view.

v2.1.0 — Phase A–E improvements on the v2 framework

17 Apr 22:30

Feature release on the v2.x line. Ships the full Phase A–E improvement plan (perf, tests, code quality, CI, docs) on top of the v2.0.0 framework foundation. No breaking API changes since v2.0.0 — additions and deprecations only.

Migration note. import sarima_bayes continues to work via a compatibility shim that re-exports the entire boa_forecaster API and emits a DeprecationWarning. pred_arima, forecast_arima, and optimize_arima also keep working but warn — they will be removed in v3.0.

Highlights

Reliability & observability

  • OptimizationResult.is_fallback: bool distinguishes a genuine optimum from a warm-start returned after a study-level crash; crash now logs at WARNING with exc_info=True instead of being silently swallowed. See ADR-002.
  • Thread-safe METRIC_REGISTRY via threading.Lock.
  • SARIMASpec.MAX_NON_SEASONAL_ORDER / MAX_SEASONAL_ORDER named constants replacing magic 4 / 3 thresholds.

Performance

  • weighted_moving_stats vectorised — new weighted_moving_stats_series helper using sliding_window_view. 18–130× faster, mathematically identical output.
  • fill_blanks vectorised: MultiIndex.from_product + reindex instead of cross-join + merge. ~1.2–1.5× faster, lower peak memory. (Behaviour change: duplicate (date, group) rows are now summed; pipelines running clean_zeros first are unaffected.)
  • recursive_forecast pre-allocated — 5–20× speedup on long horizons.
  • _validate_series early-exit via series.isin([np.inf, -np.inf]).any().

Code quality

  • BaseMLSpec shared base for tree-based ML specs. Removes ~329 lines of duplication across RandomForestSpec / XGBoostSpec / LightGBMSpec. Subclasses override only _fit_final, search_space, warm_starts.
  • Type-annotation completeness pass across models/base.py, validation.py, features.py, data_loader.py.

Tests

  • SARIMA constraint enforcement (test_sarima_constraints.py).
  • Feature-leakage regression tests (test_features.py).
  • Benchmark silent-failure tests.
  • Full-pipeline integration test (tests/integration/test_full_pipeline.py).
  • 19 Hypothesis property-based metric tests (test_metrics_property.py).
  • Optimizer 500-pt stress test (test_optimizer_stress.py, @pytest.mark.slow, < 30 s budget).

CI & tooling

  • mypy static type checking on Python 3.11 matrix entry.
  • Weekly slow-test job (Mondays 06:00 UTC, [dev,ml] extras, 20-min timeout).
  • Coverage threshold --cov-fail-under=80 on core + ML jobs.
  • hypothesis>=6.0 added to [dev] extras.

Documentation

  • Architecture Decision Records (docs/adr/):
    • ADR-001 — ModelSpec as Protocol, not ABC
    • ADR-002 — Optimizer soft-failure (is_fallback)
    • ADR-003 — Combined objective 0.7·sMAPE + 0.3·RMSLE
  • Extension guide (docs/extending_models.md) — end-to-end walkthrough with a worked Prophet example, BaseMLSpec shortcut for tree models, test checklist, and pitfalls table.
  • Documented rationale for decaying weights [0.3, 0.2, 0.1] in standardization.py.

Deprecations

  • pred_arima, forecast_arima, optimize_arima — emit DeprecationWarning; removal in v3.0.
  • sarima_bayes package — emits DeprecationWarning on import; re-exports everything from boa_forecaster.

Full changelog: v2.0.0...v2.1.0

v2.0.0 — Multi-model forecasting framework

26 Mar 21:17

What's new

v2.0 turns the library from a SARIMA-only tool into a pluggable multi-model forecasting framework.

New models

  • Random Forest (RandomForestSpec) — scikit-learn, always available
  • XGBoost (XGBoostSpec) — optional extra: pip install -e ".[xgboost]"
  • LightGBM (LightGBMSpec) — optional extra: pip install -e ".[lightgbm]"

New API

  • optimize_model(series, model_spec, n_trials) — unified entry point for any model
  • ModelSpec protocol — add a new model in ~50 lines
  • FeatureEngineer — lags, rolling stats, calendar, trend features for ML models
  • run_model_comparison() — multi-model head-to-head comparison

Infrastructure

  • Primary package renamed to boa_forecaster; sarima_bayes is a deprecated compatibility shim (fully backward-compatible, emits DeprecationWarning)
  • CI split into test-core-only (Python 3.9/3.10/3.11) and test-ml-extras (Python 3.11 + ML libs)
  • 368 unit tests + integration tests

Backward compatibility

All v1.x code continues to work:

from sarima_bayes import optimize_arima, forecast_arima  # emits DeprecationWarning

Recommended migration:

from boa_forecaster import optimize_model
from boa_forecaster.models import SARIMASpec

result = optimize_model(series, SARIMASpec(), n_trials=30)

Installation

pip install -e "."          # core (SARIMA + Random Forest)
pip install -e ".[ml]"      # + XGBoost + LightGBM
pip install -e ".[dev,ml]"  # + dev tools

v1.4.0 — Optional Country/SKU columns

24 Mar 20:23

What's new

  • Optional Country and SKU columns — the data loader now accepts input files with or without these columns. When absent, the pipeline treats the entire dataset as a single group and skips per-group filtering.
  • Removed merge_representatives — the preprocessor no longer exposes this helper; grouping logic is handled transparently by the loader.

Breaking changes

None. Existing inputs with Country/SKU columns continue to work unchanged.

Upgrade

pip install --upgrade boa-sarima-forecaster

v1.3.0 — Configurable Metric Composition

24 Mar 15:32

What's new

The Bayesian optimiser objective is now fully configurable. Instead of being locked to 0.7 × sMAPE + 0.3 × RMSLE, any weighted combination of built-in metrics can be used — making the library applicable beyond demand forecasting.

New metrics

Name | Formula                         | Best suited for
mae  | mean(|y − ŷ|)                   | Revenue, price (absolute scale matters)
rmse | √mean((y − ŷ)²)                 | Penalises large deviations
mape | 100 × mean(|y − ŷ| / (|y| + ε)) | Clean series without zeros

New API

  • METRIC_REGISTRY — dict mapping metric names to callables
  • build_combined_metric(components) — factory that builds any weighted objective
  • optimize_arima(..., metric_components=[...]) — pass a custom objective directly
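
The registry-plus-factory pattern behind a configurable objective can be sketched like this. All names below are hypothetical stand-ins rather than the library's code; mae and rmse follow the formulas in the table above, and the component format mirrors the metric/weight keys used in the YAML configuration.

```python
import math
from typing import Callable, Dict, List, Sequence

Metric = Callable[[Sequence[float], Sequence[float]], float]


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical registry: metric name -> callable.
REGISTRY: Dict[str, Metric] = {"mae": mae, "rmse": rmse}


def build_weighted_metric(components: List[dict]) -> Metric:
    """Resolve names once, then return a callable computing the weighted sum."""
    resolved = [(REGISTRY[c["metric"]], float(c["weight"])) for c in components]

    def combined(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
        return sum(weight * fn(y_true, y_pred) for fn, weight in resolved)

    return combined


objective = build_weighted_metric(
    [{"metric": "mae", "weight": 0.7}, {"metric": "rmse", "weight": 0.3}]
)
print(objective([1.0, 2.0, 3.0], [1.0, 2.0, 6.0]))
```

Resolving names at build time means an unknown metric name fails immediately with a KeyError instead of partway through an optimisation run.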

Configuration

metrics:
  components:
    - metric: smape
      weight: 0.7
    - metric: rmsle
      weight: 0.3

Backward compatibility

Default behaviour (0.7 × sMAPE + 0.3 × RMSLE) is unchanged. All existing call sites continue to work without modification.

Changes

  • src/sarima_bayes/metrics.py: mae, rmse, mape, METRIC_REGISTRY, build_combined_metric
  • src/sarima_bayes/config.py: DEFAULT_METRIC_COMPONENTS
  • config.example.yaml: metrics.components section
  • src/sarima_bayes/optimizer.py: metric_components kwarg
  • src/sarima_bayes/__init__.py: new public exports
  • tests/unit/test_metrics.py: 15 new tests (96 total, 100% metrics coverage)
  • README.md: new Configurable Metric section

v1.2.0 — Configurable Time-Series Frequency

24 Mar 00:36

What's Changed

Added

  • Configurable frequency — the pipeline now works with any pandas DateOffset alias, not only monthly "MS". Pass freq to set the sampling rate and m for the seasonal period:
    • pred_arima, forecast_arima, forecast_arima_with_group — new freq: str = "MS" parameter
    • validate_by_group — new freq: str = "MS" parameter
    • ets_model — new m: int = 12 parameter
    • auto_arima_nixtla — new m: int = 12 and freq: str = "MS" parameters
    • run_benchmark_comparison — new m: int = 12 and freq: str = "MS" parameters, forwarded to all baselines
  • _freq_to_period_alias helper in preprocessor.py — maps DateOffset aliases ("MS", "W", "D", "H") to Period aliases required by pd.Series.dt.to_period()
  • data.freq key in config.example.yaml with alias/seasonal_period coupling table
  • 27 new tests: test_preprocessor.py (19), new benchmark and validation coverage (8)
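
The alias translation described for _freq_to_period_alias can be pictured as a small lookup. The sketch below is a hypothetical stand-in, not the private helper itself; it assumes the month-start offset "MS" maps to the monthly Period alias "M" while the other three aliases pass through unchanged.

```python
# Hypothetical stand-in for the private helper; the real mapping may differ.
_PERIOD_ALIASES = {"MS": "M", "W": "W", "D": "D", "H": "H"}


def freq_to_period_alias(freq: str) -> str:
    """Translate a DateOffset alias into the matching Period alias."""
    try:
        return _PERIOD_ALIASES[freq]
    except KeyError:
        raise ValueError(f"unsupported frequency alias: {freq!r}") from None


print(freq_to_period_alias("MS"))
```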

Changed

  • preprocessor.fill_blanks — date normalisation is now freq-aware; weekly ("W") uses end-of-period convention to align with pd.date_range Sunday anchoring
  • config.example.yaml: the model.sarima.seasonal_period comment shows the recommended m per frequency

Backward Compatibility

All new parameters default to freq="MS" / m=12. Zero existing call sites require changes.

Usage Examples

# Weekly data, annual seasonality
fill_blanks(df, freq="W")
pred_arima(df, "Date", "Sales", order=(1,1,1), freq="W")
run_benchmark_comparison(df, ..., freq="W", m=52)

# Daily data, weekly seasonality
ets_model(train, forecast_horizon=7, m=7)
validate_by_group(df, ..., freq="D", n_folds=3, test_size=7, min_train_size=28)

Full Changelog

v1.1.0...v1.2.0

v1.1.0 — Configurable Outlier Clipping Threshold

24 Mar 00:03

What's Changed

Added

  • Configurable outlier-clipping threshold: clip_outliers and weighted_moving_stats now accept a threshold parameter (default 2.5). Previously the σ multiplier was hard-coded; it can now be set per-call or globally via config.yaml under standardization.threshold.
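
To show what the σ multiplier controls, here is a deliberately simplified sketch. It clips against the global mean and population standard deviation, whereas the library works from weighted moving statistics, so treat the function (a hypothetical name) as an illustration of the threshold only.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def clip_to_band(values: Sequence[float], threshold: float = 2.5) -> List[float]:
    """Clip each value into [mean - threshold * sigma, mean + threshold * sigma]."""
    centre = mean(values)
    sigma = pstdev(values)
    lower = centre - threshold * sigma
    upper = centre + threshold * sigma
    return [min(max(v, lower), upper) for v in values]


# Illustrative series with one spike: mean is 28 and sigma is 36.
series = [10.0, 10.0, 10.0, 10.0, 100.0]
print(clip_to_band(series, threshold=1.0))  # the spike is pulled down to 28 + 36
print(clip_to_band(series))                 # default 2.5 leaves the spike untouched
```

A smaller threshold clips more aggressively, so peaky series generally call for a larger value to avoid flattening genuine peaks.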

Changed

  • config.example.yaml — added standardization.threshold: 2.5 key so users can tune sensitivity without touching source code.
  • docs/methodology.md — updated standardisation section to document the new parameter.

Fixed

  • Renamed internal parameter sigma_threshold to threshold in clip_outliers to match the public API expected by the test suite.
  • Resolved ruff lint errors and applied black auto-formatting to config.py that were blocking CI.

Full Changelog

v1.0.0...v1.1.0

v1.0.0 — Initial public release

18 Mar 04:41

1.0.0 — 2026-03-17

Added

  • SARIMA + Bayesian Optimisation pipeline — end-to-end demand forecasting using
    Optuna TPE to search ARIMA orders (p, d, q) and seasonal orders (P, D, Q, m).
  • Walk-forward (expanding-window) cross-validation — prevents look-ahead bias by
    evaluating each fold on true out-of-sample periods.
  • Benchmark comparison — walk-forward results compared against Seasonal Naïve,
    ETS (Holt-Winters), and AutoARIMA (statsforecast) baselines.
  • Weighted moving-average outlier standardisation — clips demand observations to
    ±1σ of their neighbourhood; both raw and adjusted series are modelled and the better
    one is selected automatically.
  • sMAPE and RMSLE metrics — combined cost function 0.7 × sMAPE + 0.3 × RMSLE
    used as the Optuna objective; both metrics available individually via sarima_bayes.metrics.
  • Demo notebook (notebooks/demo.ipynb) — end-to-end walkthrough using synthetic
    data; no real data required.
  • pytest test suite (tests/) — unit and integration tests with coverage reporting.
  • GitHub Actions CI (.github/workflows/ci.yml) — runs linting (ruff, black) and
    the full test suite on every push and pull request.
  • Full type hints and Google-style docstrings — all 19 public functions across
    src/sarima_bayes/ annotated with Python 3.10+ X | Y union syntax, Args, Returns,
    Raises, and Example sections.
  • config.yaml / config.example.yaml — YAML-driven configuration for data paths,
    optimisation budget, forecast horizon, and output location.
  • docs/methodology.md — detailed technical description of the five-stage pipeline.
  • Forecast plot (docs/img/forecast_example.png) — example output image showing
    training history, last-24-months actuals, point forecast, and 80%/95% CI bands;
    generated reproducibly via scripts/generate_plots.py.
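
The expanding-window scheme mentioned above can be sketched as an index generator. The fold layout here (test windows packed against the end of the series) and the function name are assumptions for illustration; the library's walk-forward routine may place folds differently.

```python
from typing import Iterator, Tuple


def expanding_window_splits(
    n_obs: int, n_folds: int, test_size: int, min_train_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) pairs; every fold trains on all earlier points."""
    first_test_start = n_obs - n_folds * test_size
    if first_test_start < min_train_size:
        raise ValueError("series too short for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        # The training window always starts at 0 and grows by test_size per fold.
        yield range(0, test_start), range(test_start, test_start + test_size)


for train_idx, test_idx in expanding_window_splits(12, n_folds=3, test_size=2, min_train_size=4):
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Because every test index is strictly greater than every training index in the same fold, no future observation can leak into model fitting.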