Forecasting

A modular Python package for time series forecasting, entropy-rate estimation, and geospatial distance computation.

This project includes:

  • Lempel-Ziv-based entropy rate estimators (for strings or symbolic sequences)
  • SARIMAX forecasting with optional endogenous (weekly patterns) and exogenous drivers
  • Haversine formula for computing great-circle distances between geographic coordinates

Installation

pip install -e .

This installs the forecasting package from the src/ directory in editable mode (requires Python ≥ 3.8).


Package Structure

src/forecasting/
├── entropy.py       # Entropy rate estimators
├── forecasting.py   # SARIMAX-based time series forecasting
├── geo.py           # Haversine distance utility
├── substring.py     # Substring/pattern match utilities
└── __init__.py

Features

1. Entropy Rate Estimation

from forecasting.entropy import get_entropy_rate_str, get_entropy_rate_fast, get_entropy_rate_lz

# Plain string input
seq = 'abcabcabcabc'
rate = get_entropy_rate_str(seq)

# Symbolic sequence input (a list of symbols)
sym_seq = ['1', '3', '5', '5', '0', '10', '27']
rate_fast = get_entropy_rate_fast(sym_seq)
rate_lz = get_entropy_rate_lz(sym_seq)
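
To give a feel for what a Lempel-Ziv-based estimator computes, here is a minimal sketch of one common match-length variant: for each position i it finds Λ_i, the length of the shortest run starting at i that has not appeared earlier, and returns n·log2(n) / ΣΛ_i. This is an illustration only, not the package's code. The names lz_entropy_rate_sketch and _occurs are hypothetical, and the functions in forecasting.entropy may use a different convention or normalization.

```python
import math


def _occurs(needle, haystack):
    """Return True if `needle` appears as a contiguous run inside `haystack`."""
    m = len(needle)
    return any(haystack[j:j + m] == needle for j in range(len(haystack) - m + 1))


def lz_entropy_rate_sketch(seq):
    """Match-length entropy-rate estimate in bits per symbol (illustrative)."""
    symbols = list(seq)  # accepts a string or a list of symbols
    n = len(symbols)
    if n < 2:
        return 0.0
    total = 0
    for i in range(n):
        # Grow the window until it no longer occurs in the prefix symbols[:i]
        # (or until it runs past the end of the sequence).
        k = 1
        while i + k <= n and _occurs(symbols[i:i + k], symbols[:i]):
            k += 1
        total += k
    return n * math.log2(n) / total


repetitive = lz_entropy_rate_sketch("aaaaaaaaaaaa")
varied = lz_entropy_rate_sketch("abcdefghijkl")
print(repetitive < varied)  # a repetitive sequence gets the lower estimate
```

The quadratic search is kept for readability; it is fine for short sequences but slow on long ones.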

2. SARIMAX Forecasting Pipeline

Forecasts a time series using SARIMAX, with optional drivers.

from forecasting.forecasting import run_sarimax_pipeline

results = run_sarimax_pipeline(
    file_name="data1.csv",
    dt="00:10:00",
    dt_string="10min",
    int_pred="02:00:00",
    int_pred_string="2h",
    endo_drivers="Weekly",       # or "No"
    ex_drivers="data_weather.csv"  # or "No"
)

The input CSV (file_name) must contain the columns value, date, and time:

value,date,time
39.976242,2007-11-30,14:34:51
39.976243,2007-11-30,14:34:52
...

Optional exogenous drivers: the file passed as ex_drivers (data_weather.csv in the example above) must follow the same format.
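
As a quick way to check that a file matches this layout, the sketch below reads the three columns with the standard library and combines date and time into a single timestamp. It is an assumption about how such a file can be parsed, not the package's loader. The helper name read_series is hypothetical, and the in-memory sample reuses the two rows shown above.

```python
import csv
import io
from datetime import datetime


def read_series(handle):
    """Parse value,date,time rows into a time-sorted list of (timestamp, value)."""
    rows = []
    for rec in csv.DictReader(handle):
        ts = datetime.strptime(f"{rec['date']} {rec['time']}", "%Y-%m-%d %H:%M:%S")
        rows.append((ts, float(rec["value"])))
    rows.sort(key=lambda r: r[0])
    return rows


# In-memory stand-in for data1.csv, using the sample rows from this README.
sample = io.StringIO(
    "value,date,time\n"
    "39.976242,2007-11-30,14:34:51\n"
    "39.976243,2007-11-30,14:34:52\n"
)
rows = read_series(sample)
print(len(rows), rows[0][0].isoformat())
```

A missing column raises a KeyError and a malformed date raises a ValueError, which makes format problems visible before the pipeline is run.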

The model:

  • Bins and interpolates time series data
  • Splits into train/test based on int_pred
  • Optionally models weekly patterns and external influences
  • Searches for the best SARIMAX parameters by minimizing the AIC
  • Saves the prediction to a CSV file and a PNG plot (if plot=True)
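
The first two steps in the list above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the pipeline's implementation: it averages values into fixed-width bins of size dt, fills empty bins by linear interpolation, and holds out the last int_pred worth of bins as the test set. The function names bin_and_interpolate and split_train_test are hypothetical, and the input is assumed to be sorted by time.

```python
from datetime import datetime, timedelta


def bin_and_interpolate(rows, dt):
    """Average (timestamp, value) rows into bins of width dt; fill gaps linearly."""
    start = rows[0][0]
    n_bins = int((rows[-1][0] - start) / dt) + 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ts, value in rows:
        idx = int((ts - start) / dt)
        sums[idx] += value
        counts[idx] += 1
    binned = [s / c if c else None for s, c in zip(sums, counts)]
    # The first and last bins always hold data, so every gap has two neighbours.
    known = [i for i, v in enumerate(binned) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            binned[i] = binned[left] + frac * (binned[right] - binned[left])
    return binned


def split_train_test(series, dt, int_pred):
    """Hold out the final int_pred / dt bins as the test set."""
    n_test = int(int_pred / dt)
    if not 0 < n_test < len(series):
        raise ValueError("int_pred must cover at least one bin and leave training data")
    return series[:-n_test], series[-n_test:]


t0 = datetime(2007, 11, 30, 14, 0, 0)
rows = [(t0, 1.0), (t0 + timedelta(minutes=5), 3.0), (t0 + timedelta(minutes=30), 8.0)]
series = bin_and_interpolate(rows, dt=timedelta(minutes=10))
train, test = split_train_test(series, timedelta(minutes=10), timedelta(minutes=20))
print(len(train), len(test))
```

Here dt and int_pred are timedelta objects for simplicity, whereas the pipeline takes them as strings such as "00:10:00" and "02:00:00".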

3. Geospatial Distance

from forecasting.geo import haversine

dist = haversine(lon1=12.49, lat1=41.89, lon2=2.29, lat2=48.85)  # meters
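
For reference, a minimal sketch of the haversine formula follows. It mirrors the argument order of the call above but is not the package's code. The name haversine_sketch is hypothetical, and the mean Earth radius of 6,371,000 m is an assumption; the package may use a different constant.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in meters


def haversine_sketch(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Same coordinates as the example above.
dist = haversine_sketch(lon1=12.49, lat1=41.89, lon2=2.29, lat2=48.85)
print(round(dist / 1000), "km")
```

The formula treats the Earth as a sphere, so the result differs from an ellipsoidal geodesic distance by up to a few tenths of a percent.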

License

This project is licensed under the MIT License.


Author

Developed by Valeria D'Andrea. Refactored and modularized for packaging and reuse.