A production-grade machine learning system for weather prediction, built entirely in Rust. This project demonstrates the complete ML lifecycle, from data collection to model monitoring, using the Evcxr Jupyter kernel for interactive exploration.
Live Weather Predictions
Auto-updated daily at 06:00 UTC | Last run: Pending first deployment
24-Hour, 48-Hour & 72-Hour Forecast
| City | Country | Current | +24h | +48h | +72h | Rain % | Confidence |
| --- | --- | --- | --- | --- | --- | --- | --- |
| São Paulo | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| Rio de Janeiro | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| São José dos Campos | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| Campinas | 🇧🇷 | --°C | --°C | --°C | --°C | --% | -- |
| New York | 🇺🇸 | --°C | --°C | --°C | --°C | --% | -- |
| Los Angeles | 🇺🇸 | --°C | --°C | --°C | --°C | --% | -- |
| London | 🇬🇧 | --°C | --°C | --°C | --°C | --% | -- |
| Berlin | 🇩🇪 | --°C | --°C | --°C | --°C | --% | -- |
| Oslo | 🇳🇴 | --°C | --°C | --°C | --°C | --% | -- |
| Tokyo | 🇯🇵 | --°C | --°C | --°C | --°C | --% | -- |
| Shanghai | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Chongqing | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Nanjing | 🇨🇳 | --°C | --°C | --°C | --°C | --% | -- |
| Dubai | 🇦🇪 | --°C | --°C | --°C | --°C | --% | -- |
Model Performance (Last 7 Days)
| Metric | Rain Prediction | Condition | Temp 24h | Temp 48h | Temp 72h |
| --- | --- | --- | --- | --- | --- |
| Accuracy/RMSE | --% | --% | --°C | --°C | --°C |
| vs Baseline | -- | -- | -- | -- | -- |
Complete ML Pipeline : Data collection → Preprocessing → Feature Engineering → Model Training → Evaluation → Monitoring
Multiple ML Libraries : Side-by-side comparison of linfa, smartcore, rustyml, and Burn
10 Years of Data : Historical weather data from 2016-2025 for 14 cities worldwide
4 Prediction Tasks :
Rain prediction (binary classification)
Weather condition classification (6 classes)
Temperature forecasting (24h, 48h, 72h)
Multi-target forecasting (temp + humidity + wind)
Drift Detection : Automated monitoring for data and concept drift
Live Dashboard : Daily auto-updated predictions via GitHub Actions
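This README does not say how drift is measured. As one possible illustration, the sketch below computes a Population Stability Index (PSI) between a reference window and a recent window of a single feature. The function name `psi`, the equal-width binning, and the epsilon floor are assumptions made for this example, not the project's confirmed implementation. It uses only the standard library, so it can be pasted into an Evcxr cell as-is.

```rust
/// Population Stability Index between a reference sample and a recent sample,
/// using `bins` equal-width bins that span the reference range.
/// Hypothetical helper for illustration; not taken from the project source.
fn psi(reference: &[f64], recent: &[f64], bins: usize) -> f64 {
    assert!(bins > 0 && !reference.is_empty() && !recent.is_empty());

    let min = reference.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = reference.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let width = (max - min) / bins as f64;

    // Fraction of samples per bin; out-of-range values are clamped to the edge bins.
    let fractions = |data: &[f64]| -> Vec<f64> {
        let mut counts = vec![0usize; bins];
        for &x in data {
            let raw = if width > 0.0 { ((x - min) / width).floor() } else { 0.0 };
            let idx = (raw.max(0.0) as usize).min(bins - 1);
            counts[idx] += 1;
        }
        // A small floor keeps ln() finite when a bin is empty.
        counts
            .iter()
            .map(|&c| (c as f64 / data.len() as f64).max(1e-6))
            .collect()
    };

    let p = fractions(reference);
    let q = fractions(recent);
    p.iter().zip(&q).map(|(p, q)| (q - p) * (q / p).ln()).sum()
}
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift; those thresholds are a starting point only. PSI addresses data drift. Concept drift would additionally require tracking model error over time.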
API : Open-Meteo (free for non-commercial use, no API key required)
Historical Data : 2016-2025 (10 years)
Granularity : Hourly observations
Total Records : ~1.2 million
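To make the data source concrete, here is a minimal sketch of building a request URL for Open-Meteo's historical archive endpoint. The helper name `archive_url`, the choice of hourly variables, and the example coordinates are assumptions for illustration; the project's actual client in `src/data/` may differ.

```rust
/// Build a request URL for Open-Meteo's historical archive endpoint.
/// `start` and `end` are ISO dates (YYYY-MM-DD). The variable names passed in
/// `hourly` are assumptions about what this project collects.
fn archive_url(lat: f64, lon: f64, start: &str, end: &str, hourly: &[&str]) -> String {
    format!(
        "https://archive-api.open-meteo.com/v1/archive\
         ?latitude={lat}&longitude={lon}\
         &start_date={start}&end_date={end}\
         &hourly={}&timezone=UTC",
        hourly.join(",")
    )
}

// Example call (approximate coordinates for São Paulo):
// archive_url(-23.55, -46.63, "2016-01-01", "2016-12-31",
//             &["temperature_2m", "relative_humidity_2m", "precipitation", "wind_speed_10m"])
```

As a sanity check on the record count: if every hour from 2016 through 2025 is present, 14 cities × 3,653 days × 24 hours gives 1,227,408 rows, which matches the ~1.2 million figure.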
| Region | Cities |
| --- | --- |
| Brazil 🇧🇷 | São Paulo, Rio de Janeiro, São José dos Campos, Campinas |
| USA 🇺🇸 | New York, Los Angeles |
| Europe 🇬🇧🇩🇪🇳🇴 | London, Berlin, Oslo |
| Asia 🇯🇵🇨🇳🇦🇪 | Tokyo, Shanghai, Chongqing, Nanjing, Dubai |
Project Structure
RustForMachineLearning/
├── notebooks/                  # Jupyter notebooks (Evcxr)
│   ├── 01_data_collection_and_exploration.ipynb
│   ├── 02_preprocessing_and_feature_engineering.ipynb
│   ├── 03_feature_selection_and_model_training.ipynb
│   ├── 04_hyperparameter_tuning.ipynb
│   ├── 05_evaluation_and_validation.ipynb
│   └── 06_drift_detection_and_monitoring.ipynb
├── src/                        # Production Rust code
│   ├── lib.rs
│   ├── data/                   # Data loading & API client
│   ├── preprocessing/          # Data cleaning & transformation
│   ├── features/               # Feature engineering & selection
│   ├── models/                 # ML model implementations
│   ├── training/               # Training utilities
│   ├── evaluation/             # Metrics & visualization
│   ├── monitoring/             # Drift detection
│   └── bin/                    # CLI tools
├── data/                       # Data storage
│   ├── raw/                    # Raw API data
│   ├── processed/              # Cleaned data
│   └── features/               # Engineered features
├── models/                     # Trained model artifacts
├── docs/                       # Documentation
└── .github/workflows/          # GitHub Actions
Rust (1.70+)
Jupyter with Evcxr kernel
Git
# Clone the repository
git clone https://github.com/yourusername/RustForMachineLearning.git
cd RustForMachineLearning
# Build the project
cargo build --release
# Install Evcxr Jupyter kernel (if not already installed)
cargo install evcxr_jupyter
evcxr_jupyter --install
# Start Jupyter
jupyter lab
# Navigate to notebooks/ and open the notebooks in order
Running Daily Predictions
cargo run --release --bin daily_predictions
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    DATA     │───▶│   PREPROC   │───▶│  FEATURES   │
│ COLLECTION  │    │ & WRANGLING │    │ ENGINEERING │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       ▼                  ▼                  ▼
 Open-Meteo API     Missing values     Lag features
 14 cities          Outliers           Rolling stats
 10 years           Normalization      Cyclical encoding
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   FEATURE   │───▶│    MODEL    │───▶│  TRAINING   │
│  SELECTION  │    │ COMPARISON  │    │             │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       ▼                  ▼                  ▼
 Correlation        linfa              Train/Val/Test
 Importance         smartcore          Cross-validation
 Recursive elim     Burn               Early stopping
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   TUNING    │───▶│ EVALUATION  │───▶│ MONITORING  │
│             │    │             │    │  & DRIFT    │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       ▼                  ▼                  ▼
 Grid search        Accuracy/F1        Data drift
 Random search      RMSE/MAE           Concept drift
 Cross-val          ROC/AUC            Auto-retrain
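The feature engineering stage in the diagram names lag features, rolling statistics, and cyclical encoding. The sketch below shows one plausible shape for each using only the standard library. The function names and the choice of `Option<f64>` for not-yet-available values are assumptions for illustration, not the code in `src/features/`.

```rust
use std::f64::consts::TAU;

/// Cyclical encoding: map an hour of day onto the unit circle so that
/// 23:00 and 00:00 end up close together instead of 23 units apart.
fn encode_hour(hour: u32) -> (f64, f64) {
    let angle = TAU * (hour % 24) as f64 / 24.0;
    (angle.sin(), angle.cos())
}

/// Lag feature: the value observed `steps` rows earlier,
/// or None where the series has no history yet.
fn lag_feature(series: &[f64], steps: usize) -> Vec<Option<f64>> {
    (0..series.len())
        .map(|i| if i >= steps { Some(series[i - steps]) } else { None })
        .collect()
}

/// Rolling mean over the trailing `window` observations, current one included.
/// Rows with fewer than `window` observations available yield None.
fn rolling_mean(series: &[f64], window: usize) -> Vec<Option<f64>> {
    (0..series.len())
        .map(|i| {
            if window > 0 && i + 1 >= window {
                let slice = &series[i + 1 - window..=i];
                Some(slice.iter().sum::<f64>() / window as f64)
            } else {
                None
            }
        })
        .collect()
}
```

With hourly data, `lag_feature(&temps, 24)` would give the temperature at the same hour on the previous day. Both helpers look backwards only, so they do not leak future observations into the features.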
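| Model | Rain (Acc) | Rain (F1) | Condition (Acc) | Condition (F1) |
| --- | --- | --- | --- | --- |
| Logistic Regression | -- | -- | -- | -- |
| Decision Tree | -- | -- | -- | -- |
| Random Forest | -- | -- | -- | -- |
| Gradient Boosting | -- | -- | -- | -- |
| Neural Network | -- | -- | -- | -- |
| Ensemble | -- | -- | -- | -- |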
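For reference, the accuracy and F1 columns for the binary rain task can be computed as in the sketch below. The function name and the convention of returning an F1 of 0 when there are no positives are assumptions for this example. The README does not say how F1 is averaged for the 6-class condition task, so only the binary case is shown.

```rust
/// Accuracy and F1 for binary labels, where `true` means "rain".
/// Hypothetical helper for illustration; not taken from the project source.
fn accuracy_and_f1(actual: &[bool], predicted: &[bool]) -> (f64, f64) {
    assert_eq!(actual.len(), predicted.len());
    assert!(!actual.is_empty());

    let (mut tp, mut fp, mut false_neg, mut tn) = (0u32, 0u32, 0u32, 0u32);
    for (&a, &p) in actual.iter().zip(predicted) {
        match (a, p) {
            (true, true) => tp += 1,
            (false, true) => fp += 1,
            (true, false) => false_neg += 1,
            (false, false) => tn += 1,
        }
    }

    let accuracy = (tp + tn) as f64 / actual.len() as f64;
    // F1 = 2TP / (2TP + FP + FN); reported as 0 when the denominator is 0.
    let denom = 2 * tp + fp + false_neg;
    let f1 = if denom == 0 { 0.0 } else { (2 * tp) as f64 / denom as f64 };
    (accuracy, f1)
}
```

Reporting both numbers matters when rain hours are a minority: a model that never predicts rain can score high accuracy while its F1 is 0.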
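| Model | Temp 24h (RMSE, °C) | Temp 48h (RMSE, °C) | Temp 72h (RMSE, °C) |
| --- | --- | --- | --- |
| Linear Regression | -- | -- | -- |
| Decision Tree | -- | -- | -- |
| Random Forest | -- | -- | -- |
| Gradient Boosting | -- | -- | -- |
| Neural Network | -- | -- | -- |
| Ensemble | -- | -- | -- |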
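The RMSE columns, and the "vs Baseline" row in the performance table, can be read against a simple reference forecast. The README does not define its baseline; the sketch below assumes a persistence baseline (the temperature `horizon` hours ahead equals the current temperature), which is a common choice but an assumption here. Function names are hypothetical.

```rust
/// Root mean squared error between two equally long, non-empty slices.
fn rmse(actual: &[f64], predicted: &[f64]) -> f64 {
    assert_eq!(actual.len(), predicted.len());
    assert!(!actual.is_empty());
    let mse = actual
        .iter()
        .zip(predicted)
        .map(|(a, p)| (a - p).powi(2))
        .sum::<f64>()
        / actual.len() as f64;
    mse.sqrt()
}

/// RMSE of a persistence baseline on an hourly series: the forecast for
/// time t + horizon is simply the observation at time t.
fn persistence_rmse(series: &[f64], horizon: usize) -> f64 {
    assert!(horizon > 0 && horizon < series.len());
    rmse(&series[horizon..], &series[..series.len() - horizon])
}
```

For example, `persistence_rmse(&hourly_temps, 24)` gives the error a model would need to beat at the 24-hour horizon; a model RMSE above that value adds nothing over repeating yesterday's reading.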
polars : DataFrame operations
ndarray : N-dimensional arrays
reqwest : HTTP client
chrono : Date/time handling
plotters : Visualization
statrs : Statistical functions
This project is licensed under the MIT License - see the LICENSE file for details.