ml_baselines

A machine learning library for the estimation of greenhouse gas baseline timeseries from high-frequency observations.

Running the Code

Variable Setup

To run the code, first create the required dataset using baseline_setup.py. This script collects the relevant meteorology, concentration data and baseline flags. The meteorological data were taken from the ECMWF ERA5 reanalysis, and the concentration data from AGAGE.

This step can be run for all sites and compounds by running setup_all.py.
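To make the shape of the collected dataset concrete, here is a minimal sketch of gathering meteorology, concentration and baseline flags into one table keyed by timestamp. It is not the code in baseline_setup.py: the function name `build_dataset`, the field names, the sample values and the choice of an inner join on time are all assumptions made for illustration.

```python
from datetime import datetime

# Made-up hourly records keyed by timestamp. In the real workflow these would
# come from ERA5 meteorology, AGAGE concentrations and the baseline flags.
meteorology = {
    datetime(2020, 1, 1, 0): {"u10": 3.2, "v10": -1.1},
    datetime(2020, 1, 1, 1): {"u10": 2.9, "v10": -0.8},
    datetime(2020, 1, 1, 2): {"u10": 3.5, "v10": -1.4},
}
concentration = {
    datetime(2020, 1, 1, 0): 410.2,
    datetime(2020, 1, 1, 1): 412.7,
}
baseline_flag = {
    datetime(2020, 1, 1, 0): 1,
    datetime(2020, 1, 1, 1): 0,
    datetime(2020, 1, 1, 2): 1,
}


def build_dataset(meteorology, concentration, baseline_flag):
    """Hypothetical helper: keep only timestamps present in all three sources."""
    shared = sorted(set(meteorology) & set(concentration) & set(baseline_flag))
    rows = []
    for t in shared:
        # Start from the meteorological fields, then attach the other columns.
        row = {"time": t, **meteorology[t]}
        row["concentration"] = concentration[t]
        row["baseline"] = baseline_flag[t]
        rows.append(row)
    return rows


dataset = build_dataset(meteorology, concentration, baseline_flag)
for row in dataset:
    print(row)
```

The 02:00 record is dropped here because it has no concentration measurement. Whether the actual setup script drops, interpolates or otherwise handles such gaps is not stated in this README.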

Model Training

The models are trained using the dataset described above. One model is trained per site and per algorithm (a neural network MLP and a random forest). The final models are saved and can be found in the final models folder. Summary statistics are also available.

This step can be run for all models by running train_all.py.
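The "one model per site per algorithm" layout amounts to a grid of training jobs. The sketch below enumerates such a grid. It does not reproduce train_all.py: the site codes, the algorithm labels, the function `training_plan` and the output file names are hypothetical placeholders.

```python
from itertools import product

# Illustrative values only; the repository's own site list may differ.
SITES = ["MHD", "CGO"]
ALGORITHMS = ["mlp", "rf"]  # neural network MLP and random forest


def training_plan(sites, algorithms):
    """Hypothetical helper: one job per (site, algorithm) pair."""
    return [
        {"site": site, "algorithm": algo, "output": f"{algo}_{site}.model"}
        for site, algo in product(sites, algorithms)
    ]


plan = training_plan(SITES, ALGORITHMS)
for job in plan:
    print(job["site"], job["algorithm"], job["output"])
```

With two sites and two algorithms the plan holds four jobs. Each job would then be handed to whatever training routine is in use.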

Model Evaluation

The models are tested through quantitative and qualitative evaluation. A chosen subsample of trace species was evaluated, as defined in the configuration file. The model outcomes for this species subsample are saved for each site (e.g. neural network results at Mace Head, Ireland).

This step can be run for all sites and compounds by running eval_all.py.
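As one example of what a quantitative evaluation could look like, the sketch below compares predicted baseline flags against reference flags using precision, recall and F1. This README does not say which metrics the repository reports, so the metric choice, the function name `classification_scores` and the sample flags are assumptions.

```python
def classification_scores(reference, predicted):
    """Precision, recall and F1 for binary flags (1 = baseline, 0 = not baseline)."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # correctly flagged baseline
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # wrongly flagged baseline
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed baseline

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up flags for eight observations.
reference = [1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

scores = classification_scores(reference, predicted)
print(scores)
```

In this toy case three baseline points are recovered, one is missed and one non-baseline point is wrongly flagged, so precision and recall are both 0.75.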