For this project, we use 10 years of hourly historical weather data in Europe on a rectangular 10x10 grid. During data processing, missing values are interpolated from nearby stations. The following map shows all data point coordinates together with a contour plot of the historic temperature data for a day in June 2012:
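One simple way to interpolate a missing grid value from nearby stations is inverse-distance weighting. The sketch below is illustrative only (the names `idw_fill`, `power`, and the flat-coordinate distance are assumptions, not the repository's actual interpolation routine):

```python
import numpy as np

def idw_fill(values, coords, missing_mask, power=2):
    """Fill missing readings by inverse-distance weighting from known stations.

    values: 1D array of station readings
    coords: (N, 2) array of station coordinates
    missing_mask: boolean array, True where a value must be interpolated
    """
    known = ~missing_mask
    filled = values.copy()
    for i in np.where(missing_mask)[0]:
        # distances from the missing point to every known station
        d = np.linalg.norm(coords[known] - coords[i], axis=1)
        # closer stations get larger weights; guard against zero distance
        w = 1.0 / np.maximum(d, 1e-6) ** power
        filled[i] = np.sum(w * values[known]) / np.sum(w)
    return filled
```

For a point exactly between two known stations, this yields their mean.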
The complete set of inputs to the model comprises:
- Temperature
- Dew point
- Relative humidity
- Wind speed
- Wind direction
- Ground level pressure
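These six variables can be stacked as channels of the grid, giving one array of shape `(hours, 10, 10, 6)` that the model consumes. A minimal sketch, assuming each variable is already available as an `(hours, 10, 10)` array (the placeholder arrays here are hypothetical):

```python
import numpy as np

hours = 24  # placeholder; the real dataset spans 10 years of hourly data

# hypothetical per-variable arrays, each of shape (hours, 10, 10)
temperature    = np.zeros((hours, 10, 10))
dew_point      = np.zeros((hours, 10, 10))
humidity       = np.zeros((hours, 10, 10))
wind_speed     = np.zeros((hours, 10, 10))
wind_direction = np.zeros((hours, 10, 10))
pressure       = np.zeros((hours, 10, 10))

# stack along a trailing channel axis -> (hours, 10, 10, 6)
grid = np.stack(
    [temperature, dew_point, humidity, wind_speed, wind_direction, pressure],
    axis=-1,
)
```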
For 10 years of data and a sliding window length of 144 hours, we obtain a dataset with roughly
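The windowing itself is straightforward: each 144-hour window splits into 72 hours of input and 72 hours of target, and with a stride of one hour the number of windows is `hours - 144 + 1`. A sketch under those assumptions (stride 1 is an assumption, not stated in the text):

```python
import numpy as np

def sliding_windows(grid, past=72, future=72):
    """Yield (input, target) pairs from an (hours, H, W, C) array, stride 1."""
    window = past + future
    n = grid.shape[0] - window + 1
    for start in range(n):
        x = grid[start : start + past]            # model input
        y = grid[start + past : start + window]   # forecast target
        yield x, y

hours = 10 * 365 * 24          # roughly ten years of hourly data
n_windows = hours - 144 + 1    # stride-1 windows of length 144
```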
The model receives the last 72 hours of weather at all stations and forecasts the next 72 hours over the whole map. The neural network layout is defined as follows:
- a time-distributed spatial encoder consisting of locally connected and max pooling layers
- an LSTM recurrent layer for the temporal dynamics
- a decoder consisting of a dense layer followed by a transposed convolution and a locally connected layer
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_model(grid_size, channels, features, past, future):
    """Creates the TensorFlow model.

    Params
    ======
    grid_size: length of the 2D input grid
    channels: number of input data channels
    features: features represented in the output of the model
    past: hours of input data
    future: hours of forecast
    """
    grid_past = layers.Input((past, grid_size, grid_size, channels), name="grid_past")
    grid_now = layers.Input((grid_size, grid_size, channels), name="grid_now")
    # encoder for the grid state:
    grid_encoder = layers.TimeDistributed(layers.LocallyConnected2D(channels, (3, 3)), name="local2D_1")(grid_past)
    grid_encoder = layers.TimeDistributed(layers.MaxPooling2D(), name="max_pooling_1")(grid_encoder)
    grid_encoder = layers.TimeDistributed(layers.LocallyConnected2D(channels, (3, 3)), name="local2D_2")(grid_encoder)
    grid_encoder = layers.TimeDistributed(layers.MaxPooling2D(), name="max_pooling_2")(grid_encoder)
    grid_encoder = layers.TimeDistributed(layers.Flatten())(grid_encoder)
    # recurrent network for the temporal dynamics:
    time_encoder = layers.LSTM(64, name="recurrent", return_sequences=True)(grid_encoder)
    time_encoder = layers.TimeDistributed(layers.Dense(32))(time_encoder)
    time_encoder = layers.Flatten()(time_encoder)
    # merge with the current weather state:
    merge_layer_grid = layers.Concatenate(name="concat")([time_encoder, layers.Flatten()(grid_now)])
    # decoder:
    output_grid = layers.Dense(future * (grid_size - 2) * (grid_size - 2) * features)(merge_layer_grid)
    output_grid = layers.Reshape((future, grid_size - 2, grid_size - 2, features))(output_grid)
    output_grid = layers.TimeDistributed(layers.Conv2DTranspose(features, 4), name="upscaler")(output_grid)
    output_grid = layers.TimeDistributed(layers.LocallyConnected2D(features, (2, 2)), name="local_out")(output_grid)
    return keras.Model([grid_past, grid_now], [output_grid])
```
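To see how the spatial dimensions flow through the model, the arithmetic can be traced without instantiating any layers. The sketch below assumes 'valid' padding for the locally connected layers, 2x2 max pooling, and a stride-1 transposed convolution with a 4x4 kernel, matching the defaults in the code above:

```python
def encoder_spatial_dim(grid_size):
    """Spatial size after the two (LocallyConnected2D 3x3 -> MaxPooling2D) stages."""
    s = grid_size - 2      # 3x3 locally connected, 'valid' padding: 10 -> 8
    s = s // 2             # 2x2 max pooling: 8 -> 4
    s = s - 2              # second locally connected: 4 -> 2
    s = s // 2             # second pooling: 2 -> 1
    return s

def decoder_spatial_dim(grid_size):
    """Spatial size after Reshape -> Conv2DTranspose(kernel 4) -> LocallyConnected2D 2x2."""
    s = grid_size - 2      # dense output is reshaped to (grid_size-2)^2: 8
    s = (s - 1) * 1 + 4    # transposed conv, 4x4 kernel, stride 1: 8 -> 11
    s = s - 2 + 1          # 2x2 locally connected, 'valid': 11 -> 10
    return s
```

For `grid_size=10`, the encoder compresses each time step to a single spatial cell per channel, and the decoder recovers the full 10x10 map.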
Apart from the definition of the NN model, we define additional physics loss functions, reminiscent of heat conduction
and of the Euler equation without the pressure-gradient term,
to train the model alongside the mean squared error.
Since inhomogeneous constants appear in both of these losses,
we apply the differential operators to both the predicted and the actual data and penalize the difference.
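One way to sidestep an unknown constant is to apply the same differential operator to prediction and target and compare the results directly, so the constant never enters the loss. A minimal numpy sketch for the heat-conduction case (the function names and the 5-point stencil are assumptions, not the repository's actual loss code):

```python
import numpy as np

def laplacian(field):
    """Discrete 2D Laplacian (5-point stencil) on the interior of an (H, W) field."""
    return (
        field[:-2, 1:-1] + field[2:, 1:-1]
        + field[1:-1, :-2] + field[1:-1, 2:]
        - 4.0 * field[1:-1, 1:-1]
    )

def heat_loss(pred, true):
    """Penalize the mismatch of the Laplacians of prediction and target.

    Because the same operator is applied to both fields, the unknown
    diffusion constant of the heat equation never appears in the loss.
    """
    return np.mean((laplacian(pred) - laplacian(true)) ** 2)
```

A linear temperature field has zero Laplacian, so such fields contribute nothing to this term regardless of slope.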
We observe good agreement between the forecast and the actual future weather data across all spatial regions.
The model captures a variety of weather trends over several days.

The complete map output of the network also shows good agreement, capturing regional developments over several days.
Deviations appear mostly in the absolute values. Drawn on the world map, we obtain the video shown above.

To run the code provided in this repository, you must have Python 3.6 or higher installed. In addition, you will need:
- a running Jupyter notebook server
- the meteostat Python API to retrieve weather data
- pandas, numpy and scipy for data processing
- tensorflow for machine learning
- matplotlib, cartopy and pillow for visualizations

