OpenDA-Association · nilsvanvelzen · Nov 29, 2018 · Nov 29, 2018 · Dec 21, 2018 · nilsvanvelzen
diff --git a/py_openda/README_ENKF.md b/py_openda/README_ENKF.md
@@ -0,0 +1,182 @@
+## Using the main program
+The main program can be found at `enkf_swan.py`. This program can run an ensemble kalman filter simulation based on a `.oda` file along with its configuration `.xml` files. This is done by calling the function `main(input_string, observation_location=0)`, which uses the absolute file path given by `input_string` for configuration and plots the results at the location given by `observation_location`. The function also produces a tuple containing two numpy arrays with results; one from a simulation with ensemble kalman filtering and one without any data assimilation. The results are taken between the prediction step and the update step.
+
+## Inserting your own model or observer class
+If you want to use a different model factory, stochastic observer or time class for the simulation, you have to change the relevant import statement in `GenericEnsembleKalmanFilter.py` to insert the class into the algorithm. The algorithm should have no problems with using a different class, assuming that your class has the same methods as the default one and assuming it is able to communicate well enough with the other classes that you are planning on using for the algorithm. The data type of the output of some of the methods that produce vectors are generic, meaning that as long they are either a Python list, a 1D numpy array, a java array or an OpenDA IVector, the `GenericEnsembleKalmanFilter` will convert the output to something usable. If your output uses a different data type, you must convert it yourself to one of those four options. This can either be done premptively at the end of your method, or you can include the data type in the `input_to_*` functions in `py4j_utils.py`. If your data type is a java class, be sure to use the IVector for an example on how to handle class recognition with py4j. For more information on the requirements of the methods, be sure to read the documentation of the default objects found in `JObjects.py`. 
+
+## The Time object
+TODO: Time Object kan echt 3 keer beter.
+
+## Using your own model factory/instance
+
+When implementing a model factory, you just need a class with the following two methods:
+```python
+def __init__(self, config, scriptdir):
+    """
+    :param config: dictionary used for configuration.
+    :param scriptdir: location of the main .oda file.
+    """
+
+def get_instance(self, noise_config, main_or_ens):
+    """
+    Create an instance of the stochastic Model.
+
+    :param noise_config: dictionary as given by EnkfAlgorithm.xml for
+    the noise configuration.
+    :param main_or_ens: determines the ouput level of the model.
+    :return: the stochastic Model instance.
+    """
+```
+
+* `__init__(self, config, scriptdir)` initializes the object. Here `config` is the stochModelFactory element of the main `.oda` configuration file as a dictionary. `scriptdir` is the directory containing the `.oda` file.
+
+TODO: main_or_ens even uitvogelen.
+
+* `get_instance(self, noise_config, main_or_ens)` returns an instance of the coresponding model. `nosie_config` is a python dictionary containing model noise configuration. It has the keys `@stochInit`, `@stochParameter` and `@stochForcing`. Note that this is the only point in the algorithm where you will call the `__init__` method of your model instance, so you are free to give it whatever input you need.
+
+A model instance requires the following methods:
+
+```python
+def get_time_horizon(self):
+    """
+    Get the computational time horizon of the model (begin and end time).
+
+    :return: the time horizon (containing begin and end time).
+    """
+
+def get_current_time(self):
+    """
+    Get the model instance's current simulation time stamp.
+
+    :return: The model's current simulation time stamp.
+    """
+
+def announce_observed_values(self, descriptions):
+    """
+    Tells model that it can expect to be asked for model values
+    corresponding to the observations
+    described. The model can make arrangement to save these values. The
+    method compute run over a long interval at once, 
+    not stopping at each time with observations.
+    This is meant to increase the performance especially of calibration
+    algorithms.
+
+    :param descriptions: an ObservationDescriptions object with meta
+    data for the observations.
+    :return:
+    """
+
+def compute(self, time):
+    """
+    Let the stochastic model instance compute to the requested target time stamp.
+    This function can not be used to go back in time.
+
+    :param time: Time to compute to.
+    :return:
+    """
+
+def get_observations(self, descriptions):
+    """
+    Get model values corresponding to the descriptions.
+
+    :param descriptions: An ObservationDescriptions object with meta
+    data for the observations
+    :return: python list with the model values corresponding to the descriptions
+    """
+
+def update_state(self, state_array, main_or_ens):
+    """
+    Update the state vector of the model.
+
+    :param state_array: numpy array used to update the model state.
+    :main_or_ens: "main" for updating the main model, "ens" for ensemble members.
+    :return:
+    """
+```
+
+* `get_time_horizon(self)` should return a time object spanning the start and end of the simulation.
+
+* `get_current_time(self)` returns the time up to which the state vector has been calculated.
+
+* `announce_observed_values(self, descriptions)` function which can be used to make sure that the model will save the observations described by the descriptions. This function does get called by the main algorithm, but it does not have to do anything if your implementation does not need it to.
+
+* `compute(self, time)` runs the model until the given time.
+
+* `get_observations(self, descriptions)` returns values of the state vector that correspond to the given observation descriptions. These descriptions are given by the stoch observer, so pay special attention when trying to use them.
+
+* `update_state(self, state_array, main_or_ens)` changes the state vector based on the input `state_array`. Note that the input is a little different depending on whether the model to be updated is an ensemble member or the main model. For an ensemblemember `state_array` should be added to the state vector, while for the main model it is supposed to replace the state vector. This is controlled by the `main_or_ens` argument, which is a string equal to either "main" or "ens".
+
+## Using your own stoch observer
+When creating your own stoch observer, the following methods are required:
+```python
+def __init__(self, config, scriptdir):
+    """
+    :param config: dictionary used for configuration.
+    :param scriptdir: location of the main .oda file.
+    """
+
+def create_selection(self, model_span):
+    """
+    Create a new observer containing a selection of the present observer
+    based on the given time span.
+
+    :param model_span: time span with selection.
+    :return: stochastic observer containing the required selection.
+    """
+
+def get_times(self):
+    """
+    Get all different times in increasing order. There is at least 
+    one observation for each time.
+
+    :return: some type of vector containing the times
+    """
+
+def get_count(self):
+    """
+    Total number of observations.
+
+    :return: the number of observations.
+    """
+
+def get_observation_descriptions(self):
+    """
+    Get the observation descriptions.
+
+    :return: observation descriptions which are compatible with the
+    used model instance
+    """
+
+def get_sqrt_covariance(self):
+    """
+    Get the covariance matrix for the stochastic observations.
+
+    :return: the covariance matrix as numpy array.
+    """
+
+def get_realizations(self):
+    """
+    Get realization values for all observations, for one ensemble member.
+
+    :return: the realizations.
+    """
+```
+
+TODO: Misschien leuk om init ook met alleen een xml file als input te doen?
+
+* `__init__(self, config=None, scriptdir=None)` initializes the object. Here `config` is the stochObserver element of the main `.oda` configuration file as a dictionary. `scriptdir` is the directory containing the `.oda` file.
+
+* `create_selection(self, model_span)` creates a new stoch observer, limited to the span given by the time object `model_span`. The class of the output should be the same as the class of `self`.
+
+* `get_times(self)` lists all the times where there are observations available. The output will automatically be converted to a python list of time objects.
+
+* `get_count(self)` simply returns an integer indicating the number of measurements available.
+
+TODO: java observation description maken!
+
+* `get_observation_descriptions(self)` returns the observation descriptions that the model instance can use to determine which elements of the state vector corresponds to quantities that are observed. Keep in mind that there is no utility routine for creating a java observation description from python, so for now you cannot combine a python observer with a java model.
+
+* `get_sqrt_covariance(self)` generates an np array representing the squared covariance matrix. Note that you first have to convert the output yourself. For this you can use the utility routine `j_vector_array_to_np_array(j_sqrt)` if you have a java vector array.
+
+* `get_realizations(self)` creates realizations of the observed values using the mean and the standarddeviation of the measurement. The output will be converted automatically.
+
diff --git a/py_openda/examples/py_kalman/README.txt b/py_openda/examples/py_kalman/README.txt
@@ -0,0 +1 @@
+Description of py_openda library goes here
diff --git a/py_openda/examples/py_kalman/enkf_swan.py b/py_openda/examples/py_kalman/enkf_swan.py
@@ -0,0 +1,104 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+"""
+Algorithm for data assimilation using an ensemble kalman filter. Uses an
+observer and a model factory from Java but the algorithm resides in Python.
+To run the algorithm remember to first start the server : oda_py4j.sh
+
+Created on Thu Nov  8 15:57:29 2018
+
+@author: hegeman
+"""
+
+
+import os
+from time import time as tm
+import numpy as np
+import matplotlib.pyplot as plt
+import xmlschema
+from py_openda.costFunctions.GenericEnsembleKalmanFilter import GenericEnsembleKalmanFilter
+from py_openda.algorithms.ensemble_kalman import kalman_algorithm, no_filter
+
+#TODO: Some movable parts in Java might still be hard coded in Python
+#TODO: Quite a few xml files might not follow the schema as well as xmlschema wants them to
+
+#input_string = '/v3/Stage/Rick/openda/openda_public/course/exercise_lorenz_3var_part1/simulation_ensemble.oda'
+input_string = '/v3/Stage/Rick/openda/openda_public/examples/model_swan/kalman_twin_windbound/enkf_wind_bound.oda'
+#input_string = '/v3/Stage/Rick/openda/openda_public/course/exercise_double_pendulum_part2/enkf.oda'
+#input_string = '/v3/Stage/Rick/openda/openda_public/core/tests/simple_oscillator/Enkf.oda'
+
+def main(input_string, observation_location=0):
+    """
+    Main function that runs an ensemble kalman filter as described by the .oda file in input_string.
+    yields the pred_f_central for a filtered and an unfiltered simulation and plots the
+    results at one observation location.
+
+    :param input_string: absolute file path of the .oda file.
+    :param observation_location: index of the observation location you want to plot (default 0).
+
+    :return results: numpy array containing the results for the filtered experiment.
+    :return no_results: numpy array containing the results for the unfiltered experiment.
+    """
+    #TODO: Create a py4j JavaGateway and make it a global variable using __builtin__
+    #This way you can use a callback server for multiple objects.
+
+    os.chdir(input_string.rsplit('/', 1)[0])
+
+    scriptdir = os.getcwd()
+
+    main_schema = xmlschema.XMLSchema('http://schemas.openda.org/openDaApplication.xsd')
+    alg_schema = xmlschema.XMLSchema('http://schemas.openda.org/algorithm/enkf.xsd')
+#    alg_schema = xmlschema.XMLSchema('http://schemas.openda.org/algorithm/sequentialEnsembleAlgorithm.xsd')
+    main_config = main_schema.decode(input_string.rsplit('/', 1)[1])
+
+
+
+    alg_config = alg_schema.decode('%s/%s' % (main_config.get('algorithm').get('workingDirectory'),
+                                              main_config.get('algorithm').get('configString')))
+
+# ensembleModel@stochParameter=false
+# ensembleModel@stochForcing=true
+# ensembleModel@stochInit=true
+
+
+#    n_ensemble = 3
+    n_ensemble = alg_config.get('ensembleSize')
+
+
+    main_class = GenericEnsembleKalmanFilter(n_ensemble, alg_config, main_config, scriptdir)
+    n_obs = main_class.get_n_observations()
+
+#    n_steps = main_class.get_n_times()-3
+    n_steps = 3
+    t = main_class.get_timeline()[:n_steps+1]
+    t = [(time - t[0])*24 for time in t]
+    results = np.zeros((n_steps+1, n_obs))
+    for j in range(n_steps):
+        print(j)
+        start = tm()
+        results[j, :] = kalman_algorithm(main_class)
+        end = tm()
+        print(end-start)
+
+    results[-1, :] = no_filter(main_class)
+    plt.plot(t, results[:, observation_location])
+
+    compare_class = GenericEnsembleKalmanFilter(1, alg_config, main_config, scriptdir)
+    no_results = np.zeros((n_steps+1, n_obs))
+    for j in range(n_steps+1):
+        print(j)
+        start = tm()
+        no_results[j, :] = no_filter(compare_class)
+        end = tm()
+        print(end-start)
+
+    plt.plot(t, no_results[:, observation_location])
+    plt.legend(("EnKF", "no_filter"))
+    plt.ylabel("x_f_central")
+    plt.xlabel("t in hours")
+    return(results, no_results)
+
+if __name__ == "__main__":
+    strt = tm()
+    (results, no_results) = main(input_string)
+    print(tm()-strt)
diff --git a/py_openda/examples/py_kalman/setup.py b/py_openda/examples/py_kalman/setup.py
@@ -0,0 +1,10 @@
+from setuptools import setup, find_packages
+with open("README.txt", 'r') as f:
+    long_description = f.read()
+
+setup(name = 'py_openda',
+      version = '0.1',
+      packages = find_packages(),
+      description = ['...'],
+      long_description = long_description,
+      install_requires = ['py4j', 'numpy', 'scipy', 'pandas', 'matplotlib', 'xmlschema'])
diff --git a/py_openda/py_openda/__init__.py b/py_openda/py_openda/__init__.py
diff --git a/py_openda/py_openda/algorithms/__init__.py b/py_openda/py_openda/algorithms/__init__.py