big progress toward #76 (#179)
@@ -0,0 +1,10 @@
# Code Usage and Experiments

| Notebook | What's in it |
| --- | --- |
| [Basic Tutorial](https://github.com/florisvb/PyNumDiff/blob/master/notebooks/1_basic_tutorial.ipynb) | Demo to show invocations of all the major methods in this library on 1D data. |
| [Optimizing Hyperparameters](https://github.com/florisvb/PyNumDiff/blob/master/notebooks/2_optimizing_hyperparameters.ipynb) | All methods' answers are affected by choices of hyperparameters, which complicates differentiation if the true derivative is not known. Here we briefly cover metrics we'd like to optimize and show how to use our `optimize` function to find good hyperparameter choices. |
| [Automatic Method Suggestion](https://github.com/florisvb/PyNumDiff/blob/master/notebooks/3_automatic_method_suggestion.ipynb) | A short demo of how to allow `pynumdiff` to choose a differentiation method for your data. |
| [Performance Analysis](https://github.com/florisvb/PyNumDiff/blob/master/notebooks/4_performance_analyssi.ipynb) | Experiments to compare methods' accuracy and bias across simulations. |
| [Robustness to Outliers Demo](https://github.com/florisvb/PyNumDiff/blob/master/notebooks/5_robust_outliers_demo.ipynb) | A head-to-head comparison of `RTSDiff`'s and `RobustDiff`'s minimum-RMSE performance on simulations with outliers, to illustrate the value of using a Huber loss in the Kalman MAP problem. |
| [Multidimensionality Demo](https://github.com/florisvb/PyNumDiff/blob/master/notebooks/6_multidimensionality_demo.ipynb) | Demonstration of differentiating multidimensional data along particular axes. |
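Not part of the diff, but for orientation, here is a minimal sketch of the kind of 1D call the Basic Tutorial covers, written against the `finitediff` signature added later in this PR. The import path is assumed from the test file's `..finite_difference` relative imports, and the parameter values are arbitrary.

```python
import numpy as np
from pynumdiff.finite_difference import finitediff  # import path assumed from the tests' relative imports

t = np.arange(0, 10, 0.01)
x_noisy = np.sin(t) + np.random.normal(0, 0.05, t.size)

# each method returns (x_hat, dxdt_hat); per the docstring, num_iterations > 1 smooths by
# repeatedly differentiating and re-integrating the estimate
x_hat, dxdt_hat = finitediff(x_noisy, 0.01, num_iterations=5, order=2)
print(np.max(np.abs(dxdt_hat - np.cos(t))))  # rough error against the true derivative
```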
@@ -5,7 +5,7 @@
from warnings import warn


def finitediff(x, dt, num_iterations=1, order=2):
def finitediff(x, dt, num_iterations=1, order=2, axis=0):
    """Perform iterated finite difference of a given order. This serves as the common backing function for
    all other methods in this module.

@@ -14,20 +14,22 @@ def finitediff(x, dt, num_iterations=1, order=2):
    :param int num_iterations: number of iterations. If >1, the derivative is integrated with trapezoidal
        rule, that result is finite-differenced again, and the cycle is repeated num_iterations-1 times
    :param int order: 1, 2, or 4, controls which finite differencing scheme to employ
    :param int axis: data dimension along which differentiation is performed

    :return: - **x_hat** (np.array) -- original x if :code:`num_iterations=1`, else smoothed x that yielded dxdt_hat
             - **dxdt_hat** (np.array) -- estimated derivative of x
    """
    if num_iterations < 1: raise ValueError("num_iterations must be >0")
    if order not in [1, 2, 4]: raise ValueError("order must be 1, 2, or 4")

    x = np.moveaxis(x, axis, 0)  # move the axis of interest to the front to simplify differencing indexing
    x_hat = np.asarray(x)  # allows for array-like. Preserve reference to x, for finding the final constant of integration
    dxdt_hat = np.zeros(x.shape)  # preallocate reusable memory
    # For all but the last iteration, do the differentiate->integrate smoothing loop, being careful with endpoints
    for i in range(num_iterations-1):
        if order == 1:
            dxdt_hat[:-1] = np.diff(x_hat)
            dxdt_hat[:-1] = np.diff(x_hat, axis=0)
            dxdt_hat[-1] = dxdt_hat[-2]  # using stencil -1,0 vs stencil 0,1 you get an expression for the same value
        elif order == 2:
            dxdt_hat[1:-1] = (x_hat[2:] - x_hat[:-2])/2  # second-order center-difference formula

@@ -40,13 +42,13 @@ def finitediff(x, dt, num_iterations=1, order=2):
            dxdt_hat[0] = x_hat[1] - x_hat[0]
            dxdt_hat[-1] = x_hat[-1] - x_hat[-2]  # use first-order endpoint formulas so as not to amplify noise. See #104

        x_hat = utility.integrate_dxdt_hat(dxdt_hat, 1)  # estimate new x_hat by integrating derivative
        x_hat = utility.integrate_dxdt_hat(dxdt_hat, 1, axis=0)  # estimate new x_hat by integrating derivative
> **Author comment:** This is where I realized a couple of the utilities would need to be extended too. It took a bit of thinking and required care.
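The utility module itself is not part of this diff, so the following is only a rough sketch (not the PR's code) of how `integrate_dxdt_hat` might accept an `axis` argument, inferred from the way the updated tests below call it; the SciPy-based body and the single-sample edge-case handling are assumptions.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

def integrate_dxdt_hat(dxdt_hat, dt_or_t, axis=0):
    """Cumulatively integrate a derivative estimate along `axis` with the trapezoidal rule,
    starting from zero. `dt_or_t` is either a scalar step size or an array of sample times."""
    dxdt_hat = np.asarray(dxdt_hat)
    if dxdt_hat.shape[axis] < 2:  # single-sample edge case: nothing to integrate
        return np.zeros_like(dxdt_hat)
    if np.ndim(dt_or_t) == 0:  # constant step size
        return cumulative_trapezoid(dxdt_hat, dx=dt_or_t, axis=axis, initial=0)
    return cumulative_trapezoid(dxdt_hat, x=np.asarray(dt_or_t), axis=axis, initial=0)
```

With `initial=0` the result keeps the input's length along `axis`, which is what the updated `len(x_hat) == len(dxdt)` assertion in the tests below checks.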
        # We can skip dividing by dt here and pass dt=1, because the integration multiplies dt back in.
        # No need to find integration constant until the very end, because we just differentiate again.
        # Note that I also tried integrating with Simpson's rule here, and it seems to do worse. See #104

    if order == 1:
        dxdt_hat[:-1] = np.diff(x_hat)
        dxdt_hat[:-1] = np.diff(x_hat, axis=0)
        dxdt_hat[-1] = dxdt_hat[-2]  # using stencil -1,0 vs stencil 0,1 you get an expression for the same value
    elif order == 2:
        dxdt_hat[1:-1] = x_hat[2:] - x_hat[:-2]  # second-order center-difference formula

@@ -63,9 +65,9 @@ def finitediff(x, dt, num_iterations=1, order=2):
    dxdt_hat /= dt  # don't forget to scale by dt, can't skip it this time

    if num_iterations > 1:  # We've lost a constant of integration in the above
        x_hat += utility.estimate_integration_constant(x, x_hat)
        x_hat += utility.estimate_integration_constant(x, x_hat, axis=0)

    return x_hat, dxdt_hat
    return np.moveaxis(x_hat, 0, axis), np.moveaxis(dxdt_hat, 0, axis)  # reorder axes back to original


def first_order(x, dt, params=None, options={}, num_iterations=1):
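Again not part of the diff: a short sketch of what the new `axis` argument enables, mirroring the 2D test added below (the grid and function are copied from that test, and the import path is assumed as above).

```python
import numpy as np
from pynumdiff.finite_difference import finitediff  # import path assumed

T1, T2 = np.meshgrid(np.linspace(-1, 1, 101), np.linspace(-1, 1, 101))
x = T1**2 * np.sin(3/2 * np.pi * T2)

# differentiate the same 2D field along each grid direction; each call returns (x_hat, dxdt_hat)
_, dx_axis0 = finitediff(x, 0.02, order=2, axis=0)
_, dx_axis1 = finitediff(x, 0.02, order=2, axis=1)
print(dx_axis0.shape, dx_axis1.shape)  # both (101, 101): the output keeps the input's shape
```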
@@ -2,7 +2,7 @@
from pytest import mark
from warnings import warn

from ..finite_difference import first_order, second_order, fourth_order
from ..finite_difference import finitediff, first_order, second_order, fourth_order
from ..linear_model import lineardiff
from ..basis_fit import spectraldiff, rbfdiff
from ..polynomial_fit import polydiff, savgoldiff, splinediff

@@ -235,25 +235,16 @@ def spline_irreg_step(*args, **kwargs): return splinediff(*args, **kwargs)
@mark.parametrize("diff_method_and_params", diff_methods_and_params)  # things like splinediff, with their parameters
@mark.parametrize("test_func_and_deriv", test_funcs_and_derivs)  # analytic functions, with their true derivatives
def test_diff_method(diff_method_and_params, test_func_and_deriv, request):  # request gives access to context
    """Ensure differentiation methods find accurate derivatives"""
    # unpack
    diff_method, params = diff_method_and_params[:2]
    if len(diff_method_and_params) == 3: options = diff_method_and_params[2]  # optionally pass old-style `options` dict
    i, latex_name, f, df = test_func_and_deriv

    # some methods rely on cvxpy, and we'd like to allow use of pynumdiff without convex optimization
    if diff_method in [lineardiff, velocity, acceleration, jerk, smooth_acceleration, robustdiff]:
        try: import cvxpy
        except: warn(f"Cannot import cvxpy, skipping {diff_method} test."); return

    # sample the true function and true derivative, and make noisy samples
    if diff_method in [spline_irreg_step, rbfdiff, rtsdiff]:  # list that can handle variable dt
        x = f(t_irreg)
        dxdt = df(t_irreg)
        _t = t_irreg
    else:
        x = f(t)
        dxdt = df(t)
        _t = dt
    x = f(t) if diff_method not in [spline_irreg_step, rbfdiff, rtsdiff] else f(t_irreg)
> **Author comment:** I decided I like shorter code more than I like efficiency. I find the ternaries more readable, because the primary thing happening here is just assigning a few values.
    dxdt = df(t) if diff_method not in [spline_irreg_step, rbfdiff, rtsdiff] else df(t_irreg)
    _t = dt if diff_method not in [spline_irreg_step, rbfdiff, rtsdiff] else t_irreg
    x_noisy = x + noise

    # differentiate without and with noise, accounting for new and old styles of calling functions

@@ -305,3 +296,69 @@ def test_diff_method(diff_method_and_params, test_func_and_deriv, request):
    # methods that get super duper close can converge to different very small limits on different runs
    if 1e-18 < l2_error < 10**(log_l2_bound - 1) or 1e-18 < linf_error < 10**(log_linf_bound - 1):
        print(f"Improvement detected for method {diff_method.__name__}")

T1, T2 = np.meshgrid(np.linspace(-1, 1, 101), np.linspace(-1, 1, 101))  # a 101 x 101 grid
dt2 = 0.02  # distance between samples in the 2D T grids
x = T1**2 * np.sin(3/2 * np.pi * T2)  # 2D function

# When one day all or most methods support multidimensionality, and the legacy way of calling methods is
# gone, diff_methods_and_params can be used for the multidimensionality test as well
multidim_methods_and_params = [(finitediff, {})]

# Similar to the error_bounds table, index by method first. But then we test against only one 2D function,
# and only in the absence of noise, since the other test covers that. Instead, because multidimensional
# derivatives can be combined in interesting fashions, we find d^2 / dt_1 dt_2 and the Laplacian,
# d^2/dt_1^2 + d^2/dt_2^2. Tuples are again (L2,Linf) distances.
multidim_error_bounds = {
> **Author comment:** Adding more methods here will be very easy.
    finitediff: [(0, -1), (1, -1)]
}

@mark.parametrize("multidim_method_and_params", multidim_methods_and_params)
def test_multidimensionality(multidim_method_and_params, request):
    """Ensure methods with an axis parameter can successfully differentiate in independent directions"""
    diff_method, params = multidim_method_and_params

    # d^2 / dt_1 dt_2
    analytic_d2 = 3 * T1 * np.pi * np.cos(3/2 * np.pi * T2)
    dxdt1 = diff_method(x, dt2, **params, axis=0)[1]
    computed_d2 = diff_method(dxdt1, dt2, **params, axis=1)[1]
    l2_error_d2 = np.linalg.norm(analytic_d2 - computed_d2)  # Frobenius norm (2 norm of vectorized array)
    linf_error_d2 = np.max(np.abs(analytic_d2 - computed_d2))

    # Laplacian
    analytic_laplacian = 2 * np.sin(3/2 * np.pi * T2) - 9/4 * np.pi**2 * T1**2 * np.sin(3/2 * np.pi * T2)
    dxdt2 = diff_method(x, dt2, **params, axis=1)[1]
    computed_laplacian = diff_method(dxdt1, dt2, **params, axis=0)[1] + diff_method(dxdt2, dt2, **params, axis=1)[1]
    l2_error_lap = np.linalg.norm(analytic_laplacian - computed_laplacian)
    linf_error_lap = np.max(np.abs(analytic_laplacian - computed_laplacian))

    if request.config.getoption("--bounds"):
        print([(int(np.ceil(np.log10(l2_error_d2))), int(np.ceil(np.log10(linf_error_d2)))), (int(np.ceil(np.log10(l2_error_lap))), int(np.ceil(np.log10(linf_error_lap))))])
    else:
        (log_l2_bound_d2, log_linf_bound_d2), (log_l2_bound_lap, log_linf_bound_lap) = multidim_error_bounds[diff_method]
        assert l2_error_d2 < 10**log_l2_bound_d2
        assert linf_error_d2 < 10**log_linf_bound_d2

    if request.config.getoption("--plot"):
        from matplotlib import pyplot
        fig = pyplot.figure(figsize=(12, 5), constrained_layout=True)
        ax1 = fig.add_subplot(1, 3, 1, projection='3d')
        ax1.plot_surface(T1, T2, x, cmap='viridis', alpha=0.5)
        ax1.set_title(r'original function, $x$')
        ax1.set_xlabel(r'$t_1$')
        ax1.set_ylabel(r'$t_2$')
        ax2 = fig.add_subplot(1, 3, 2, projection='3d')
        ax2.plot_surface(T1, T2, analytic_d2, cmap='viridis', alpha=0.5)
        ax2.set_title(r'$\frac{\partial^2 x}{\partial t_1 \partial t_2}$')
        ax2.set_xlabel(r'$t_1$')
        ax2.set_ylabel(r'$t_2$')
        ax3 = fig.add_subplot(1, 3, 3, projection='3d')
        surf = ax3.plot_surface(T1, T2, analytic_laplacian, cmap='viridis', alpha=0.5, label='analytic')
        ax3.set_title(r'$\frac{\partial^2}{\partial t_1^2} + \frac{\partial^2}{\partial t_2^2}$')
        ax3.set_xlabel(r'$t_1$')
        ax3.set_ylabel(r'$t_2$')

        ax2.plot_wireframe(T1, T2, computed_d2)
        ax3.plot_wireframe(T1, T2, computed_laplacian, label='computed')
        legend = ax3.legend(bbox_to_anchor=(0.7, 0.8)); legend.legend_handles[0].set_facecolor(pyplot.cm.viridis(0.6))
@@ -11,19 +11,24 @@ def test_integrate_dxdt_hat():
    dt = 0.1
    for dxdt,expected in [(np.ones(10), np.arange(0, 1, dt)),  # constant derivative
                          (np.linspace(0, 1, 11), [0, 0.005, 0.02, 0.045, 0.08, 0.125, 0.18, 0.245, 0.32, 0.405, 0.5]),  # linear derivative
                          (np.array([1.0]), [0])]:  # edge case of just one entry
        x_hat = utility.integrate_dxdt_hat(dxdt, dt)
        assert np.allclose(x_hat, expected)
                          (np.array([1.0]), [0]),  # edge case of just one entry
                          (np.ones((5,5)), dt*np.vstack([np.arange(5)]*5))]:  # multidimensional case
        x_hat = utility.integrate_dxdt_hat(dxdt, dt, axis=-1)
        assert len(x_hat) == len(dxdt)
        assert np.allclose(x_hat, expected)
    x_hat = utility.integrate_dxdt_hat([0, 2, 3, 5], [0, 2, 3, 5])  # y = x at irregular spacing
    assert np.allclose(x_hat, [0, 2, 4.5, 12.5])
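The expected values in the irregular-spacing check follow from the trapezoid areas of the three segments of y = x sampled at t = [0, 2, 3, 5]:

$$\frac{0+2}{2}\cdot 2 = 2, \qquad \frac{2+3}{2}\cdot 1 = 2.5, \qquad \frac{3+5}{2}\cdot 2 = 8,$$

so the cumulative integral starting from zero is [0, 2, 4.5, 12.5], matching the assertion above.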

def test_estimate_integration_constant():
    """For known simple functions, make sure the initial condition is as expected"""
    for x,x_hat,c in [(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), np.array([0.0, 1.0, 2.0, 3.0, 4.0]), 1),  # Perfect alignment case, xhat shifted by 1
                      (np.ones(5)*10, np.ones(5)*5, 5),
                      (np.array([0]), np.array([1]), -1)]:
        x0 = utility.estimate_integration_constant(x, x_hat)
        assert np.allclose(x0, float(c), rtol=1e-3)
    for x,x_hat,c in [(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), np.array([0.0, 1.0, 2.0, 3.0, 4.0]), 1),  # pure offset
                      (np.ones(5)*10, np.ones(5)*5 + 0.01*np.random.randn(5), 5),  # with some noise
                      (np.array([0]), np.array([1]), -1),  # singleton case
                      (np.vstack([np.arange(5)]*5), np.vstack([np.arange(5) + c for c in range(5)]), -np.arange(5).reshape(-1,1)),  # multidimensional case
                      (np.ones((7,5)), np.vstack([np.arange(5) + c for c in range(7)]), -np.arange(1,8).reshape(-1,1))]:  # nonsquare case
        x0 = utility.estimate_integration_constant(x, x_hat, axis=-1)
        assert np.allclose(x0, c, rtol=1e-3)

    x_hat = np.sin(np.arange(400)*0.01)
    x = x_hat + np.random.normal(0, 0.1, 400) + 1  # shift data by 1
@@ -61,19 +66,19 @@ def test_peakdet(request):
        pyplot.title('peakdet validataion')
        pyplot.show()

    assert np.allclose(maxtab, [[0.447, 1.58575613],  # these numbers validated by eye with --plot
                                [1.818, 1.91349239],
                                [3.316,-0.02740252],
                                [4.976, 0.74512778],
                                [6.338, 1.89861691],
                                [7.765, 0.57577842],
                                [9.402, 0.59450898]])
    assert np.allclose(mintab, [[1.139, 0.31325728],
                                [2.752,-1.12769567],
                                [4.098,-2.00326946],
                                [5.507,-0.31714122],
                                [7.211,-0.59708324],
                                [8.612,-1.7118216 ]])
> **Author comment:** It's not ideal that when I change this file, these numbers can change, just because I call on the random number generator a different number of times earlier in the file.

    assert np.allclose(maxtab, [[0.475, 1.58696894],  # these numbers validated by eye with --plot
                                [1.813, 1.91418201],
                                [3.311, -0.02749755],
                                [4.971, 0.74687989],
                                [6.333, 1.89776084],
                                [7.76, 0.57366611],
                                [9.397, 0.59379866]])
    assert np.allclose(mintab, [[1.134, 0.31086976],
                                [2.747, -1.13032479],
                                [4.093, -2.00466846],
                                [5.502, -0.31428495],
                                [7.206, -0.5993835],
                                [8.607, -1.71266074]])

def test_slide_function():
    """Verify the slide function's weighting scheme calculates as expected"""
> **Review comment:** I toyed with making special tuples of slices to index here, but it's just easier to assume we're differentiating along the first dimension. Then essentially nothing in the complicated indexing needs to change. The overhead of moving an axis is negligible, because `np.moveaxis` returns a view on the same data, so nothing gets copied.
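A quick illustration of that claim (standard NumPy behavior, not code from this PR): `np.moveaxis` returns a view, and moving the target axis to the front, differencing along axis 0, and moving it back agrees with differencing along the original axis directly.

```python
import numpy as np

x = np.random.randn(101, 101)

x_front = np.moveaxis(x, 1, 0)        # bring axis 1 to the front
assert np.shares_memory(x, x_front)   # it's a view: no data is copied

d_front = np.diff(x_front, axis=0)                 # difference along the (moved) front axis
d_back = np.moveaxis(d_front, 0, 1)                # move the axis back to its original position
assert np.array_equal(d_back, np.diff(x, axis=1))  # same as differencing along the original axis
```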