big progress toward #76 #179
Conversation
```diff
- x = f(t)
- dxdt = df(t)
- _t = dt
+ x = f(t) if diff_method not in [spline_irreg_step, rbfdiff, rtsdiff] else f(t_irreg)
```
I decided I like shorter code more than I like efficiency here. I find the ternaries more readable, because the primary thing this code is doing is assigning a few things.
```diff
-     [5.507,-0.31714122],
-     [7.211,-0.59708324],
-     [8.612,-1.7118216 ]])
+ assert np.allclose(maxtab, [[0.475, 1.58696894], # these numbers validated by eye with --plot
```
It's not ideal that these numbers can change whenever I change this file, just because a test that happens to sort alphabetically earlier calls np.random first. But meh, I just copy-pasted the numbers, because I stubbornly don't want to set the random seed multiple times.
```diff
- def integrate_dxdt_hat(dxdt_hat, dt_or_t):
+ def integrate_dxdt_hat(dxdt_hat, dt_or_t, axis=0):
```
There is now another argument, but it defaults to a value that works in the 1D case, so all existing caller code can stay agnostic to it.
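For illustration, roughly what that contract looks like (the `pynumdiff.utils.utility` import path here is assumed from the diff header): 1D callers pass nothing new, while multidimensional callers can say which axis is time.

```python
import numpy as np
from pynumdiff.utils import utility  # import path assumed from the diff header

dxdt_hat_1d = np.ones(5)                                      # one signal
x_1d = utility.integrate_dxdt_hat(dxdt_hat_1d, 0.1)           # existing 1D call, unchanged

dxdt_hat_2d = np.ones((5, 3))                                 # three signals, time along axis 0
x_2d = utility.integrate_dxdt_hat(dxdt_hat_2d, 0.1, axis=0)   # multidimensional callers pass axis
```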
| """ | ||
| return cumulative_trapezoid(dxdt_hat, initial=0)*dt_or_t if np.isscalar(dt_or_t) \ | ||
| else cumulative_trapezoid(dxdt_hat, x=dt_or_t, initial=0) | ||
| return cumulative_trapezoid(dxdt_hat, initial=0, axis=axis)*dt_or_t if np.isscalar(dt_or_t) \ |
scipy is kind enough to give us an axis parameter as well. How courteous of them. We should be courteous too; hence #76 and this PR.
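A minimal sketch of the scipy call being wrapped: `cumulative_trapezoid` handles both a scalar timestep (integrate with unit spacing, then scale) and an explicit time vector, along any axis.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

dxdt = np.ones((4, 3))     # derivative samples, time along axis 0
dt = 0.5
# scalar timestep: integrate with unit spacing, then scale by dt
x_a = cumulative_trapezoid(dxdt, initial=0, axis=0) * dt
# explicit (possibly irregular) time vector: pass it as x
t = np.array([0.0, 0.5, 1.0, 1.5])
x_b = cumulative_trapezoid(dxdt, x=t, initial=0, axis=0)
assert np.allclose(x_a, x_b)  # equal here because t has uniform spacing dt
```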
```diff
  # and only in the absence of noise, since the other test covers that. Instead, because multidimensional
  # derivatives can be combined in interesting fashions, we find d^2 / dt_1 dt_2 and the Laplacian,
  # d^2/dt_1^2 + d^2/dt_2^2. Tuples are again (L2,Linf) distances.
+ multidim_error_bounds = {
```
Adding more methods here will be very easy.
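For context, a hypothetical entry (names and numbers invented for illustration, not from the PR) showing the shape such a table plausibly has: each method maps to (L2, Linf) bound tuples for the two combined derivatives described in the comment above.

```python
# hypothetical values for illustration only
multidim_error_bounds = {
    'finitediff': {'d2/dt1dt2': (1e-2, 5e-2),   # (L2, Linf) bounds on the cross derivative
                   'laplacian': (1e-2, 5e-2)},  # and on d^2/dt_1^2 + d^2/dt_2^2
}
```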
```diff
- sigma = median_abs_deviation(x - x_hat, scale='normal') # M is in units of this robust scatter metric
- if M == float('inf') or sigma < 1e-3: # If no scatter, then no outliers, so use L2
-     return np.mean(x - x_hat) # Solves the l2 distance minimization, argmin_{x0} ||x_hat + x0 - x||_2^2
+ s = list(x_hat.shape); s[axis] = 1; s = tuple(s) # proper shape for multidimensional integration constants
```
This shape tuple is the magic that allows this thing to stay compact.
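Concretely, here's the trick in isolation: collapsing the differentiation axis to length 1 gives a shape that broadcasts back against `x_hat`.

```python
import numpy as np

x_hat = np.zeros((50, 3))                          # three signals, time along axis 0
axis = 0
s = list(x_hat.shape); s[axis] = 1; s = tuple(s)   # (1, 3)
c = np.arange(3.0).reshape(s)                      # one integration constant per signal
x_hat += c                                         # broadcasts along the time axis
assert np.allclose(x_hat[7], [0.0, 1.0, 2.0])      # every time sample got its column's constant
```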
pynumdiff/utils/utility.py (Outdated)
```diff
- if M == float('inf') or sigma < 1e-3: # If no scatter, then no outliers, so use L2
-     return np.mean(x - x_hat) # Solves the l2 distance minimization, argmin_{x0} ||x_hat + x0 - x||_2^2
  s = list(x_hat.shape); s[axis] = 1; s = tuple(s) # proper shape for multidimensional integration constants
+ sigma = median_abs_deviation(x - x_hat, axis=axis, scale='normal') # M is in units of this robust scatter metric
```
Again they give us an axis parameter. So nice.
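For instance, `scipy.stats.median_abs_deviation` with `axis=0` returns one robust scatter estimate per column; a quick demonstration:

```python
import numpy as np
from scipy.stats import median_abs_deviation

rng = np.random.default_rng(0)
resid = rng.normal(scale=[0.1, 1.0], size=(1000, 2))          # two columns, different scatter
sigma = median_abs_deviation(resid, axis=0, scale='normal')   # one sigma per column
assert sigma.shape == (2,) and sigma[0] < sigma[1]
```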
```diff
  s = list(x_hat.shape); s[axis] = 1; s = tuple(s) # proper shape for multidimensional integration constants
  sigma = median_abs_deviation(x - x_hat, axis=axis, scale='normal') # M is in units of this robust scatter metric
  if M == float('inf') or np.all(sigma < 1e-3): # If no scatter, then no outliers, so use L2
+     return np.mean(x - x_hat, axis=axis).reshape(s) # Solves the l2 distance minimization, argmin_c ||x_hat + c - x||_2^2
```
Importantly, the return value has to be reshaped to s so the result broadcasts correctly when it is added (+=) to the multidimensional x_hat along the right dimension.
```diff
  else:
-     return minimize(lambda x0: np.sum(huber(x - (x_hat+x0), M*sigma)), # fn to minimize in 1st argument
-                     0, method='SLSQP').x[0] # result is a vector, even if initial guess is just a scalar
+     return minimize(lambda c: np.sum(huber(x_hat + c.reshape(s) - x, M*sigma)), # fn to minimize in 1st argument
```
This call is a little odd, because the input to the function being minimized has to be a flat vector, so we end up reshaping c a bunch. Thankfully, reshapes are essentially free, and huber works naturally with multidimensional data, so long as sigma is also reshaped appropriately.
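A runnable sketch of the pattern with stand-in data; it uses `scipy.special.huber`, whose documented argument order is `huber(delta, r)`, so the call below follows that order, while the helper in the diff may order its arguments differently.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import huber   # scipy's argument order is huber(delta, r)

rng = np.random.default_rng(1)
x = rng.normal(scale=0.1, size=(50, 2)) + [2.0, -1.0]   # data with per-column offsets
x_hat = rng.normal(scale=0.1, size=(50, 2))             # estimate missing those offsets
M, axis = 1.0, 0
s = list(x_hat.shape); s[axis] = 1; s = tuple(s)        # (1, 2)
sigma = np.ones(s)                                      # stand-in for the robust scatter estimate
res = minimize(lambda c: np.sum(huber(M * sigma, x_hat + c.reshape(s) - x)),
               np.zeros(np.prod(s)), method='SLSQP')    # the minimizer only sees a flat vector
c_opt = res.x.reshape(s)                                # reshape back so x_hat += c_opt broadcasts
```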
```diff
  if num_iterations < 1: raise ValueError("num_iterations must be >0")
  if order not in [1, 2, 4]: raise ValueError("order must be 1, 2, or 4")

+ x = np.moveaxis(x, axis, 0) # move the axis of interest to the front to simplify differencing indexing
```
I toyed with making special tuples of slices to index here, but it's just easier to assume we're differentiating along the first dimension; then essentially nothing in the complicated indexing needs to change. The overhead of moving an axis is negligible, because np.moveaxis returns a view on the same data, so nothing gets copied.
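The view behavior is easy to confirm:

```python
import numpy as np

x = np.zeros((3, 4, 5))
y = np.moveaxis(x, 2, 0)     # bring the axis of interest to the front
assert y.shape == (5, 3, 4)
assert y.base is x           # a view on the same buffer: nothing copied
dy = y[1:] - y[:-1]          # front-axis differencing now works for any dimensionality
```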
```diff
  for i in range(num_iterations-1):
      if order == 1:
-         dxdt_hat[:-1] = np.diff(x_hat)
+         dxdt_hat[:-1] = np.diff(x_hat, axis=0)
```
np.diff by default works along the last axis, not the first, so we now have to specify axis=0.
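A two-line reminder of the default:

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
print(np.diff(a).shape)          # (2, 2): default axis=-1 differences within rows
print(np.diff(a, axis=0).shape)  # (1, 3): differences along the first (time) axis
```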
```diff
  dxdt_hat[-1] = x_hat[-1] - x_hat[-2] # use first-order endpoint formulas so as not to amplify noise. See #104

- x_hat = utility.integrate_dxdt_hat(dxdt_hat, 1) # estimate new x_hat by integrating derivative
+ x_hat = utility.integrate_dxdt_hat(dxdt_hat, 1, axis=0) # estimate new x_hat by integrating derivative
```
This is where I realized a couple of the utilities would need to be extended too. It took a bit of thinking and required care.
finitediff can now do multidimensional data in the kind of low-level, efficient way I wanted. It required enhancements to some utilities code, which in turn required updating some tests. I've also created a new test of the overall method, as well as a notebook to demo that it's doing the right thing. I renamed the notebooks folder, since it has contained experiments as well as examples for a while, and since the number of notebooks has grown, that folder now has its own README to help guide users.