Conversation

@pavelkomarov (Collaborator)

finitediff can now handle multidimensional data in the kind of low-level, efficient way I wanted. It required enhancements to some utilities code, which in turn required updating some tests. I've also created a new test of the overall method, as well as a notebook to demonstrate that it's doing the right thing. I renamed the notebooks file, since it has contained experiments as well as examples for a while. Since the number of notebooks has grown, that folder now has its own readme to help guide users.

x = f(t)
dxdt = df(t)
_t = dt
x = f(t) if diff_method not in [spline_irreg_step, rbfdiff, rtsdiff] else f(t_irreg)
@pavelkomarov (Collaborator, Author) commented on Dec 3, 2025:

I decided I like shorter code more than I like efficiency. I find ternaries just more readable, because the primary thing I'm doing is assigning a few things.

[5.507,-0.31714122],
[7.211,-0.59708324],
[8.612,-1.7118216 ]])
assert np.allclose(maxtab, [[0.475, 1.58696894], # these numbers validated by eye with --plot
@pavelkomarov (Collaborator, Author):

It's not ideal that these numbers can change whenever I change this file, just because np.random gets called in a test that happens to be alphabetically prior and runs sooner. But meh, I just copy-pasted the numbers because I stubbornly don't want to set the random seed multiple times.
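For reference, the declined alternative would be to pin the RNG state inside each affected test, roughly like this (the test name and data here are hypothetical, not from this file):

import numpy as np

def test_peakdet_values():  # hypothetical test, for illustration only
    np.random.seed(0)  # pin the global RNG so earlier-running tests can't shift the draws below
    signal = np.sin(np.linspace(0, 10, 500)) + 0.1*np.random.randn(500)
    assert signal.shape == (500,)  # any hard-coded expected numbers would then stay stable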



def integrate_dxdt_hat(dxdt_hat, dt_or_t):
def integrate_dxdt_hat(dxdt_hat, dt_or_t, axis=0):
@pavelkomarov (Collaborator, Author) commented on Dec 3, 2025:

There is now another argument, but it defaults to something that will work in the 1D case so all existing caller code can be agnostic to it.
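A standalone sketch of that compatibility argument (this mirrors the signature in the diff, showing only the uniform-step branch; it is not the library's exact body):

import numpy as np
from scipy.integrate import cumulative_trapezoid

def integrate_dxdt_hat(dxdt_hat, dt_or_t, axis=0):  # new axis arg defaults to the 1D-friendly value
    return cumulative_trapezoid(dxdt_hat, initial=0, axis=axis) * dt_or_t  # uniform-step branch only

x_1d = integrate_dxdt_hat(np.ones(5), 0.1)               # existing 1D call sites need no change
x_2d = integrate_dxdt_hat(np.ones((5, 3)), 0.1, axis=0)  # new multidimensional usage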

"""
return cumulative_trapezoid(dxdt_hat, initial=0)*dt_or_t if np.isscalar(dt_or_t) \
else cumulative_trapezoid(dxdt_hat, x=dt_or_t, initial=0)
return cumulative_trapezoid(dxdt_hat, initial=0, axis=axis)*dt_or_t if np.isscalar(dt_or_t) \
@pavelkomarov (Collaborator, Author):

scipy is kind enough to give us an axis as well. How courteous of them. We should be courteous too. Hence #76 and this PR.
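In isolation, the axis parameter does exactly what you'd hope (a minimal check):

import numpy as np
from scipy.integrate import cumulative_trapezoid

y = np.ones((4, 3))                                        # four time samples of three signals
out = cumulative_trapezoid(y, dx=0.5, initial=0, axis=0)   # integrate down the time axis
print(out[:, 0])                                           # [0.  0.5 1.  1.5], same in every column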

# and only in the absence of noise, since the other test covers that. Instead, because multidimensional
# derivatives can be combined in interesting fashions, we find d^2 / dt_1 dt_2 and the Laplacian,
# d^2/dt_1^2 + d^2/dt_2^2. Tuples are again (L2,Linf) distances.
multidim_error_bounds = {
@pavelkomarov (Collaborator, Author):

Adding more methods here will be very easy.
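For a sense of scale, a new entry would be one line, something like this purely hypothetical example (the key, numbers, and value layout are made up; in the real dict the tuples are the (L2, Linf) bounds described above):

multidim_error_bounds = {
    # 'some_new_method': [(0.05, 0.1), (0.05, 0.1)],  # hypothetical (L2, Linf) bounds per quantity
}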

sigma = median_abs_deviation(x - x_hat, scale='normal') # M is in units of this robust scatter metric
if M == float('inf') or sigma < 1e-3: # If no scatter, then no outliers, so use L2
return np.mean(x - x_hat) # Solves the l2 distance minimization, argmin_{x0} ||x_hat + x0 - x||_2^2
s = list(x_hat.shape); s[axis] = 1; s = tuple(s) # proper shape for multidimensional integration constants
@pavelkomarov (Collaborator, Author):

This shape tuple is the magic that allows this thing to stay compact.
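What that one-liner produces for each choice of axis (a toy sketch):

import numpy as np

x_hat = np.zeros((4, 3, 2))
for axis in range(x_hat.ndim):
    s = list(x_hat.shape); s[axis] = 1; s = tuple(s)  # collapse only the differentiation axis
    print(axis, s)  # 0 -> (1, 3, 2)   1 -> (4, 1, 2)   2 -> (4, 3, 1)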

if M == float('inf') or sigma < 1e-3: # If no scatter, then no outliers, so use L2
return np.mean(x - x_hat) # Solves the l2 distance minimization, argmin_{x0} ||x_hat + x0 - x||_2^2
s = list(x_hat.shape); s[axis] = 1; s = tuple(s) # proper shape for multidimensional integration constants
sigma = median_abs_deviation(x - x_hat, axis=axis, scale='normal') # M is in units of this robust scatter metric
@pavelkomarov (Collaborator, Author):

Again they give us an axis parameter. So nice.
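A quick check of that behavior (toy data, shapes only):

import numpy as np
from scipy.stats import median_abs_deviation

r = np.random.randn(100, 3)                               # residuals for three signals along axis 0
sigma = median_abs_deviation(r, axis=0, scale='normal')   # one robust scatter estimate per signal
print(sigma.shape)                                        # (3,)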

s = list(x_hat.shape); s[axis] = 1; s = tuple(s) # proper shape for multidimensional integration constants
sigma = median_abs_deviation(x - x_hat, axis=axis, scale='normal') # M is in units of this robust scatter metric
if M == float('inf') or np.all(sigma < 1e-3): # If no scatter, then no outliers, so use L2
return np.mean(x - x_hat, axis=axis).reshape(s) # Solves the l2 distance minimization, argmin_c ||x_hat + c - x||_2^2
@pavelkomarov (Collaborator, Author) commented on Dec 3, 2025:

Importantly, the return value has to be reshaped to s so the result can be +=ed to the multidimensional x_hat along the correct dimension.
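A toy sketch of why the reshape matters when the differentiation axis is not the last one:

import numpy as np

x_hat = np.zeros((4, 3)); x = np.ones((4, 3))
axis = 1                                     # pretend axis 1 is the time axis here
c = np.mean(x - x_hat, axis=axis)            # shape (4,): one constant per signal
# x_hat += c                                 # would raise: (4,) does not broadcast against (4, 3)
x_hat += c.reshape(4, 1)                     # reshaped to s, it lines up with the right dimension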

else:
return minimize(lambda x0: np.sum(huber(x - (x_hat+x0), M*sigma)), # fn to minimize in 1st argument
0, method='SLSQP').x[0] # result is a vector, even if initial guess is just a scalar
return minimize(lambda c: np.sum(huber(x_hat + c.reshape(s) - x, M*sigma)), # fn to minimize in 1st argument
@pavelkomarov (Collaborator, Author) commented on Dec 3, 2025:

This call is a little odd, because the input to the function being minimized has to be a vector, so we end up having to reshape c a bunch. Thankfully, reshapes are essentially free, and huber works naturally with multidimensional data, so long as sigma is also reshaped appropriately.
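The pattern in isolation, with a plain squared-error objective instead of huber so the sketch stays self-contained:

import numpy as np
from scipy.optimize import minimize

x = np.random.randn(50, 3); x_hat = x - np.array([1.0, -2.0, 0.5])  # known per-signal offsets
s = (1, 3)                                                          # one constant per signal

def objective(c):                       # minimize hands us a flat vector of length 3
    return np.sum((x_hat + c.reshape(s) - x)**2)

res = minimize(objective, np.zeros(3), method='SLSQP')
print(res.x.reshape(s))                 # roughly [[ 1.  -2.   0.5]]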

if num_iterations < 1: raise ValueError("num_iterations must be >0")
if order not in [1, 2, 4]: raise ValueError("order must be 1, 2, or 4")

x = np.moveaxis(x, axis, 0) # move the axis of interest to the front to simplify differencing indexing
@pavelkomarov (Collaborator, Author):

I toyed with making special tuples of slices to index here, but it's just easier to assume we're differentiating along the first dimension. Then essentially nothing in the complicated indexing needs to change. Overhead to move an axis is negligible, because it returns a view on the same data, so nothing gets copied.
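A quick confirmation that the move is free (no copy):

import numpy as np

a = np.arange(24).reshape(2, 3, 4)
b = np.moveaxis(a, 2, 0)              # bring the differentiation axis to the front
print(b.shape)                        # (4, 2, 3)
print(np.shares_memory(a, b))         # True: just a view, nothing copied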

for i in range(num_iterations-1):
if order == 1:
dxdt_hat[:-1] = np.diff(x_hat)
dxdt_hat[:-1] = np.diff(x_hat, axis=0)
@pavelkomarov (Collaborator, Author):

diff by default wants to work along the last axis, not the first, so we now have to specify.
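The default is easy to trip over, so for the record:

import numpy as np

x_hat = np.arange(12.).reshape(4, 3)
print(np.diff(x_hat).shape)           # (4, 2): default axis=-1 differences across the signals
print(np.diff(x_hat, axis=0).shape)   # (3, 3): differences along the (moved-to-front) time axis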

dxdt_hat[-1] = x_hat[-1] - x_hat[-2] # use first-order endpoint formulas so as not to amplify noise. See #104

x_hat = utility.integrate_dxdt_hat(dxdt_hat, 1) # estimate new x_hat by integrating derivative
x_hat = utility.integrate_dxdt_hat(dxdt_hat, 1, axis=0) # estimate new x_hat by integrating derivative
@pavelkomarov (Collaborator, Author) commented on Dec 3, 2025:

This is where I realized a couple of the utilities would need to be extended too. It took a bit of thinking and required care.

@pavelkomarov merged commit 28c186c into master on Dec 3, 2025
2 checks passed
@pavelkomarov deleted the multidimensional branch on December 3, 2025 at 06:41