performance improvements for `deconv_rl.py`

Hi Martin,

I was looking at the FFT-based implementation of RL-deconvolution in `deconv_rl.py` and noticed a few things.

* The FFT plan is pre-calculated but not actually passed to the FFT functions, which results in some overhead.
* For deconvolutions that are performed repeatedly with the same PSF on data of the same size, a lot of setup code is run again on every call.
* There is a lot of code duplication between the two functions that take NumPy arrays and the ones that take OpenCL arrays.
* I do not understand why `hflip = h[::-1, ::-1]` is needed. I'm also not sure whether it is correct: I assume that for the 3D case this would have to be `hflip = h[::-1, ::-1, ::-1]`. Maybe you can explain?
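
To make the last point concrete, here is a small NumPy-only check (my own sketch, not code from gputools; the array shapes are arbitrary stand-ins). It shows that on a 3D array `h[::-1, ::-1]` reverses only the first two axes, whereas `h[::-1, ::-1, ::-1]` reverses all three. The RL update is usually written with the mirrored PSF (a correlation rather than a convolution), which may be the reason for the flip, so the 1D case at the end checks that correlation equals convolution with the reversed kernel.

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.random((3, 4, 5))  # arbitrary stand-in for a 3D PSF

flip_two = h[::-1, ::-1]        # reverses axes 0 and 1 only
flip_all = h[::-1, ::-1, ::-1]  # reverses every axis

# Compare the slicing expressions with the explicit np.flip equivalents.
same_as_flip01 = np.array_equal(flip_two, np.flip(h, axis=(0, 1)))
same_as_full_flip = np.array_equal(flip_all, np.flip(h))
two_equals_all = np.array_equal(flip_two, flip_all)

# 1D illustration: correlation equals convolution with the reversed kernel.
a = rng.random(16)
k = rng.random(5)
corr_matches = np.allclose(
    np.correlate(a, k, mode="full"),
    np.convolve(a, k[::-1], mode="full"),
)

print(same_as_flip01, same_as_full_flip, two_equals_all, corr_matches)
```

If the PSF happens to be symmetric along the last axis, the two expressions give the same result, which could hide the difference in practice.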

To address the first two points, I have rewritten your code and tested the changes. The rewritten code is here: https://github.com/VolkerH/Lattice_Lightsheet_Deskew_Deconv/blob/benchmarking/lls_dd/deconv_gputools_rewrite.py
I wasn't sure whether, and if so how, you would like to integrate this approach of setting up the deconvolution first into gputools; otherwise I would have made the changes there and opened a pull request.
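
To illustrate what I mean by setting up the deconvolution first, here is a minimal CPU-only sketch of the pattern using plain NumPy FFTs. It is not the code from the linked file and does not use the gputools API; the class name `PrecomputedRL` and its parameters are made up for illustration. The PSF is padded and transformed once in the constructor, and every later call reuses the stored transforms. It assumes non-negative data, a PSF no larger than the data along each axis, and circular boundary conditions.

```python
import numpy as np


class PrecomputedRL:
    """Hypothetical helper: Richardson-Lucy with a PSF transform computed once."""

    def __init__(self, psf, shape):
        self.shape = tuple(shape)
        psf = np.asarray(psf, dtype=np.float64)
        psf = psf / psf.sum()  # normalise so the total flux is preserved

        # Zero-pad the PSF to the data shape ...
        padded = np.zeros(self.shape, dtype=np.float64)
        padded[tuple(slice(0, s) for s in psf.shape)] = psf
        # ... and shift its centre to index 0 along every axis.
        for axis, size in enumerate(psf.shape):
            padded = np.roll(padded, -(size // 2), axis=axis)

        # One-off setup: transform of the PSF and of its mirrored version.
        # For a real PSF, the transform of the (circularly) flipped PSF
        # is the complex conjugate of the PSF transform.
        self.otf = np.fft.rfftn(padded)
        self.otf_flipped = np.conj(self.otf)

    def _filter(self, x, otf):
        # Circular convolution via the real-input FFT.
        return np.fft.irfftn(np.fft.rfftn(x) * otf, s=self.shape)

    def __call__(self, data, n_iter=10):
        data = np.asarray(data, dtype=np.float64)
        estimate = np.full(self.shape, data.mean(), dtype=np.float64)
        for _ in range(n_iter):
            blurred = self._filter(estimate, self.otf)
            ratio = data / np.maximum(blurred, 1e-12)  # avoid division by zero
            estimate *= self._filter(ratio, self.otf_flipped)
        return estimate
```

Usage would be `decon = PrecomputedRL(psf, volume.shape)` once, followed by `decon(volume)` for every volume of that shape. On the GPU the same structure would presumably also hold the FFT plan and the pre-allocated buffers.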

I have benchmarked the rewritten code against the current implementation in gputools and against flowdec: https://github.com/VolkerH/Lattice_Lightsheet_Deskew_Deconv/issues/21. Note that the iteration times do not measure deconvolution alone; they also include I/O and affine transforms, which add plenty of overhead. Without this overhead, the speed improvements are even more significant.

