Is your feature request related to a problem? Please describe.
I'd love for a GPU accelerated implementation of PHATE as many of my colleagues arn't able to scale it well and have it run in a reasonable amount of time.
Describe the solution you'd like
I'd like a CuPy implementation of PHATE (something like rsc.tl.phate) that can be ran out of core via dask and scale across multiple GPUs on a node in an HPC.
Is there a CPU based implementation
Implementation here: PHATE docs.
If this is in scope for rapids-singlecell, I'd love to contribute this feature.
Is your feature request related to a problem? Please describe.
I'd love for a GPU accelerated implementation of PHATE as many of my colleagues arn't able to scale it well and have it run in a reasonable amount of time.
Describe the solution you'd like
I'd like a CuPy implementation of PHATE (something like
rsc.tl.phate) that can be ran out of core via dask and scale across multiple GPUs on a node in an HPC.Is there a CPU based implementation
Implementation here: PHATE docs.
If this is in scope for rapids-singlecell, I'd love to contribute this feature.