Fusing PTv3 point serialization into custom CUDA kernel

Hello,
thanks for sharing this great work!

I wanted to link this here since LitePT focuses on efficiency.

Original issue: https://github.com/Pointcept/Pointcept/issues/578
CUDA implementation: https://github.com/ChristianSchott/point_serialization_cuda

Using the base PTv3 model, switching the serialization step to CUDA gives a speedup of about 22%. On LitePT, the gains are slightly smaller since not all downsampling steps involve serialization, but LitePT-S still sees an improvement of around 15%.
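For readers unfamiliar with the serialization step, here is a minimal pure-Python sketch of the idea, assuming a plain z-order (Morton) curve over non-negative integer grid coordinates. It is neither the CUDA kernel linked above nor Pointcept's code; the function names (`part1by2`, `morton3d`, `serialize`), the 16-bit default, and the bit ordering are illustrative assumptions only.

```python
from typing import List, Sequence, Tuple


def part1by2(v: int, bits: int = 16) -> int:
    """Spread the low `bits` bits of v so that bit i moves to position 3*i."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out


def morton3d(x: int, y: int, z: int, bits: int = 16) -> int:
    """Interleave the bits of x, y, z into one z-order code (assumed bit order: x, y, z)."""
    return (part1by2(x, bits) << 2) | (part1by2(y, bits) << 1) | part1by2(z, bits)


def serialize(
    grid_coords: Sequence[Tuple[int, int, int]], bits: int = 16
) -> Tuple[List[int], List[int]]:
    """Return the per-point codes and the permutation that sorts points along the curve."""
    codes = [morton3d(x, y, z, bits) for x, y, z in grid_coords]
    order = sorted(range(len(codes)), key=codes.__getitem__)
    return codes, order


if __name__ == "__main__":
    codes, order = serialize([(1, 1, 1), (0, 0, 0), (0, 0, 1)])
    print(codes, order)
```

Each point's code depends only on its own coordinates, so the encoding is independent per point, which is the kind of work that lends itself to being fused into a single GPU kernel.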