FlashRT is a high-performance realtime inference engine for small-batch, latency-sensitive AI workloads. The flagship integration is production VLA control for Pi0, Pi0.5, GR00T N1.6, and Pi0-FAST. It also supports LLMs, e.g., Qwen3.6-27B.
cuda pi thor cuda-kernels wan vla jetson motus qwen gr00t wan22-5b realtime-inference pi05 jetson-thor qwen3-6 gr00t-n1-6-3b realtime-vla qwen3-6-27b
Updated May 23, 2026 - C++