Cross-platform installer for Triton and SageAttention on ComfyUI. Simplifies GPU-accelerated inference setup for Windows users with automated dependency management and RTX 5090 support.
Updated Mar 31, 2026 · Python
A hands-on guide for AI builders: make your own RTX PRO 6000/4090D/5090 GPU server that’s fast and efficient.
RTX 5090 & RTX 5060 Docker container with PyTorch + TensorFlow. First fully-tested Blackwell GPU support for ML/AI. CUDA 12.8, Python 3.11, Ubuntu 24.04. Works with RTX 50-series (5090/5080/5070/5060) and RTX 40-series.
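One-click ray tracing deployment panel for Neverness To Everness / Ananta (异环), based on the OptiScaler winmm approach. Recommends the RTX 5090 by default and supports local-machine/RTX 4090/RTX 5080M configurations, backup, restore, and a local WebUI.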
High-performance LLM inference engine in C++/CUDA for NVIDIA Blackwell (RTX 5090/5080/5070 Ti, RTX PRO 6000; sm_120). Native NVFP4/GGUF, 270 tok/s decode on Qwen3-Coder-30B MoE. Written entirely by Claude Code.
Pixal3D ComfyUI integration for Windows (RTX 30/40/50) — single image to textured PBR mesh in 3-5 min
NVFP4 inference on Blackwell GeForce (RTX 5090/5080/5070 Ti/RTX PRO 6000) — SM120 patches for vLLM + FlashInfer + CUTLASS. 175 tok/s on Qwen3.6-35B MoE.
Research: vGPU unlock on consumer NVIDIA RTX 5090 (Blackwell/GB202). 19 binary patches, full CPU-side pipeline working, GSP firmware blocked by fused-off VF PRIV registers.
Enterprise-grade Sovereign AI Stack optimized for NVIDIA Blackwell (sm_120) & vLLM. Features 256K context window, 5.8k tok/s prefill, and integrated observability via Langfuse.
CastelOS public artifacts — principles, architecture insights, and build-in-public content
Production-grade Traditional Chinese / Taiwan Mandarin speech-to-text. Qwen3-ASR + MediaTek Breeze-ASR-25, hot-word injection, LLM polish, speaker diarization. RTF up to 1554x on RTX 5090, 56 TDD tests.
A high-performance local AI pipeline for restoring VHS audio, transcribing with Whisper, and translating subtitles using NLLB-200.
⚡ Compare AI models by Accuracy × Cost × Carbon — RTX 5090 benchmarks reveal 4-bit quantization wastes energy on small models
Optimized CSM-1B TTS pipeline for RTX 5090 (Blackwell sm_120). CUDA graph replay via patched HF Transformers. ~0.46x RTF. Topics (tags): csm text-to-speech rtx-5090 blackwell cuda-graphs torch-compile sesame streaming pytorch
Technical insights from r/LocalLLaMA — vLLM, FP8, NVFP4, Blackwell GPU benchmarks, and more. Unverified community knowledge, generated by Nemotron 9B. Issues welcome.
Design study: autonomous control loop for GPU compute management on Vast.ai. Boundary-first architecture, absence-tolerant design. Frozen as reference architecture.
⚡ NeuralForge — Futuristic Neural Network Training, Visualization & Programmable Framework | PyTorch · PyG · ONNX · Netron · Bootstrap UI
Step-back + role-diverse workers + rubric-judge reasoning pipeline for Claude Code
Lightweight GPU & CPU system tray monitor for NVIDIA GPUs (RTX 5090, RTX 6000, RTX 4090, RTX 3090, Tesla, TCC mode). Real-time power, temperature, VRAM & CPU usage badges. Works where HWMonitor, GPU-Z & MSI Afterburner fail.