A curated registry of GPU cloud providers, serverless GPU inference platforms, Kubernetes GPU platforms, and supporting tooling for training, fine-tuning, and serving AI models.
The goal is to be the definitive place to discover on-demand, reserved, and serverless GPU compute, covering hyperscalers, GPU-first clouds, marketplaces, and the container images that run on them.
Inclusion criteria: Resources must be publicly accessible, actively maintained, and directly related to GPU cloud compute or GPU-accelerated AI infrastructure. General CPU cloud services and closed, undocumented offerings are out of scope. See CONTRIBUTING.md for the full quality bar.
- Major Cloud Providers
- Serverless GPU Inference
- GPU Cloud Marketplaces
- Kubernetes GPU Platforms
- Budget / Consumer GPU Cloud
- Enterprise HPC Cloud
- AI Training Platforms
- Container Registries / Images
- Notes
- Contributing
- License
Hyperscale clouds with managed GPU instances for training, inference, and HPC.
- AWS EC2 Accelerated Computing
  Official: P5/P5e (H100/H200), P4d (A100), G6e (L40S), and Trn/Inf accelerators with EFA networking and UltraClusters.
  - Pricing: on-demand, spot, reserved, and capacity blocks.
- Google Cloud GPU Instances
  Official: A3 (H100), A3 Ultra (H200), A4 (B200), A4X (GB200), G2 (L4), and G4 (RTX PRO 6000) machine types.
  - Tight integration with Vertex AI, GKE, and TPUs.
- Azure NC/ND GPU Virtual Machines
  Official: NCads H100 v5, ND H100 v5, NC A100 v4, and NVads A10 v5 series with InfiniBand options.
  - Strong fit for Windows/Linux workstations and Azure ML training jobs.
- Oracle Cloud Infrastructure GPU
  Official: Bare-metal and VM instances with H100, H200, A100, L40S, and AMD MI300X GPUs; RDMA cluster networking.
  - Often priced below AWS/GCP/Azure for equivalent shapes; free egress tiers.
- IBM Cloud GPU
  Official: NVIDIA H100 and A100 instances for AI training, inference, and VDI workloads.
- Alibaba Cloud GPU
  Official: GN7 (A10), GN10e (V100), and GPU-accelerated ECS instances for AI and rendering in Asia-Pacific.
Platforms that abstract away infrastructure and scale GPU workloads to zero.
- Modal
  Official: Serverless GPU functions with a Python SDK; deploy inference, training, and batch jobs on L40S, A100, and H100.
  - Pricing: per-second billing; H100 from ~$3.95/hr, L40S from ~$1.95/hr.
- RunPod Serverless
  Official: Autoscaling GPU endpoints with FlashBoot; supports 30+ GPU types from RTX 4090 to H200.
  - Pricing: billed by the millisecond; H100 from ~$1.99/hr on Community Cloud.
- Baseten
  Official: Model-serving platform with low-latency inference, async workers, and the open-source Truss packaging framework.
- Replicate
  Official: Run and deploy open-source ML models as scalable APIs; supports custom models and a large public model zoo.
- Fal
  Official: Generative-media inference engine optimized for diffusion and real-time image/video workloads.
- Koyeb
  Official: Serverless cloud with global GPU deployment and scale-to-zero; supports H100, A100, and Tenstorrent accelerators.
  - Pricing: A100 ~$2/hr, H100 ~$3.30/hr.
- Novita AI
  Official: GPU cloud and serverless inference with 200+ ready-to-use models and on-demand GPU instances.
- Google Cloud Run GPUs
  Official: Serverless container platform with NVIDIA L4 GPU support and request-driven or job-based GPU workloads.
- Cerebrium
  Official: Serverless GPU platform for AI inference and training with infrastructure-as-code and fast cold starts.
- Beam Cloud
  Official: Serverless GPU cloud for training and inference with per-second billing and a Python SDK.
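To get a feel for when scale-to-zero pricing pays off, the sketch below compares per-second serverless billing with an always-on instance. The two hourly rates are indicative figures quoted elsewhere in this list (they come from different providers and change often), and the function names are hypothetical helpers, not part of any provider SDK.

```python
SECONDS_PER_HOUR = 3600


def serverless_cost(rate_per_hour: float, busy_seconds: float) -> float:
    """Cost when billed per second, only while a request or job is running."""
    return rate_per_hour / SECONDS_PER_HOUR * busy_seconds


def always_on_cost(rate_per_hour: float, wall_clock_hours: float) -> float:
    """Cost of an instance that is billed whether or not it is busy."""
    return rate_per_hour * wall_clock_hours


def break_even_utilization(serverless_rate: float, always_on_rate: float) -> float:
    """Fraction of wall-clock time the GPU must be busy for both options to cost the same."""
    return always_on_rate / serverless_rate


# Assumed, indicative H100 rates in USD per hour (verify on the provider's site).
SERVERLESS_H100 = 3.95
ALWAYS_ON_H100 = 2.29

# Hypothetical workload: 2 busy hours per day over a 30-day month.
busy_seconds = 2 * SECONDS_PER_HOUR * 30
hours_in_month = 24 * 30

serverless_total = serverless_cost(SERVERLESS_H100, busy_seconds)
always_on_total = always_on_cost(ALWAYS_ON_H100, hours_in_month)
threshold = break_even_utilization(SERVERLESS_H100, ALWAYS_ON_H100)

print(f"serverless: ${serverless_total:,.2f}")
print(f"always-on:  ${always_on_total:,.2f}")
print(f"break-even utilization: {threshold:.0%}")
```

The comparison ignores cold-start time, keep-warm settings, storage, and egress, all of which can shift the break-even point in practice.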
Aggregators and peer-to-peer marketplaces for finding GPU capacity across providers.
- Vast.ai
  Community/Marketplace: Peer-to-peer GPU marketplace with 20,000+ GPUs; on-demand, interruptible, and reserved pricing.
  - Pricing: RTX 4090 from ~$0.27/hr, A100 80GB from ~$0.70/hr.
- TensorDock
  Community/Marketplace: Marketplace of independent hosts offering H100, A100, RTX 4090, and more across 100+ locations.
  - Pricing: H100 from ~$2.20/hr; no quotas or long-term contracts.
- Shadeform
  Community/Marketplace: GPU cloud marketplace deploying across 30+ clouds with one console, API, and bill.
- Prime Intellect
  Community: Decentralized compute exchange aggregating 12+ providers for distributed training and inference.
- GPUFindr
  Community: Live price comparison across CoreWeave, Lambda, RunPod, Vast.ai, and other GPU clouds; free API and MCP server.
- NodeHawk
  Community: Price aggregator scanning 50+ clouds to surface the cheapest GPU deals.
Managed Kubernetes services and schedulers purpose-built for GPU workloads.
- CoreWeave Kubernetes Service
  Official: Kubernetes-native GPU cloud with bare-metal NVIDIA H100, H200, B200, and GB200 nodes.
- RunPod
  Official: GPU cloud with Kubernetes support, virtual kubelet integrations, and on-demand Pods/Serverless/Clusters.
- Nebius AI Cloud
  Official: Full-stack AI cloud with managed Kubernetes and Slurm, H100/H200/B200/GB200 clusters, and InfiniBand.
- Akamai Cloud GPU
  Official: GPU instances and managed Kubernetes (LKE) nodes with RTX PRO 6000 Blackwell and L40S GPUs.
- Crusoe Cloud Managed Kubernetes
  Official: Managed Kubernetes for GPU and CPU workloads with InfiniBand-backed clusters.
- NVIDIA Run:ai
  Official: Kubernetes-native GPU orchestration and workload scheduler for maximizing AI compute utilization.
- SUNK (Slurm on Kubernetes)
  Official: CoreWeave's Slurm-on-Kubernetes solution for running HPC and AI training on the same cluster.
Low-cost options that prioritize price-per-FLOP, often using consumer GPUs.
- Vast.ai
  Community: Marketplace with consumer RTX 3090/4090 and data-center GPUs at market-driven prices.
  - Pricing: RTX 3090 from ~$0.07/hr, RTX 4090 from ~$0.27/hr.
- SaladCloud
  Community: Distributed cloud using consumer gaming GPUs across 450,000+ providers.
  - Pricing: RTX 4090 from ~$0.16/hr, H100 from ~$0.99/hr.
- JarvisLabs
  Community: GPU cloud with per-minute billing for individual developers and small teams.
  - Pricing: RTX 3090 from ~$0.29/hr, H100 from ~$2.69/hr.
- Paperspace
  Official: Managed GPU notebooks and instances from RTX 4090 to A100/H100.
- TensorDock
  Community: Marketplace with consumer and data-center GPUs; full VM control.
- DataCrunch / Verda
  Official: Low-cost GPU instances and clusters in Iceland/Finland.
  - Pricing: H100 from ~$2.29/hr.
- Hyperstack
  Official: EU-focused GPU cloud with H100, A100, and L40S instances; 100% renewable energy.
GPU-first clouds built for large-scale training, HPC, and production AI at scale.
- CoreWeave
  Official: AI hyperscaler with Kubernetes-native H100/H200/B200/GB200/GB300 clusters and InfiniBand.
  - Customers include OpenAI, Mistral AI, and Jane Street.
- Lambda
  Official: GPU cloud and superclusters for AI training/inference; 1-Click Clusters and Lambda Stack.
  - Pricing: H100 SXM from ~$3.29/hr, A100 SXM from ~$1.29/hr.
- NVIDIA DGX Cloud
  Official: AI supercomputing-as-a-service with DGX systems, NIM/NeMo, and NVIDIA Cloud Partners.
- NVIDIA DGX Cloud Lepton
  Official: Compute marketplace connecting developers to tens of thousands of GPUs across global cloud partners.
- FluidStack
  Official: Exascale GPU clusters for frontier AI training, deployed in 48 hours across global data centers.
- Crusoe Energy
  Official: AI cloud powered by stranded and renewable energy; H100/H200/GB200 and managed inference.
- GMI Cloud
  Official: GPU cloud with H100, H200, B200, and GB200 clusters and managed Kubernetes.
- Nscale
  Official: GPU cloud built for large-scale AI training and sovereign AI deployments.
- Voltage Park
  Official: Bare-metal GPU cloud with H100 clusters for training and inference.
Managed platforms that streamline distributed training, fine-tuning, and model development.
- Anyscale
  Official: Managed Ray platform for distributed training, batch inference, and serving across clouds.
- Together AI GPU Clusters
  Official: Self-serve H100/H200/B200/GB200 clusters with InfiniBand and Kubernetes/Slurm orchestration.
- Databricks Mosaic AI
  Official: Unified platform for fine-tuning, training, and serving LLMs inside the Databricks Lakehouse.
- MosaicML Foundry
  Community: Open-source training framework for LLMs from 125M to 70B+ parameters.
- Hugging Face Training Cluster as a Service
  Community: On-demand GPU clusters integrated with Hugging Face tools and datasets.
- Weights & Biases
  Official: MLOps platform for experiment tracking, model registry, and performance monitoring for GPU training runs.
Registries and pre-built images for GPU-accelerated AI workloads.
- NVIDIA NGC Catalog
  Official: Curated GPU-optimized containers, models, SDKs, and Helm charts for PyTorch, TensorFlow, TensorRT, CUDA, and more.
- Docker Hub
  Official: Public container registry hosting official GPU-enabled images for CUDA, PyTorch, TensorFlow, and community ML projects.
- GitHub Container Registry
  Official: Container and artifact hosting tightly integrated with GitHub Actions and repositories.
- Amazon ECR
  Official: Fully managed container registry for storing and deploying GPU container images on AWS.
- Google Artifact Registry
  Official: Managed artifact and container registry for Google Cloud GPU/AI workloads.
- Red Hat Quay
  Official: Enterprise container registry with vulnerability scanning and geo-replication.
- awesome-oracle-agentic-skills-mcp — Oracle MCP servers and agentic skills.
- awesome-nvidia — NVIDIA libraries, tools, and frameworks.
- awesome-llm — Large language model resources.
- awesome-local-llm — Running LLMs locally.
- awesome-mcp-servers — Model Context Protocol servers.
- Pricing is indicative: Rates change frequently and vary by region, GPU SKU (PCIe vs SXM), contract length, and spot/interruptible availability. Always verify current pricing on the provider's site.
- Community vs. Official: Official denotes the provider's own service; Community/Marketplace denotes peer-to-peer, aggregator, or open-source/community-maintained resources.
- Reliability trade-offs: Marketplace and consumer-GPU options can offer dramatic cost savings, but may have weaker SLAs, variable networking, and preemptible instances. Use them for fault-tolerant batch work and keep production APIs on SLA-backed platforms.
- Multi-node training: For large distributed training jobs, prioritize providers with NVLink/NVSwitch, InfiniBand/RoCE, and RDMA support (CoreWeave, Lambda, Nebius, Crusoe, AWS P5 UltraClusters, etc.).
- Updates welcome: If you find a new GPU cloud provider or category, see CONTRIBUTING.md and open a pull request.
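As a rough illustration of the pricing and reliability notes above, the following sketch ranks the "from" H100 rates quoted in this list for a hypothetical job. The rates are indicative snapshots, and `rework_fraction` (the share of compute lost to interruptions on preemptible capacity) is an assumed modelling parameter, not a figure published by any provider.

```python
# Indicative "from" H100 rates quoted in this list, in USD per GPU-hour.
# Treat these as a snapshot and verify them before making decisions.
H100_RATES = {
    "Modal": 3.95,
    "RunPod (Community Cloud)": 1.99,
    "Koyeb": 3.30,
    "TensorDock": 2.20,
    "SaladCloud": 0.99,
    "JarvisLabs": 2.69,
    "DataCrunch / Verda": 2.29,
    "Lambda (SXM)": 3.29,
}


def estimate_job_cost(rate_per_gpu_hour: float, gpus: int, hours: float,
                      rework_fraction: float = 0.0) -> float:
    """Estimate job cost; rework_fraction is the assumed share of time lost to preemption."""
    if not 0.0 <= rework_fraction < 1.0:
        raise ValueError("rework_fraction must be in [0, 1)")
    # If a fraction of all billed time is wasted, more billed hours are needed
    # to complete the same amount of useful work.
    billed_hours = hours / (1.0 - rework_fraction)
    return rate_per_gpu_hour * gpus * billed_hours


def rank_providers(rates: dict, gpus: int, hours: float,
                   rework_fraction: float = 0.0) -> list:
    """Return (provider, estimated cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: estimate_job_cost(rate, gpus, hours, rework_fraction)
        for name, rate in rates.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical job: 8 GPUs for 24 hours with no interruptions.
ranking = rank_providers(H100_RATES, gpus=8, hours=24)
for name, cost in ranking:
    print(f"{name:<26} ${cost:,.2f}")
```

Sticker price alone favors marketplace and consumer-GPU options; passing a non-zero `rework_fraction` for those providers is one simple way to account for preemption before comparing them with SLA-backed platforms. The sketch does not model networking, storage, or egress costs.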
Read CONTRIBUTING.md for the quality bar, entry format, and PR process.
This list is released into the public domain under CC0-1.0.
Enterprise AI Atlas is maintained by Vibe Coding Agency. We prototype and ship agentic systems, MCP servers, and enterprise AI integrations for teams that need working software fast — without hiring a full AI engineering team.
Free guide: The Non-Technical Founder's Guide to Agentic AI — what agents and MCP servers are, and how to get a system built.