Merged
81 changes: 81 additions & 0 deletions 04_scaling_performance/02_datacenters/README.md
@@ -0,0 +1,81 @@
# 02_datacenters

Pin endpoints to specific RunPod data centers for latency, compliance, or availability reasons.

## Overview

By default, endpoints deploy across all available data centers. The `datacenter` parameter restricts placement to one or more specific DCs. CPU endpoints are limited to a subset of DCs that support CPU serverless (see `CPU_DATACENTERS`).
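As the examples below show, `datacenter` accepts either a single `DataCenter` value or a list of them, and omitting it means no restriction. A minimal sketch of how such a parameter might be normalized (a hypothetical helper for illustration, not `runpod_flash` internals):

```python
# Hypothetical normalization helper -- not runpod_flash's actual code.
def normalize_datacenters(dc=None):
    """Return the list of datacenter IDs an endpoint is pinned to.

    None means no restriction: the endpoint may deploy to all DCs.
    """
    if dc is None:
        return []          # empty list: no pinning, all DCs eligible
    if isinstance(dc, (list, tuple)):
        return [str(d) for d in dc]
    return [str(dc)]       # a single value becomes a one-element list
```
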

## Quick Start

```bash
pip install -r requirements.txt
flash run
```

## What You'll Learn

- How to pin a GPU endpoint to a single datacenter
- How to deploy across multiple datacenters
- How CPU datacenter restrictions work

## Available Data Centers

| ID | Location |
|----|----------|
| `US-CA-2` | US - California |
| `US-GA-2` | US - Georgia |
| `US-IL-1` | US - Illinois |
| `US-KS-2` | US - Kansas |
| `US-MD-1` | US - Maryland |
| `US-MO-1` | US - Missouri |
| `US-MO-2` | US - Missouri |
| `US-NC-1` | US - North Carolina |
| `US-NC-2` | US - North Carolina |
| `US-NE-1` | US - Nebraska |
| `US-WA-1` | US - Washington |
| `EU-CZ-1` | Europe - Czech Republic |
| `EU-RO-1` | Europe - Romania |
| `EUR-IS-1` | Europe - Iceland |
| `EUR-NO-1` | Europe - Norway |

CPU endpoints are currently supported only in `EU-RO-1`.
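Per the `cpu_worker.py` comments, selecting an unsupported DC for a CPU endpoint raises an error. An illustrative sketch of that check (the real `CPU_DATACENTERS` contents and error type in `runpod_flash` may differ):

```python
# Illustrative sketch of the CPU datacenter restriction; the real
# CPU_DATACENTERS set and exception type in runpod_flash may differ.
CPU_DATACENTERS = {"EU-RO-1"}

def validate_cpu_datacenter(dc: str) -> str:
    """Raise if a CPU endpoint is pinned to an unsupported datacenter."""
    if dc not in CPU_DATACENTERS:
        raise ValueError(
            f"CPU serverless is not available in {dc}; "
            f"choose one of {sorted(CPU_DATACENTERS)}"
        )
    return dc
```
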

## Examples

**Single datacenter:**

```python
@Endpoint(name="us-worker", gpu=GpuGroup.ANY, datacenter=DataCenter.US_GA_2)
async def inference(data: dict) -> dict:
...
```

**Multiple datacenters:**

```python
@Endpoint(
name="global-worker",
gpu=GpuGroup.ANY,
datacenter=[DataCenter.US_GA_2, DataCenter.EU_RO_1],
)
async def inference(data: dict) -> dict:
...
```

**No datacenter (default, all DCs):**

```python
@Endpoint(name="anywhere", gpu=GpuGroup.ANY)
async def inference(data: dict) -> dict:
...
```

## Project Structure

```
02_datacenters/
├── gpu_worker.py # single-DC and multi-DC GPU endpoints
├── cpu_worker.py # CPU endpoint in a supported DC
└── README.md
```
29 changes: 29 additions & 0 deletions 04_scaling_performance/02_datacenters/cpu_worker.py
@@ -0,0 +1,29 @@
# cpu worker pinned to a cpu-supported datacenter.
# cpu endpoints are only available in a subset of datacenters
# (see CPU_DATACENTERS). selecting an unsupported DC raises an error.
# run with: flash run
from runpod_flash import Endpoint, DataCenter

api = Endpoint(
name="04_02_cpu_eu",
cpu="cpu3c-2-4",
workers=(0, 2),
datacenter=DataCenter.EU_RO_1,
)


@api.post("/process")
async def process(data: dict) -> dict:
"""CPU processing pinned to EU-RO-1."""
return {"datacenter": "EU-RO-1", "result": data}


@api.get("/health")
async def health():
return {"status": "ok"}


if __name__ == "__main__":
import asyncio

print(asyncio.run(process({"text": "hello"})))
39 changes: 39 additions & 0 deletions 04_scaling_performance/02_datacenters/gpu_worker.py
@@ -0,0 +1,39 @@
# gpu workers pinned to specific datacenters.
# run with: flash run
from runpod_flash import Endpoint, GpuGroup, DataCenter


# pin to a single datacenter
@Endpoint(
name="04_02_gpu_us",
gpu=GpuGroup.ANY,
workers=(0, 3),
datacenter=DataCenter.US_GA_2,
)
async def us_inference(payload: dict) -> dict:
"""GPU inference pinned to US-GA-2."""
return {"datacenter": "US-GA-2", "result": payload}


# deploy across multiple datacenters for broader availability
@Endpoint(
name="04_02_gpu_multi",
gpu=GpuGroup.ANY,
workers=(0, 3),
datacenter=[DataCenter.US_GA_2, DataCenter.EU_RO_1],
)
async def multi_dc_inference(payload: dict) -> dict:
"""GPU inference available in US-GA-2 and EU-RO-1."""
return {"result": payload}


if __name__ == "__main__":
import asyncio

async def test():
print("=== US datacenter ===")
print(await us_inference({"prompt": "hello"}))
print("\n=== Multi-DC ===")
print(await multi_dc_inference({"prompt": "hello"}))

asyncio.run(test())
5 changes: 4 additions & 1 deletion 05_data_workflows/01_network_volumes/cpu_worker.py
@@ -1,18 +1,21 @@
# cpu worker with network volume for listing and serving generated images.
# run with: flash run
# test directly: python cpu_worker.py
-from runpod_flash import Endpoint, NetworkVolume
+from runpod_flash import Endpoint, DataCenter, NetworkVolume

# same volume as gpu_worker.py -- must match name and datacenter
volume = NetworkVolume(
name="flash-05-volume",
size=50,
datacenter=DataCenter.EU_RO_1,
)

api = Endpoint(
name="05_01_cpu_worker",
cpu="cpu3c-1-2",
workers=(1, 3),
idle_timeout=120,
datacenter=DataCenter.EU_RO_1,
volume=volume,
)

4 changes: 3 additions & 1 deletion 05_data_workflows/01_network_volumes/gpu_worker.py
@@ -3,7 +3,7 @@
# test directly: python gpu_worker.py
import logging

-from runpod_flash import Endpoint, GpuType, NetworkVolume
+from runpod_flash import Endpoint, GpuType, DataCenter, NetworkVolume

logger = logging.getLogger(__name__)

@@ -12,6 +12,7 @@
volume = NetworkVolume(
name="flash-05-volume",
size=50,
datacenter=DataCenter.EU_RO_1,
)


@@ -20,6 +21,7 @@
gpu=GpuType.NVIDIA_GEFORCE_RTX_5090,
workers=(0, 3),
idle_timeout=300,
datacenter=DataCenter.EU_RO_1,
volume=volume,
env={"HF_HUB_CACHE": MODEL_PATH, "MODEL_PATH": MODEL_PATH},
dependencies=["torch", "diffusers", "transformers", "accelerate"],
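The network-volume diffs above pin both the volume and the endpoints that mount it to `EU-RO-1`; as the `cpu_worker.py` comment notes, the volume name and datacenter must match across workers. A minimal sketch of that invariant, using simplified stand-in classes rather than the `runpod_flash` API:

```python
# Simplified stand-ins for illustration; not the runpod_flash classes.
from dataclasses import dataclass

@dataclass
class Volume:
    name: str
    datacenter: str

@dataclass
class Worker:
    name: str
    datacenter: str
    volume: Volume

def check_colocation(w: Worker) -> None:
    """Workers that mount a volume must sit in the volume's datacenter."""
    if w.datacenter != w.volume.datacenter:
        raise ValueError(
            f"{w.name} is in {w.datacenter} but volume "
            f"{w.volume.name} lives in {w.volume.datacenter}"
        )

vol = Volume("flash-05-volume", "EU-RO-1")
check_colocation(Worker("05_01_cpu_worker", "EU-RO-1", vol))  # ok
check_colocation(Worker("05_01_gpu_worker", "EU-RO-1", vol))  # ok
```
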