# Serve Examples

Below are tutorials for exploring Ray Serve capabilities and learning how to integrate different modeling frameworks.

## ML Applications

- Serve ML Models
- Serve a Stable Diffusion Model
- Serve a Text Classification Model
- Serve an Object Detection Model
- Serve a Chatbot with Request and Response Streaming

## AI Accelerators

- Serve an Inference Model on AWS NeuronCores Using FastAPI
- Serve Inference with a Stable Diffusion Model on AWS NeuronCores Using FastAPI
- Serve a Model on an Intel Gaudi Accelerator

## Integrations

- Scale a Gradio App with Ray Serve
- Serve a Text Generator with Request Batching
- Serve Models with Triton Server in Ray Serve
- Serve a Java App

## LLM Applications

- Serve DeepSeek
- Deploy a small-sized LLM
- Deploy a medium-sized LLM
- Deploy a large-sized LLM
- Deploy a vision LLM
- Deploy a reasoning LLM
- Deploy a hybrid reasoning LLM
- Deploy gpt-oss