# Serve Examples

Below are tutorials for exploring Ray Serve capabilities and learning how to integrate different modeling frameworks.

## ML Applications

- Serve ML Models
- Serve a Stable Diffusion Model
- Serve a Text Classification Model
- Serve an Object Detection Model
- Serve a Chatbot with Request and Response Streaming

## AI Accelerators

- Serve an Inference Model on AWS NeuronCores Using FastAPI
- Serve Inference with a Stable Diffusion Model on AWS NeuronCores Using FastAPI
- Serve a Model on an Intel Gaudi Accelerator

## Integrations

- Scale a Gradio App with Ray Serve
- Serve a Text Generator with Request Batching
- Serve Models with Triton Server in Ray Serve
- Serve a Java App

## LLM Applications

- Serve DeepSeek
- Deploy a small-sized LLM
- Deploy a medium-sized LLM
- Deploy a large-sized LLM
- Deploy a vision LLM
- Deploy a reasoning LLM
- Deploy a hybrid reasoning LLM
- Deploy gpt-oss