@HappyAmazonian

This PR adds runbook documentation and working examples for deploying vLLM models on Amazon SageMaker, covering three key topics:

  1. Endpoint Creation - Basic deployment workflow
  2. Handler Override - Customizing /ping and /invocations endpoints
  3. Pre/Post Processing - Adding middleware for request/response transformation
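As a rough illustration of the pre/post processing idea in item 3, the sketch below wraps a request handler with one function that runs before it and one that runs after it. It is a generic Python pattern, not the API documented in this PR: `with_pre_post`, `add_default_max_tokens`, `tag_response`, and `echo_handler` are all hypothetical names, and the request/response shapes are assumptions.

```python
from typing import Callable, Dict

# Hypothetical type alias: a handler maps a request dict to a response dict.
Handler = Callable[[Dict], Dict]


def with_pre_post(handler: Handler, pre: Handler, post: Handler) -> Handler:
    """Return a handler that runs `pre`, then `handler`, then `post`."""
    def wrapped(request: Dict) -> Dict:
        return post(handler(pre(request)))
    return wrapped


def add_default_max_tokens(request: Dict) -> Dict:
    """Pre-processing: fill in a default without mutating the caller's dict."""
    out = dict(request)
    out.setdefault("max_tokens", 128)  # assumed default, for illustration only
    return out


def tag_response(response: Dict) -> Dict:
    """Post-processing: annotate the response."""
    out = dict(response)
    out["postprocessed"] = True
    return out


def echo_handler(request: Dict) -> Dict:
    """Stand-in for the model call; it simply echoes the processed request."""
    return {"prompt": request["prompt"], "max_tokens": request["max_tokens"]}


# Stand-in for whatever ultimately serves /invocations.
invocations = with_pre_post(echo_handler, add_default_max_tokens, tag_response)
```

Calling `invocations({"prompt": "hi"})` passes through all three stages in order. The guides and examples in this PR are the source of truth for how middleware is actually registered.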

What's Included

Documentation (docs/sagemaker/)

  • 01_quickstart.md - Endpoint creation guide
  • 02_customize_handlers.md - Handler override guide
  • 03_customize_pre_post_processing.md - Pre/post processing middleware guide

Jupyter Notebooks (examples/vllm/notebooks/)

  • basic_endpoint.ipynb - End-to-end endpoint creation example
  • copy_image.ipynb - Copy vLLM container images to your ECR
  • handler_customization_methods.ipynb - Handler override examples
  • preprocessing_postprocessing_methods.ipynb - Pre/post processing examples
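
For orientation, endpoint creation on SageMaker generally comes down to three API calls: `create_model`, `create_endpoint_config`, and `create_endpoint`. The sketch below builds the request payloads for those calls and issues them through a client object. It is not taken from `basic_endpoint.ipynb`, which may use a different SDK. The helper names (`build_requests`, `deploy`), the variant name, and the default instance type are assumptions chosen for illustration.

```python
from typing import Any, Dict, Optional


def build_requests(
    name: str,
    image_uri: str,
    role_arn: str,
    instance_type: str = "ml.g5.2xlarge",  # assumed default; pick what fits the model
    env: Optional[Dict[str, str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Build keyword arguments for the three SageMaker calls (hypothetical helper)."""
    model = {
        "ModelName": name,
        "PrimaryContainer": {"Image": image_uri, "Environment": dict(env or {})},
        "ExecutionRoleArn": role_arn,
    }
    config = {
        "EndpointConfigName": name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return {
        "create_model": model,
        "create_endpoint_config": config,
        "create_endpoint": endpoint,
    }


def deploy(client: Any, requests: Dict[str, Dict[str, Any]]) -> None:
    """Issue the calls in dependency order: model, endpoint config, endpoint."""
    for op in ("create_model", "create_endpoint_config", "create_endpoint"):
        getattr(client, op)(**requests[op])
```

With boto3 installed and credentials configured, `client` would typically be `boto3.client("sagemaker")`. Keeping payload construction separate from the calls makes the payloads easy to inspect before anything is created.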

Model Artifact Examples (examples/vllm/model_artifacts_examples/)

  • handlers_decorator.py - Handler override using decorators
  • handlers_discovery.py - Handler override using function discovery
  • handlers_env_var.py - Handler override using environment variables
  • preprocessing_postprocessing.py - Pre/post processing middleware examples

- Add quickstart guide (01_quickstart.md) with basic deployment instructions
- Add handler customization guide (02_customize_handlers.md) covering 3 methods
- Add pre/post processing guide (03_customize_pre_post_processing.md) for middleware
- All documentation links verified against existing examples and notebooks
- Add handler customization examples (decorator, discovery, env var methods)
- Add preprocessing/postprocessing middleware example
- Add Jupyter notebooks for basic endpoint, image copy, handler customization, and middleware
- Examples demonstrate all 3 handler override methods and middleware patterns
@@ -0,0 +1,357 @@
{

Why is this required? Can the public ECRs not be used directly with SageMaker?

Contributor Author

Yes, it is required. The hosting team prohibits using the public ECR images directly. Customers need to either upload the image to their own ECR or modify their VPC settings.
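
For readers following the copy step, a minimal sketch of how a public image reference maps to a private ECR URI, which has the form `<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>`. The helper name `private_ecr_uri` and the sample image reference are hypothetical. It only computes the target URI; it does not pull or push anything, and it does not handle digest (`@sha256:...`) references.

```python
from typing import Optional


def private_ecr_uri(
    public_image: str,
    account_id: str,
    region: str,
    repo: Optional[str] = None,
) -> str:
    """Compute the private ECR URI for a copied image (hypothetical helper)."""
    # Take the last path segment, e.g. "vllm:0.1" from "public.ecr.aws/alias/vllm:0.1".
    name_and_tag = public_image.rsplit("/", 1)[-1]
    name, _, tag = name_and_tag.partition(":")
    tag = tag or "latest"  # untagged references default to "latest"
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo or name}:{tag}"


# Placeholder account ID and image reference, for illustration only.
target = private_ecr_uri("public.ecr.aws/example-alias/vllm:0.1", "123456789012", "us-west-2")
```

The resulting URI is what would be passed as the container image when creating the SageMaker model.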

Merging this since we don't have a workaround for now. Please take an action item to update this once this issue is fixed.

@Lokiiiiii Lokiiiiii merged commit 7c171a7 into aws:main Dec 8, 2025
4 checks passed