DYN-2940: Update vLLM-Omni to v0.20.0 #9255

Draft
ptarasiewiczNV wants to merge 14 commits into main from
ptarasiewicz/dyn-2940-vllm-omni-v0200-readiness-for-dynamo-recipes-and-modality


ptarasiewiczNV commented May 7, 2026

Summary

  • bump Dynamo container config to vLLM-Omni v0.20.0
  • load vLLM-Omni v0.20 deploy configs through the model-aware loader
  • fill missing default shared-memory connector edges for v0.20 deploy configs
  • adapt disaggregated stage routing and stage handoff to v0.20 processor inputs
  • accumulate multimodal handoff payload chunks before passing Qwen thinker output to the talker stage
  • point the GLM disaggregated launch script at the v0.20 built-in deploy config path
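
For readers unfamiliar with the handoff change, the payload-accumulation bullet above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in `stage_worker.py`: the `PayloadAccumulator` class, its `add` method, and the `request_id` / `final` chunk fields are all hypothetical names chosen for illustration. The idea is only that chunks for one request are buffered and released together once the last chunk arrives, so the downstream stage never sees a partial payload.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


class PayloadAccumulator:
    """Hypothetical helper: buffer handoff chunks per request until the final one arrives."""

    def __init__(self) -> None:
        # request_id -> chunks received so far, in arrival order
        self._chunks: Dict[str, List[Any]] = defaultdict(list)

    def add(self, request_id: str, data: Any, final: bool) -> Optional[List[Any]]:
        """Store one chunk; return the full ordered chunk list only on the final chunk."""
        self._chunks[request_id].append(data)
        if not final:
            return None
        # Release the complete payload and drop the buffer for this request.
        return self._chunks.pop(request_id)
```

In this sketch a caller would forward to the next stage only when `add` returns a list, and keep waiting while it returns `None`.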

Latest Image

  • nvcr.io/nvidian/dynamo-dev/dynamo:vllm-runtime-ptarasiewicz-dyn2940-v0200-7e74ee9072d2
  • digest: sha256:37edb1122b3ad9d9d6ae721ae94e9568b9d62407cea3bcade74c807e0a820015
  • package check: vllm==0.20.1+cu129, vllm-omni==0.20.0, ai-dynamo==1.2.0
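
A package check like the one above could be reproduced inside the container with a short script. The following is a sketch only: the `find_mismatches` helper and the `EXPECTED` mapping are hypothetical names, and the distribution names are assumed to match the pins listed in the bullet. It relies on the standard-library `importlib.metadata.version` call.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the package check above; distribution names are assumed.
EXPECTED = {"vllm": "0.20.1+cu129", "vllm-omni": "0.20.0", "ai-dynamo": "1.2.0"}


def find_mismatches(
    expected: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Return {package: found_version} for every package that differs from its pin."""
    mismatches: Dict[str, str] = {}
    for name, want in expected.items():
        try:
            got = get_version(name)
        except metadata.PackageNotFoundError:
            got = "not installed"
        if got != want:
            mismatches[name] = got
    return mismatches


if __name__ == "__main__":
    # An empty dict means every installed version matches its pin.
    print(find_mismatches(EXPECTED))
```

The `get_version` parameter exists only so the lookup can be swapped out on a host where the packages are not installed.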

Validation

  • python -m py_compile components/src/dynamo/vllm/omni/stage_worker.py components/src/dynamo/vllm/tests/omni/test_omni_stage_worker.py: passed
  • pre-commit run --files components/src/dynamo/vllm/omni/stage_worker.py components/src/dynamo/vllm/tests/omni/test_omni_stage_worker.py: passed
  • commit hooks for 7e74ee9072d2: passed
  • focused host pytest for the new worker test: skipped because host vLLM-Omni deps are unavailable
  • source-mounted container smoke for connector payload accumulation: passed
  • built-image container smoke for connector payload accumulation: passed
  • Docker build for dynamo:dyn2940-vllm-omni-v020-7e74ee9072d2: passed
  • Nebius H200 Qwen3-Omni text-only validation on previous image: passed
  • Nebius H200 Qwen3-Omni text+audio validation on previous image: reproduced talker handoff failure; 7e74ee9072d2 is the targeted fix and still needs rerun
  • Nebius H200 GLM image validation on previous image: returned HTTP 200 with image payload; needs rerun on 7e74ee9072d2

Current Blocker

  • Kubernetes rerun is blocked by expired Teleport credentials for dynamo-nebius-2.
  • Updated temp manifests are ready locally with the latest image:
    • /tmp/dyn2940-qwen3-v020-job.yaml
    • /tmp/dyn2940-glm-v020-job.yaml

copy-pr-bot Bot commented May 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions Bot added the documentation, backend::vllm, frontend, multimodal, and container labels May 7, 2026
ptarasiewiczNV force-pushed the ptarasiewicz/dyn-2940-vllm-omni-v0200-readiness-for-dynamo-recipes-and-modality branch from 3a78dc4 to e85157c May 7, 2026 12:53
ptarasiewiczNV self-assigned this May 7, 2026
ptarasiewiczNV force-pushed the ptarasiewicz/dyn-2940-vllm-omni-v0200-readiness-for-dynamo-recipes-and-modality branch from e85157c to c28c6cf May 7, 2026 13:09

github-actions Bot commented May 7, 2026

Signed-off-by: Piotr Tarasiewicz <ptarasiewicz@nvidia.com>
ptarasiewiczNV force-pushed the ptarasiewicz/dyn-2940-vllm-omni-v0200-readiness-for-dynamo-recipes-and-modality branch from c28c6cf to e6c57a8 May 7, 2026 13:16
Signed-off-by: Piotr Tarasiewicz <ptarasiewicz@nvidia.com>

Labels

backend::vllm, container, documentation, frontend, multimodal, size/XXL
