
VDLM Overview

VDLM is a model inference framework for serving language MDMs (masked diffusion models) behind an OpenAI-style API.
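As a rough intuition for what an MDM does at inference time (a toy sketch only, not VDLM's actual decoding code): the sequence starts fully masked, and each step commits the model's most confident predictions and re-predicts the rest, instead of generating left to right.

```python
import random

MASK = "[MASK]"

def toy_predict(tokens):
    """Stand-in for the model: return a (token, confidence) guess for
    every masked position. Purely illustrative."""
    vocab = ["the", "cat", "sat", "on", "a", "mat"]
    return {i: (random.choice(vocab), random.random())
            for i, t in enumerate(tokens) if t == MASK}

def mdm_decode(length=6, steps=3):
    """Iterative unmasking: commit the highest-confidence predictions
    each step, unlike left-to-right autoregressive decoding."""
    tokens = [MASK] * length
    per_step = max(1, length // steps)
    while MASK in tokens:
        preds = toy_predict(tokens)
        # Commit the most confident predictions this step.
        best = sorted(preds.items(), key=lambda kv: kv[1][1], reverse=True)[:per_step]
        for i, (tok, _) in best:
            tokens[i] = tok
    return tokens

print(mdm_decode())
```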

Running the server

python api_server.py
python test_request.py
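Once the server is up, it can be exercised like any OpenAI-compatible endpoint. A minimal sketch follows; the URL, port, route, and model name are assumptions (check `api_server.py` and `test_request.py` for the actual values).

```python
import json
import urllib.request

# Assumed endpoint and model name -- illustrative only; see api_server.py
# for the real route, port, and served model.
URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "LLaDA-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

def send(url=URL, body=payload):
    """POST the OpenAI-style JSON payload and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# send()  # requires api_server.py to be running
```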

Demo

Video sped up for demonstration purposes

Demo Gif

Tests

Tests are written with pytest and run with pytest

  • by default, the tests run the server with a mock engine loop rather than loading a real model
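The mock-engine pattern can be sketched as follows (all names here are hypothetical, not taken from the test suite): the engine loop is replaced by a stub that streams canned tokens, so tests never load model weights.

```python
class MockEngine:
    """Stub engine: streams a fixed completion instead of running an MDM."""

    def __init__(self, canned=("Hello", ",", " world")):
        self.canned = list(canned)

    def generate(self, prompt, max_tokens=16):
        # Ignore the prompt; yield pre-baked tokens like a real engine would.
        yield from self.canned[:max_tokens]

def test_mock_engine_streams_tokens():
    engine = MockEngine()
    out = list(engine.generate("ignored"))
    assert out == ["Hello", ",", " world"]

test_mock_engine_streams_tokens()
```

The benefit is that request handling, streaming, and error paths get exercised at full speed on any machine, with no GPU or checkpoint download.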

Work in Progress

  • add more architectures; the current code only supports LLaDA
  • implement CUDA graph capture for model serving
  • cancellable engine requests
  • dynamic request batching
  • faster IPC using ZMQ + msgpack instead of multiprocessing.Queue
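Of the items above, dynamic request batching is the most self-contained to sketch. A common approach (an assumption about the eventual design, not VDLM's implementation) is to block for the first request, then greedily drain the queue until the batch is full or a small wait budget expires:

```python
import queue
import time

def collect_batch(q, max_batch=8, max_wait=0.01):
    """Dynamic batching sketch: wait for one request, then greedily
    gather more until the batch is full or the wait budget is spent."""
    batch = [q.get()]  # block for at least one request
    deadline = time.monotonic() + max_wait
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch

q = queue.Queue()
for i in range(5):
    q.put(f"req-{i}")
print(collect_batch(q))  # up to max_batch queued requests in one batch
```

This trades a small amount of per-request latency (`max_wait`) for much better GPU utilization, since one forward pass serves the whole batch.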

Acknowledgements

The model generation and load-config code is from fast-dLLM.

  • a slight modification was made to the original RoPE implementation for torch compilability
    • some numerical precision issues were observed; see link for more info