@lee101 (Contributor) commented Jun 4, 2025

Summary

  • add a vLLM-based inference helper with min-probability token filtering and sentence-boundary stopping (see the sketch after this list)
  • integrate vLLM into the FastAPI server when the package is available (sketched below)
  • make tests/conftest resilient when httpx is missing (sketched below)
  • add a basic unit test for vLLM inference (sketched below)
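
A minimal sketch of what the inference helper might look like, assuming it wraps vLLM's offline `LLM`/`SamplingParams` API; the module name and `generate_sentence` function are hypothetical, while `min_p`, `stop`, and `include_stop_str_in_output` are real `SamplingParams` fields:

```python
# inference.py -- hypothetical helper module (name assumed, not from the PR)
from vllm import LLM, SamplingParams

# Load the model once at import time; the model name is a placeholder.
_llm = LLM(model="facebook/opt-125m")

def generate_sentence(prompt: str, min_p: float = 0.05, max_tokens: int = 128) -> str:
    """Generate a completion, dropping low-probability tokens and
    stopping at the first sentence-ending punctuation."""
    params = SamplingParams(
        min_p=min_p,                      # drop tokens with prob < min_p * p(top token)
        max_tokens=max_tokens,
        stop=[". ", "! ", "? ", "\n"],    # crude sentence-boundary stop strings
        include_stop_str_in_output=True,  # keep the terminal punctuation in the text
    )
    outputs = _llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```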
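
The "when available" integration presumably guards the import so the server still starts without vLLM installed; a sketch under that assumption (the route path and app layout are made up):

```python
# server.py -- illustrative; route and app structure assumed
from fastapi import FastAPI, HTTPException

app = FastAPI()

try:
    from inference import generate_sentence  # hypothetical helper sketched above
    VLLM_AVAILABLE = True
except ImportError:
    VLLM_AVAILABLE = False

@app.post("/generate")
def generate(prompt: str):
    # Fail with 503 rather than crashing at import time when vLLM is absent.
    if not VLLM_AVAILABLE:
        raise HTTPException(status_code=503, detail="vLLM is not installed")
    return {"text": generate_sentence(prompt)}
```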
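
For the conftest change, one common pattern is to skip client-based tests when httpx (required at runtime by Starlette's `TestClient`) is absent; a sketch, with the `server` import path assumed:

```python
# tests/conftest.py -- sketch; the "server" import path is an assumption
import pytest

try:
    import httpx  # noqa: F401 -- TestClient needs it at runtime
    HAS_HTTPX = True
except ImportError:
    HAS_HTTPX = False

@pytest.fixture
def client():
    if not HAS_HTTPX:
        pytest.skip("httpx not installed; skipping HTTP client tests")
    from fastapi.testclient import TestClient
    from server import app
    return TestClient(app)
```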
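
The unit test might likewise guard on vllm itself via `pytest.importorskip`; the test name and assertions here are illustrative, not the PR's actual test:

```python
# tests/test_vllm_inference.py -- illustrative
import pytest

def test_generate_sentence_stops_at_boundary():
    pytest.importorskip("vllm")  # skip cleanly when vLLM is absent
    from inference import generate_sentence

    text = generate_sentence("The capital of France is", min_p=0.1, max_tokens=32)
    assert text  # non-empty completion
    # Sentence stopping: output should contain a sentence terminator.
    assert any(t in text for t in (".", "!", "?", "\n"))
```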

Testing

  • `pytest -q` (fails: `ModuleNotFoundError: No module named 'cachetools'`)

https://chatgpt.com/codex/tasks/task_e_683fe5b06bf083338fb1ba3540b415dc

