For testing scenarios where the quality of the model's answers is not relevant, you can deploy an OpenAI-compatible inference simulator instead of a real model server.
> The `patch-1` branch is used until [llm-d/llm-d-inference-sim#348](https://github.com/llm-d/llm-d-inference-sim/pull/348), which fixes the standard example, is merged.
> See the upstream [llm-d testing documentation](https://github.com/llm-d/llm-d-inference-sim?tab=readme-ov-file#kubernetes-testing) for more details.
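A minimal deployment sketch is shown below. The manifest path on the `patch-1` branch is an assumption; consult the upstream testing documentation linked above for the exact resources to apply. The `evalhub-test` namespace matches the service address used in this guide.

```bash
# Sketch only: the manifest filename/path is an assumption; see the upstream
# llm-d testing documentation for the exact resources to apply.
kubectl create namespace evalhub-test --dry-run=client -o yaml | kubectl apply -f -
kubectl apply -n evalhub-test \
  -f https://raw.githubusercontent.com/llm-d/llm-d-inference-sim/patch-1/manifests/deployment.yaml
```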
This makes the simulator available as an internal service at:
- **Service:** `vllm-llama3-8b-instruct-svc.evalhub-test.svc.cluster.local` (accessible only from within the cluster and the namespace)
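As a sanity check, you can exercise the simulator's OpenAI-compatible API from a pod inside the cluster. This is a sketch: the port (`8000`) and the model name are assumptions and should be adjusted to match the values defined in the simulator's manifest.

```bash
# Run from a pod inside the cluster, e.g.:
#   kubectl run curl --rm -it --image=curlimages/curl -n evalhub-test -- sh
# Port 8000 and the model name are assumptions; adjust to match the manifest.
curl -s http://vllm-llama3-8b-instruct-svc.evalhub-test.svc.cluster.local:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3-8b-instruct", "messages": [{"role": "user", "content": "ping"}]}'
```

A JSON chat-completion response confirms the simulator is up, without requiring any GPU or real model weights.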