Describe the bug
I ran a sweep test to measure the maximum throughput of the service. guidellm reported 0.5 RPS (while vllm/benchmarks/benchmark_serving.py showed 0.7 RPS when iterating over concurrency levels).
I opened the vLLM Grafana dashboard to observe the load on the service. guidellm reports 0.5-0.55 RPS, but the dashboard shows 13-15 running requests.
Expected behavior
0.5 RPS equals 30 RPM. If the service really sustains 0.5 RPS, shouldn't I see around 30 requests being processed?
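For reference, a minimal Python sketch of the arithmetic behind this expectation; the 300 s window is taken from the --max-seconds flag in the reproduce command below and is used here only for illustration:

reported_rps = 0.5                       # throughput reported by guidellm for the sweep
requests_per_minute = reported_rps * 60  # 30.0, the "30 RPM" referenced above
benchmark_window_s = 300                 # --max-seconds from the reproduce command
expected_requests = reported_rps * benchmark_window_s  # ~150 requests over the full window
print(requests_per_minute, expected_requests)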
Environment
Include all relevant environment information:
- OS: -
- Python version: 3.10
- Docker image: python:3.10-slim
To Reproduce
Exact steps to reproduce the behavior:
> pip install -U git+https://github.com/vllm-project/guidellm.git@1261fe81c57b07ed64333b5d50846699aa5307d4
> export GUIDELLM__PREFERRED_ROUTE="chat_completions" && export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=512 && export GUIDELLM_MAX_REQUESTS=1000 && export GUIDELLM__REQUEST_TIMEOUT=600
> guidellm benchmark --target http://localhost:8000 --rate-type sweep --model Qwen/Qwen3-30B-A3B --processor Qwen/Qwen3-30B-A3B --random-seed 2025 --max-seconds 300 --data "prompt_tokens=4096,output_tokens=512,samples=1000" --backend-args '{"extra_body":{"chat_template_kwargs":{"enable_thinking":false}}}' --output-path "data/benchmarks.json"
Additional context
Add any other context about the problem here. Also include any relevant files.