When the responses_api_models/vllm_model/configs/vllm_model_for_training.yaml configuration is used with a vLLM server (for example, when training with NeMo RL), the vLLM server returns log probabilities as long as the top_logprobs field is omitted from the request to the /v1/responses endpoint of the responses API model. However, if the request includes top_logprobs with a value of null (the default value specified in nemo_gym/openai_utils.py), the vLLM server does not return log probabilities, and an error occurs because Gym expects log probabilities to be present in the chat completions choice object returned by the vLLM server.
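The distinction is whether the JSON key is present at all, not what value it holds. A minimal sketch of the two request bodies follows; the payload shape is taken from the client.py example below, and the variable names are illustrative only:

# Sketch contrasting the two request bodies; the payload shape follows
# responses_api_models/vllm_model/client.py, and the variable names are
# illustrative.

# Log probabilities ARE returned: "top_logprobs" is omitted entirely.
request_without_key = {
    "input": [{"role": "user", "content": "hello"}],
}

# Log probabilities are NOT returned: "top_logprobs" is present and
# serializes to null (the default in nemo_gym/openai_utils.py), which
# later triggers the TypeError shown below.
request_with_null = {
    "input": [{"role": "user", "content": "hello"}],
    "top_logprobs": None,
}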
To observe this behavior, one can run the command
ng_run "+config_paths=[responses_api_models/vllm_model/configs/vllm_model_for_training.yaml]" +policy_base_url=<URL of vLLM server> +policy_api_key=<API key> +policy_model_name=<model name>
to start the vLLM responses API model for training. Then, the first request in the responses_api_models/vllm_model/client.py script can be changed to
task_1a = await server_client.post(
    server_name="policy_model",
    url_path="/v1/responses",
    json={
        "input": [{"role": "user", "content": "hello"}],
        "top_logprobs": None,
    },
)
so that the top_logprobs field is present in the request to the responses API model with a value of null. The modified script can then be run using the command
python responses_api_models/vllm_model/client.py
to produce an error with a message such as the following:
(policy_model)   File "responses_api_models/vllm_model/app.py", line 109, in responses
(policy_model)     chat_completion_response = await self.chat_completions(request, chat_completion_create_params)
(policy_model)                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(policy_model)   File "responses_api_models/vllm_model/app.py", line 270, in chat_completions
(policy_model)     log_probs = choice_dict["logprobs"]["content"]
(policy_model)                 ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
(policy_model) TypeError: 'NoneType' object is not subscriptable
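A defensive check along the following lines would turn the crash into an explicit error. This is a sketch only: the choice_dict and log_probs names come from the traceback above, while the error message and the surrounding code are assumptions:

# Hypothetical guard inside chat_completions in
# responses_api_models/vllm_model/app.py. Only the choice_dict and
# log_probs names come from the traceback; everything else is assumed.
logprobs = choice_dict.get("logprobs")
if logprobs is None:
    raise ValueError(
        "vLLM returned no log probabilities; omit top_logprobs from the "
        "request (or set it to an integer) so the server includes them."
    )
log_probs = logprobs["content"]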