[BUG] Compatibility Issue Between max_tokens and max_completion_tokens

# Bug Description

## **Output of `OLS` version**
Tested with road-core/service main branch at ed4ad185c160ba47bd6dd6dadb0a064bf03c673b

## **Describe the bug**

When road-core/service is configured with our self-hosted Granite 3.1 model on OpenShift AI, 
```
llm_providers:
  - name: my_rhoai_g31
    type: rhoai_vllm
    url: https://granite3-1-8b-wisdom-model-staging.apps.stage2-west.v2dz.p1.openshiftapps.com/v1
    credentials_path: /home/ttakamiy/secrets/granite31-8b-token.txt
    models:
      - name: granite3-1-8b
        context_window_size: 128000
```

Gradio UI shows following error as the response to the "Hello" message:

```
Sorry, an error occurred: {"detail":{"response":"[{'type': 'extra_forbidden', 'loc': ('body', 'max_completion_tokens'), 'msg': 'Extra inputs are not permitted', 'input': 4096}]","cause":"Error code: 400 - {'object': 'error', 'message': "[{'type': 'extra_forbidden', 'loc': ('body', 'max_completion_tokens'), 'msg': 'Extra inputs are not permitted', 'input': 4096}]", 'type': 'BadRequestError', 'param': None, 'code': 400}"}}
```

This seems to be the same issue as [this open issue on langchain](https://github.com/langchain-ai/langchain/issues/29954).


## **To Reproduce**

Steps to reproduce the behavior:

1. Configure road-core/service to use the Granite 3.1 model on OpenShift AI.  See the description section above.  Aslo enable debug for Gradio UI and disable authentication.
2. Run road-core/service
3. Open Gradio UI at http://localhost:8080/ui
4. Send "Hello" to LLM --> The error occurs.

## **Expected behavior**

Granite 3.1 model should reply with a greeting message.

## **Screenshots or output**

![Image](https://github.com/user-attachments/assets/4ebb8b14-86a7-4d92-91d3-7144bdb38dd5)

## **Additional context**

Ansible Lightspeed team implemented a workaround suggested in  [the langchain issue](https://github.com/langchain-ai/langchain/issues/29954) with https://github.com/ansible/ansible-chatbot-service/pull/99 .


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Compatibility Issue Between max_tokens and max_completion_tokens #600

Bug Description

Output of `OLS` version

Describe the bug

To Reproduce

Expected behavior

Screenshots or output

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[BUG] Compatibility Issue Between max_tokens and max_completion_tokens #600

Description

Bug Description

Output of OLS version

Describe the bug

To Reproduce

Expected behavior

Screenshots or output

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Output of `OLS` version