19 changes: 17 additions & 2 deletions integrations/llms/gemini.mdx
@@ -2135,17 +2135,32 @@ Note that you will have to set [`strict_open_ai_compliance=False`](/product/ai-g

### Using reasoning_effort Parameter

-You can also control thinking using the OpenAI-compatible `reasoning_effort` parameter instead of `thinking.budget_tokens`. The value is passed directly to Gemini as the `thinkingLevel`:
+You can also control thinking using the OpenAI-compatible `reasoning_effort` parameter instead of `thinking.budget_tokens`:

```python
response = portkey.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    max_tokens=3000,
-    reasoning_effort="medium",  # Options: "none", "minimal", "low", "medium", "high"
+    reasoning_effort="medium",  # Options: "none", "low", "medium", "high"
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
```

#### Gemini 2.5 Models

For Gemini 2.5 models, `reasoning_effort` maps to `thinking_budget` with specific token allocations:

| reasoning_effort | thinking_budget (tokens) |
|------------------|--------------------------|
| `none` | Disabled |
| `low` | 1,024 |
| `medium` | 8,192 |
| `high` | 24,576 |
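The 2.5 mapping in the table above can be sketched as a simple lookup. This is illustrative only: the actual translation happens inside the Portkey gateway, and the function and dictionary names here are hypothetical. It assumes that "Disabled" corresponds to a thinking budget of `0`.

```python
# Hypothetical sketch of the reasoning_effort -> thinking_budget mapping
# for Gemini 2.5 models, as documented in the table above. The real
# translation is performed by the Portkey gateway, not client code.
EFFORT_TO_BUDGET = {
    "none": 0,        # thinking disabled (assumed budget of 0)
    "low": 1024,
    "medium": 8192,
    "high": 24576,
}

def to_thinking_budget(reasoning_effort: str) -> int:
    """Map an OpenAI-style reasoning_effort value to a Gemini 2.5 thinking budget."""
    try:
        return EFFORT_TO_BUDGET[reasoning_effort]
    except KeyError:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")

print(to_thinking_budget("medium"))  # 8192
```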

#### Gemini 3.0+ Models

For Gemini 3.0 and later models, `reasoning_effort` maps directly to `thinkingLevel`:

| reasoning_effort | Gemini thinkingLevel |
|------------------|---------------------|
| `none` | Disabled |
19 changes: 17 additions & 2 deletions integrations/llms/vertex-ai.mdx
@@ -723,17 +723,32 @@ Note that you will have to set [`strict_open_ai_compliance=False`](/product/ai-g

### Using reasoning_effort Parameter

-You can also control thinking using the OpenAI-compatible `reasoning_effort` parameter instead of `thinking.budget_tokens`. The value is passed directly to Gemini as the `thinkingLevel`:
+You can also control thinking using the OpenAI-compatible `reasoning_effort` parameter instead of `thinking.budget_tokens`:

```python
response = portkey.chat.completions.create(
    model="@VERTEX_PROVIDER/google.gemini-2.5-flash-preview-04-17",
    max_tokens=3000,
-    reasoning_effort="medium",  # Options: "none", "minimal", "low", "medium", "high"
+    reasoning_effort="medium",  # Options: "none", "low", "medium", "high"
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
```

#### Gemini 2.5 Models

For Gemini 2.5 models, `reasoning_effort` maps to `thinking_budget` with specific token allocations:

| reasoning_effort | thinking_budget (tokens) |
|------------------|--------------------------|
| `none` | Disabled |
| `low` | 1,024 |
| `medium` | 8,192 |
| `high` | 24,576 |

#### Gemini 3.0+ Models

For Gemini 3.0 and later models, `reasoning_effort` maps directly to `thinkingLevel`:

| reasoning_effort | Vertex thinkingLevel |
|------------------|---------------------|
| `none` | Disabled |