@renovate renovate bot commented Aug 28, 2024

Note: This PR body was truncated due to platform limits.

This PR contains the following updates:

Package: github.com/ollama/ollama
Change: v0.3.6 -> v0.13.4

Release Notes

ollama/ollama (github.com/ollama/ollama)

v0.13.4

Compare Source

New Models

  • Nemotron 3 Nano: A new Standard for Efficient, Open, and Intelligent Agentic Models
  • Olmo 3 and Olmo 3.1: A series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.

What's Changed

  • Enable Flash Attention automatically for models by default
  • Fixed handling of long contexts with Gemma 3 models
  • Fixed issue that would occur with Gemma 3 QAT models or other models imported with the Gemma 3 architecture

New Contributors

Full Changelog: ollama/ollama@v0.13.3...v0.13.4-rc0

v0.13.3

Compare Source

New models

  • Devstral-Small-2: a 24B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
  • rnj-1: Rnj-1 is a family of 8B parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.
  • nomic-embed-text-v2: nomic-embed-text-v2-moe is a multilingual MoE text embedding model that excels at multilingual retrieval.

What's Changed

  • Improved truncation logic when using /api/embed and /v1/embeddings
  • Extend Gemma 3 architecture to support rnj-1 model
  • Fix error that would occur when running qwen2.5vl with image input

Full Changelog: ollama/ollama@v0.13.2...v0.13.3

v0.13.2

Compare Source

New models

  • Qwen3-Next: The first installment in the Qwen3-Next series, with strong parameter efficiency and inference speed.

What's Changed

  • Flash attention is now enabled by default for vision models such as mistral-3, gemma3, qwen3-vl and more. This improves memory utilization and performance when providing images as input.
  • Fixed GPU detection on multi-GPU CUDA machines
  • Fixed issue where deepseek-v3.1 would always think even when thinking is disabled in Ollama's app

New Contributors

Full Changelog: ollama/ollama@v0.13.1...v0.13.2

v0.13.1

Compare Source

New models

  • Ministral-3: The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.
  • Mistral-Large-3: A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.

What's Changed

  • nomic-embed-text will now use Ollama's engine by default
  • Tool calling support for cogito-v2.1
  • Fixed issues with CUDA VRAM discovery
  • Fixed link to docs in Ollama's app
  • Fixed issue where models would be evicted on CPU-only systems
  • Ollama will now render errors more clearly instead of showing raw Unmarshal: errors
  • Fixed issue where older CUDA GPUs would fail to be detected
  • Added thinking and tool parsing for cogito-v2.1

New Contributors

Full Changelog: ollama/ollama@v0.13.0...v0.13.1

v0.13.0

Compare Source

New models

  • DeepSeek-OCR: DeepSeek-OCR uses optical 2D mapping to compress long contexts, achieving high OCR precision with reduced vision tokens and demonstrating practical value in document processing.
  • Cogito-V2.1: instruction-tuned generative models, currently the best open-weight LLM by a US company

DeepSeek-OCR

DeepSeek-OCR is now available on Ollama. Example inputs:

ollama run deepseek-ocr "/path/to/image\n<|grounding|>Given the layout of the image."
ollama run deepseek-ocr "/path/to/image\nFree OCR."
ollama run deepseek-ocr "/path/to/image\nParse the figure."
ollama run deepseek-ocr "/path/to/image\nExtract the text in the image."
ollama run deepseek-ocr "/path/to/image\n<|grounding|>Convert the document to markdown."

New bench tool

Ollama's GitHub repo now includes a bench tool that can be used to test model performance. For now, it is a separate tool built from the Ollama repository:

First, install Go. Then from the root of the Ollama repository run:

go run ./cmd/bench -model gpt-oss:20b

For more information, see the tool's documentation.

What's Changed

  • DeepSeek-OCR is now supported
  • DeepSeek-V3.1 architecture is now supported in Ollama's engine
  • Fixed performance issues that arose in Ollama 0.12.11 on CUDA
  • Fixed issue where Linux install packages were missing required Vulkan libraries
  • Improved CPU and memory detection while in containers/cgroups
  • Improved VRAM information detection for AMD GPUs
  • Improved KV cache performance to no longer require defragmentation

New Contributors

Full Changelog: ollama/ollama@v0.12.11...v0.13.0

v0.12.11

Compare Source

Logprobs

Ollama's API and OpenAI-compatible API now support log probabilities. Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. This is useful for different use cases:

  1. Classification tasks
  2. Retrieval (Q&A) evaluation
  3. Autocomplete
  4. Token highlighting and outputting bytes
  5. Calculating perplexity
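As a sketch of the last use case, perplexity is the exponential of the negative mean token log probability, so it can be computed directly from the logprob values the API returns (the numbers below are made up for illustration):

```python
import math

def perplexity(logprobs: list[float]) -> float:
    """Perplexity is exp(-mean(logprob)) over the generated tokens."""
    return math.exp(-sum(logprobs) / len(logprobs))

# Made-up per-token log probabilities, e.g. collected from streaming chunks:
print(perplexity([-1.34, -0.83, -0.51]))
```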

To enable Logprobs, provide "logprobs": true to Ollama's API:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "logprobs": true
}'

When log probabilities are requested, response chunks will now include a "logprobs" field with the token, log probability and raw bytes (for partial unicode).

{
  "model": "gemma3",
  "created_at": "2025-11-14T22:17:56.598562Z",
  "response": "Okay",
  "done": false,
  "logprobs": [
    {
      "token": "Okay",
      "logprob": -1.3434503078460693,
      "bytes": [
        79,
        107,
        97,
        121
      ]
    }
  ]
}
top_logprobs

When setting "top_logprobs", a number of most-likely tokens are also provided, making it possible to introspect alternative tokens. Below is an example request.

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "logprobs": true,
  "top_logprobs": 3
}'

This will generate a stream of response chunks with the following fields:

{
  "model": "gemma3",
  "created_at": "2025-11-14T22:26:10.466324Z",
  "response": "The",
  "done": false,
  "logprobs": [
    {
      "token": "The",
      "logprob": -0.8361086845397949,
      "bytes": [
        84,
        104,
        101
      ],
      "top_logprobs": [
        {
          "token": "The",
          "logprob": -0.8361086845397949,
          "bytes": [
            84,
            104,
            101
          ]
        },
        {
          "token": "Okay",
          "logprob": -1.2590975761413574,
          "bytes": [
            79,
            107,
            97,
            121
          ]
        },
        {
          "token": "That",
          "logprob": -1.2686877250671387,
          "bytes": [
            84,
            104,
            97,
            116
          ]
        }
      ]
    }
  ]
}
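As a sketch of working with these chunks, the bytes field can be decoded back into text (it carries raw UTF-8, which matters for tokens that are partial unicode sequences), and the top_logprobs entries can be ranked to inspect alternatives. The chunk below is abbreviated from the example response:

```python
chunk = {
    "token": "The",
    "logprob": -0.8361086845397949,
    "bytes": [84, 104, 101],
    "top_logprobs": [
        {"token": "The", "logprob": -0.8361086845397949},
        {"token": "Okay", "logprob": -1.2590975761413574},
        {"token": "That", "logprob": -1.2686877250671387},
    ],
}

# Reconstruct the token text from the raw UTF-8 bytes.
text = bytes(chunk["bytes"]).decode("utf-8")

# Rank the candidate tokens from most to least likely.
alternatives = sorted(chunk["top_logprobs"], key=lambda t: t["logprob"], reverse=True)
print(text, [t["token"] for t in alternatives])
```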
Special thanks

Thank you @​baptistejamin for adding Logprobs to Ollama's API.

Vulkan support (opt-in)

Ollama 0.12.11 includes support for Vulkan acceleration. Vulkan brings support for a broad range of GPUs from AMD and Intel, as well as iGPUs. Vulkan support is not yet enabled by default and requires opting in by running Ollama with a custom environment variable:

OLLAMA_VULKAN=1 ollama serve

In PowerShell, use:

$env:OLLAMA_VULKAN="1"
ollama serve

For issues or feedback on using Vulkan with Ollama, create an issue labelled Vulkan and make sure to include server logs where possible to aid in debugging.

What's Changed

  • Ollama's API and the OpenAI-compatible API now support logprobs
  • Ollama's new app now supports WebP images
  • Improved rendering performance in Ollama's new app, especially when rendering code
  • The "required" field in tool definitions will now be omitted if not specified
  • Fixed issue where "tool_call_id" would be omitted when using the OpenAI-compatible API.
  • Fixed issue where ollama create would import data from both consolidated.safetensors and other safetensor files.
  • Ollama will now prefer dedicated GPUs over iGPUs when scheduling models
  • Vulkan can now be enabled by setting OLLAMA_VULKAN=1. For example: OLLAMA_VULKAN=1 ollama serve

New Contributors

Full Changelog: ollama/ollama@v0.12.10...v0.12.11

v0.12.10

Compare Source

ollama run now works with embedding models

ollama run can now run embedding models to generate vector embeddings from text:

ollama run embeddinggemma "Hello world"

Content can also be provided to ollama run via standard input:

echo "Hello world" | ollama run embeddinggemma
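Downstream of ollama run embeddinggemma (or the /api/embed endpoint), the resulting vectors are typically compared with cosine similarity. A self-contained sketch with toy vectors standing in for real embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embedding output:
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.0, 0.0]))
```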

What's Changed

  • Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct
  • Enable flash attention for Vulkan (currently needs to be built from source)
  • Add Vulkan memory detection for Intel GPU using DXGI+PDH
  • Ollama will now return tool call IDs from the /api/chat API
  • Fixed hanging due to CPU discovery
  • Ollama will now show login instructions when switching to a cloud model in interactive mode
  • Fix reading stale VRAM data
  • ollama run now works with embedding models

New Contributors

Full Changelog: ollama/ollama@v0.12.9...v0.12.10

v0.12.9

Compare Source

What's Changed

  • Fix performance regression on CPU-only systems

Full Changelog: ollama/ollama@v0.12.8...v0.12.9

v0.12.8

Compare Source


What's Changed

  • qwen3-vl performance improvements, including flash attention support by default
  • qwen3-vl will now output less leading whitespace in the response when thinking
  • Fixed issue where deepseek-v3.1 thinking could not be disabled in Ollama's new app
  • Fixed issue where qwen3-vl would fail to interpret images with transparent backgrounds
  • Ollama will now stop running a model before removing it via ollama rm
  • Fixed issue where prompt processing would be slower on Ollama's engine
  • Ignore unsupported iGPUs when doing device discovery on Windows

New Contributors

Full Changelog: ollama/ollama@v0.12.7...v0.12.8

v0.12.7

Compare Source


New models

  • Qwen3-VL: Qwen3-VL is now available in all parameter sizes ranging from 2B to 235B
  • MiniMax-M2: a 230 billion parameter model built for coding & agentic workflows, available on Ollama's cloud

Add files and adjust thinking levels in Ollama's new app

Ollama's new app now includes a way to add one or more files when prompting the model.

For better responses, thinking levels can now be adjusted for the gpt-oss models.

New API documentation

New API documentation is available for Ollama's API: https://docs.ollama.com/api


What's Changed

  • Model load failures now include more information on Windows
  • Fixed embedding results being incorrect when running embeddinggemma
  • Fixed gemma3n on Vulkan backend
  • Increased time allocated for ROCm to discover devices
  • Fixed truncation error when generating embeddings
  • Fixed request status code when running cloud models
  • The OpenAI-compatible /v1/embeddings endpoint now supports encoding_format parameter
  • Ollama will now parse tool calls that don't conform to {"name": name, "arguments": args} (thanks @​rick-github!)
  • Fixed prompt processing reporting in the llama runner
  • Increase speed when scheduling models
  • Fixed issue where FROM <model> would not inherit RENDERER or PARSER commands
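A note on the new encoding_format parameter: in OpenAI's API, requesting encoding_format: "base64" returns each embedding as base64-encoded little-endian float32 data. Assuming Ollama's compatible endpoint uses the same wire format, a decoding sketch:

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded little-endian float32 embedding into floats."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip a toy vector to demonstrate the assumed format:
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode()
print(decode_embedding(encoded))
```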

New Contributors

Full Changelog: ollama/ollama@v0.12.6...v0.12.7

v0.12.6

Compare Source

What's Changed

  • Ollama's app now supports searching when running DeepSeek-V3.1, Qwen3 and other models that support tool calling.
  • Flash attention is now enabled by default for Gemma 3, improving performance and memory utilization
  • Fixed issue where Ollama would hang while generating responses
  • Fixed issue where qwen3-coder would act in raw mode when using /api/generate or ollama run qwen3-coder <prompt>
  • Fixed qwen3-embedding providing invalid results
  • Ollama will now evict models correctly when num_gpu is set
  • Fixed issue where tool_index with a value of 0 would not be sent to the model

Experimental Vulkan Support

Experimental support for Vulkan is now available when building locally from source. This enables additional GPUs from AMD and Intel that are not currently supported by Ollama. To build locally, install the Vulkan SDK, set VULKAN_SDK in your environment, and then follow the developer instructions. In a future release, Vulkan support will be included in the binary release as well. Please file issues if you run into any problems.

New Contributors

Full Changelog: ollama/ollama@v0.12.5...v0.12.6

v0.12.5

Compare Source

What's Changed

  • Thinking models now support structured outputs when using the /api/chat API
  • Ollama's app will now wait until Ollama is running before allowing a conversation to start
  • Fixed issue where "think": false would show an error instead of being silently ignored
  • Fixed deepseek-r1 output issues
  • macOS 12 Monterey and macOS 13 Ventura are no longer supported
  • AMD gfx900 and gfx906 (MI50, MI60, etc) GPUs are no longer supported via ROCm. We're working to support these GPUs via Vulkan in a future release.

New Contributors

Full Changelog: ollama/ollama@v0.12.4...v0.12.5-rc0

v0.12.4

Compare Source

What's Changed

  • Flash attention is now enabled by default for Qwen 3 and Qwen 3 Coder
  • Fixed minor memory estimation issues when scheduling models on NVIDIA GPUs
  • Fixed an issue where keep_alive in the API would accept different values for the /api/chat and /api/generate endpoints
  • Fixed tool calling rendering with qwen3-coder
  • More reliable and accurate VRAM detection
  • OLLAMA_FLASH_ATTENTION can now be overridden to 0 for models that have flash attention enabled by default
  • macOS 12 Monterey and macOS 13 Ventura are no longer supported
  • Fixed crash where templates were not correctly defined
  • Fix memory calculations on NVIDIA iGPUs
  • AMD gfx900 and gfx906 (MI50, MI60, etc) GPUs are no longer supported via ROCm. We're working to support these GPUs via Vulkan in a future release.

New Contributors

Full Changelog: ollama/ollama@v0.12.3...v0.12.4-rc3

v0.12.3

Compare Source

New models

  • DeepSeek-V3.1-Terminus: DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode. It delivers more stable & reliable outputs across benchmarks compared to the previous version:

    Run on Ollama's cloud:

    ollama run deepseek-v3.1:671b-cloud
    

    Run locally (requires 500GB+ of VRAM):

    ollama run deepseek-v3.1
    
  • Kimi-K2-Instruct-0905: Kimi K2-Instruct-0905 is the latest, most capable version of Kimi K2. It is a state-of-the-art mixture-of-experts (MoE) language model, featuring 32 billion activated parameters and a total of 1 trillion parameters.

    ollama run kimi-k2:1t-cloud
    

What's Changed

  • Fixed issue where tool calls provided as stringified JSON would not be parsed correctly
  • ollama push will now provide a URL to follow to sign in
  • Fixed issues where qwen3-coder would output unicode characters incorrectly
  • Fix issue where loading a model with /load would crash

New Contributors

Full Changelog: ollama/ollama@v0.12.2...v0.12.3

v0.12.2

Compare Source

Web search


A new web search API is now available in Ollama. Ollama provides a generous free tier of web searches for individuals to use, and higher rate limits are available via Ollama’s cloud. This web search capability can augment models with the latest information from the web to reduce hallucinations and improve accuracy.

What's Changed

  • Models with Qwen3's architecture including MoE now run in Ollama's new engine
  • Fixed issue where built-in tools for gpt-oss were not being rendered correctly
  • Support multi-regex pretokenizers in Ollama's new engine
  • Ollama's new engine can now load tensors by matching a prefix or suffix

Full Changelog: ollama/ollama@v0.12.1...v0.12.2

v0.12.1

Compare Source

New models

What's Changed

  • Qwen3-Coder now supports tool calling
  • Ollama's app will no longer show "connection lost" errors when connecting to cloud models
  • Fixed issue where Gemma3 QAT models would not output correct tokens
  • Fix issue where & characters in Qwen3-Coder would not be parsed correctly when function calling
  • Fixed issues where ollama signin would not work properly on Linux

Full Changelog: ollama/ollama@v0.12.0...v0.12.1

v0.12.0

Compare Source

Cloud models


Cloud models are now available in preview, allowing you to run a group of larger models with fast, datacenter-grade hardware.

To run a cloud model, use:

ollama run qwen3-coder:480b-cloud

What's Changed

  • Models with the Bert architecture now run on Ollama's engine
  • Models with the Qwen 3 architecture now run on Ollama's engine
  • Fix issue where older NVIDIA GPUs would not be detected if newer drivers were installed
  • Fixed issue where models would not be imported correctly with ollama create
  • Ollama will skip parsing the initial <think> if provided in the prompt for /api/generate by @​rick-github

New Contributors

Full Changelog: ollama/ollama@v0.11.11...v0.12.0

v0.11.11

Compare Source

What's Changed

  • Support for CUDA 13
  • Improved memory usage when using gpt-oss in Ollama's app
  • Better scrolling in Ollama's app when submitting long prompts
  • Cmd +/- will now zoom and shrink text in Ollama's app
  • Assistant messages can now be copied in Ollama's app
  • Fixed error that would occur when attempting to import safetensors files by @rick-github in #​12176
  • Improved memory estimates for hybrid and recurrent models by @gabe-l-hart in #​12186
  • Fixed error that would occur when batch size was greater than context length
  • Flash attention & KV cache quantization validation fixes by @​jessegross in #​12231
  • Add dimensions field to embed requests by @​mxyng in #​12242
  • Enable new memory estimates in Ollama's new engine by default by @​jessegross in #​12252
  • Ollama will no longer load split vision models in the Ollama engine by @​jessegross in #​12241

New Contributors

Full Changelog: ollama/ollama@v0.11.10...v0.11.11

v0.11.10

Compare Source

New models

  • EmbeddingGemma: a new open embedding model that delivers best-in-class performance for its size

What's Changed

  • Support for EmbeddingGemma

Full Changelog: ollama/ollama@v0.11.9...v0.11.10

v0.11.9

Compare Source

What's Changed

  • Improved performance via overlapping GPU and CPU computations
  • Fixed issue where an unrecognized AMD GPU would cause an error
  • Reduce crashes due to unhandled errors in some Mac and Linux installations of Ollama

New Contributors

Full Changelog: ollama/ollama@v0.11.8...v0.11.9-rc0

v0.11.8

Compare Source

What's Changed

  • gpt-oss now has flash attention enabled by default for systems that support it
  • Improved load times for gpt-oss

Full Changelog: ollama/ollama@v0.11.7...v0.11.8

v0.11.7

Compare Source

DeepSeek-V3.1

DeepSeek-V3.1 is now available to run via Ollama.

This model supports hybrid thinking, meaning thinking can be enabled or disabled by setting think in Ollama's API:

curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-v3.1",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ],
  "think": true
}'

In Ollama's CLI, thinking can be enabled or disabled by running the /set think or /set nothink commands.

Turbo (in preview)

DeepSeek-V3.1 has 671B parameters, so a large amount of VRAM is required to run it. Ollama's Turbo mode (in preview) provides access to powerful hardware in the cloud that you can use to run the model.

Turbo via Ollama's app
  1. Download Ollama for macOS or Windows
  2. Select deepseek-v3.1:671b from the model selector
  3. Enable Turbo
Turbo via Ollama's CLI and libraries
  1. Create an account on ollama.com/signup
  2. Follow the docs for Ollama's CLI to authenticate your Ollama installation
  3. Run the following:
OLLAMA_HOST=ollama.com ollama run deepseek-v3.1

For instructions on using Turbo with Ollama's Python and JavaScript libraries, see the docs.

What's Changed

  • Fixed issue where multiple models would not be loaded on CPU-only systems
  • Ollama will now work with models that skip outputting the initial <think> tag (e.g. DeepSeek-V3.1)
  • Fixed issue where text would be emitted when there is no opening <think> tag from a model
  • Fixed issue where tool calls containing { or } would not be parsed correctly

New Contributors

Full Changelog: ollama/ollama@v0.11.6...v0.11.7

v0.11.6

Compare Source

What's Changed

  • Ollama's app will now switch between chats faster
  • Improved layout of messages in Ollama's app
  • Fixed issue where a command prompt window would appear when Ollama's app detected an old version of Ollama running
  • Improved performance when using flash attention
  • Fixed boundary case when encoding text using BPE

Full Changelog: ollama/ollama@v0.11.5...v0.11.6

v0.11.5

Compare Source

What's Changed

  • Performance improvements for the gpt-oss models
  • New memory management: this release of Ollama includes improved memory management for scheduling models on GPUs, leading to better VRAM utilization, better model performance, and fewer out-of-memory errors. These new memory estimates can be enabled with OLLAMA_NEW_ESTIMATES=1 ollama serve and will soon be enabled by default.
  • Improved multi-GPU scheduling and reduced VRAM allocation when using more than 2 GPUs
  • Ollama's new app will now remember selections for the default model, Turbo, and Web Search between restarts
  • Fix error when parsing bad harmony tool calls
  • OLLAMA_FLASH_ATTENTION=1 will also enable flash attention for pure-CPU models
  • Fixed OpenAI-compatible API not supporting reasoning_effort
  • Reduced size of installation on Windows and Linux

New Contributors

Full Changelog: ollama/ollama@v0.11.4...v0.11.5

v0.11.4

Compare Source

What's Changed

New Contributors

Full Changelog: ollama/ollama@v0.11.3...v0.11.4

v0.11.3

Compare Source

What's Changed

  • Fixed issue where gpt-oss would consume too much VRAM when split across GPU & CPU or multiple GPUs
  • Statically link C++ libraries on Windows for better compatibility

Full Changelog: ollama/ollama@v0.11.2...v0.11.3

v0.11.2

Compare Source

What's Changed

  • Fix crash in gpt-oss when using KV cache quantization
  • Fix gpt-oss bug with "currentDate" not defined

Full Changelog: ollama/ollama@v0.11.1...v0.11.2

v0.11.1

Compare Source

v0.11.0

Compare Source


Welcome OpenAI's gpt-oss models

Ollama partners with OpenAI to bring its latest state-of-the-art open weight models to Ollama. The two models, 20B and 120B, bring a whole new local chat experience, and are designed for powerful reasoning, agentic tasks, and versatile developer use cases.

Feature highlights
  • Agentic capabilities: Use the models’ native capabilities for function calling, web browsing (Ollama is providing a built-in web search that can be optionally enabled to augment the model with the latest information), python tool calls, and structured outputs.
  • Full chain-of-thought: Gain complete access to the model's reasoning process, facilitating easier debugging and increased trust in outputs.
  • Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
  • Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.
  • Permissive Apache 2.0 license: Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
Quantization - MXFP4 format

OpenAI utilizes quantization to reduce the memory footprint of the gpt-oss models. The models are post-trained with quantization of the mixture-of-experts (MoE) weights to MXFP4 format, where the weights are quantized to 4.25 bits per parameter. The MoE weights are responsible for 90+% of the total parameter count, and quantizing these to MXFP4 enables the smaller model to run on systems with as little as 16GB memory, and the larger model to fit on a single 80GB GPU.
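As a rough sanity check on those figures, total model size can be estimated from the parameter count and the quantization split. The 90% MoE fraction comes from the text above; treating the remaining weights as 16-bit is an assumption for illustration, so the result is a ballpark, not the actual download size:

```python
def approx_size_gb(params: float, moe_fraction: float = 0.9,
                   moe_bits: float = 4.25, other_bits: float = 16.0) -> float:
    """Rough model size in GB for a given quantization split (illustrative only)."""
    total_bits = params * (moe_fraction * moe_bits + (1 - moe_fraction) * other_bits)
    return total_bits / 8 / 1e9

print(round(approx_size_gb(20e9), 1))   # 20B model
print(round(approx_size_gb(120e9), 1))  # 120B model
```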

Ollama supports the MXFP4 format natively without additional quantizations or conversions. New kernels were developed for Ollama's new engine to support the MXFP4 format.

Ollama collaborated with OpenAI to benchmark against their reference implementations to ensure Ollama’s implementations have the same quality.

Get started

You can get started by downloading the latest Ollama version (v0.11).

The model can be downloaded directly in Ollama’s new app or via the terminal:

ollama run gpt-oss:20b

ollama run gpt-oss:120b

What's Changed

Full Changelog: ollama/ollama@v0.10.1...v0.11.0

v0.10.1

Compare Source

What's Changed

  • Fixed unicode character input for Japanese and other languages in Ollama's new app
  • Fixed AMD download URL in the logs for ollama serve

New Contributors

Full Changelog: ollama/ollama@v0.10.0...v0.10.1

v0.10.0

Compare Source

Ollama's new app

Ollama's new app is available for macOS and Windows: Download Ollama


What's Changed

  • ollama ps will now show the context length of loaded models
  • Improved performance in gemma3n models by 2-3x
  • Parallel request processing now defaults to 1. For more details, see the FAQ
  • Fixed issue where tool calling would not work correctly with granite3.3 and mistral-nemo models
  • Fixed issue where Ollama's tool calling would not work correctly if a tool's name was part of another one, such as add and get_address
  • Improved performance when using multiple GPUs by 10-30%
  • Ollama's OpenAI-compatible API will now support WebP images
  • Fixed issue where ollama show would report an error
  • ollama run will more gracefully display errors

New Contributors

Full Changelog: ollama/ollama@v0.9.6...v0.10.0

v0.9.6

Compare Source

What's Changed

  • Fixed styling issue in launch screen
  • tool_name can now be provided in messages with "role": "tool" using the /api/chat endpoint

New Contributors

Full Changelog: ollama/ollama@v0.9.5...v0.9.6-rc0

v0.9.5

Compare Source

Updates to Ollama for macOS and Windows

A new version of Ollama's macOS and Windows applications is now available. New improvements to the apps will be introduced over the coming releases.
New features
Expose Ollama on the network

Ollama can now be exposed on the network, allowing others to access Ollama on other devices or even over the internet. This is useful for having Ollama running on a powerful Mac, PC or Linux computer while making it accessible to less powerful devices.

Model directory

The directory in which models are stored can now be modified! This allows models to be stored on external hard disks or in directories other than the default.

Smaller footprint and faster starting on macOS

The macOS app is now a native application and starts much faster while requiring a much smaller installation.

Additional changes in 0.9.5

  • Fixed issue where the ollama CLI would not be installed by Ollama on macOS on startup
  • Fixed issue where files in ollama-darwin.tgz were not notarized
  • Add NativeMind to Community Integrations by @​xukecheng in #​11242
  • Ollama for macOS now requires version 12 (Monterey) or newer

New Contributors

v0.9.4

Compare Source

Updates to Ollama for macOS and Windows

A new version of Ollama's macOS and Windows applications is now available. New improvements to the apps will be introduced over the coming releases.
New features
Expose Ollama on the network

Ollama can now be exposed on the network, allowing others to access Ollama on other devices or even over the internet. This is useful for having Ollama running on a powerful Mac, PC or Linux computer while making it accessible to less powerful devices.

Model directory

The directory in which models are stored can now be modified! This allows models to be stored on external hard disks or in directories other than the default.

Smaller footprint and faster starting on macOS

The macOS app is now a native application and starts much faster while requiring a much smaller installation.

What's Changed

  • Reduced download size and startup time for Ollama on macOS
  • Tool calling with empty parameters will now work correctly
  • Fixed issue when quantizing models with the Gemma 3n architecture
  • Ollama for macOS should no longer ask for root privileges when updating unless required
  • Ollama for macOS now requires version 12 (Monterey) or newer

Full Changelog: ollama/ollama@v0.9.3...v0.9.4

v0.9.3

Compare Source

Gemma 3n


Ollama now supports Gemma 3n.

Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets or phones. These models were trained with data in over 140 spoken languages.

Effective 2B
ollama run gemma3n:e2b
Effective 4B
ollama run gemma3n:e4b

What's Changed

  • Fixed issue where errors would not be properly reported on Apple Silicon Macs
  • Ollama will now limit context length to what the model was trained against to avoid strange overflow behavior

New Contributors

Full Changelog: ollama/ollama@v0.9.2...v0.9.3

v0.9.2

Compare Source

What's Changed

  • Fixed issue where tool calls without parameters would not be returned correctly
  • Fixed "does not support generate" errors
  • Fixed issue where some special tokens would not be tokenized properly for some model architectures

New Contributors

Full Changelog: ollama/ollama@v0.9.1...v0.9.2

v0.9.1

Compare Source

Tool calling improvements

New tool calling support

The following models now support tool calling:

Tool calling reliability has also been improved for the following models:

To re-download the models, use ollama pull.

New Ollama for macOS and Windows preview

A new version of Ollama's macOS and Windows applications is available to test for early feedback. New improvements to the apps will be introduced over the coming releases.

If you have feedback, please create an issue on GitHub with the app label. These apps will automatically update themselves to future versions of Ollama, so you may have to redownload new preview versions in the future.

New features
Expose Ollama on the network

Ollama can now be exposed on the network, allowing others to access Ollama on other devices or even over the internet. This is useful for having Ollama running on a powerful Mac, PC or Linux computer while making it accessible to less powerful devices.

Allow local browser access

Enabling this allows websites to access your local installation of Ollama. This is handy for developing browser-based applications using Ollama's JavaScript library.

Model directory

The directory in which models are stored can now be modified! This allows models to be stored on external hard disks or in directories other than the default.

Smaller footprint and faster starting on macOS

The macOS app is now a native application and starts much faster while requiring a much smaller installation.

What's Changed

  • Magistral now supports disabling thinking mode. Note: it is also recommended to change the system prompt when doing so.
  • Error messages that previously showed POST predict will now be more informative
  • Improved tool calling reliability for some models
  • Fixed issue on Windows where ollama run would not start Ollama automatically

New Contributors

Full Changelog: ollama/ollama@v0.9.0...v0.9.1

v0.9.0

Compare Source

ollama thinking

New models

  • DeepSeek-R1-0528: DeepSeek-R1 has received a minor version upgrade to DeepSeek-R1-0528 for the 8 billion parameter distilled model and the full 671 billion parameter model. In this update, DeepSeek R1 has significantly improved its reasoning and inference capabilities.

Thinking

Ollama now has the ability to enable or disable thinking. This gives users the flexibility to choose the model’s thinking behavior for different applications and use cases.

When thinking is enabled, the output will separate the model’s thinking from the model’s output. When thinking is disabled, the model will not think and directly output the content.

Models that support thinking:

When running a model that supports thinking, Ollama will now display the model's thoughts:

% ollama run deepseek-r1
>>> How many Rs are in strawberry
Thinking...
First, I need to understand what the question is asking. It's asking how many letters 'R' are present in the word "strawberry."

Next, I'll examine each letter in the word individually.

I'll start from the beginning and count every occurrence of the letter 'R.'

After reviewing all the letters, I determine that there are three instances where the letter 'R' appears in the word "strawberry."
...done thinking.

There are three **Rs** in the word **"strawberry"**.

In Ollama's API, a model's thinking is now returned as a separate thinking field for easy parsing:

{
  "message": {
    "role": "assistant",
    "thinking": "First, I need to understand what the question is asking. It's asking how many letters 'R' are present in the word \"strawberry\"...",
    "content": "There are **3** instances of the letter **R** in the word **\"strawberry.\"**"
  }
}
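Because the thinking trace and the final answer arrive in separate fields, a client can handle each independently. A minimal sketch using only Python's standard library, with the sample payload abbreviated for illustration:

```python
import json

# A response shaped like the example above (abbreviated).
raw = json.dumps({
    "message": {
        "role": "assistant",
        "thinking": "First, I need to understand what the question is asking...",
        "content": "There are **3** instances of the letter **R** in the word \"strawberry.\"",
    }
})

message = json.loads(raw)["message"]

# Read the trace and the answer separately, so either can be
# displayed, logged, or discarded on its own.
thinking = message.get("thinking", "")
answer = message["content"]

print(answer)
```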
Turning thinking on and off

In the API, thinking can be enabled or disabled on a per-request basis.
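As a hedged sketch of what such a request body might look like, the snippet below assumes the chat endpoint accepts a boolean think field alongside the usual model and messages; the helper name and model are illustrative.

```python
import json

def chat_request(model: str, prompt: str, think: bool) -> bytes:
    """Build a JSON body for a chat request, toggling thinking
    via an assumed boolean `think` field."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # True: thinking is returned in a separate field;
        # False: the model answers directly without a trace.
        "think": think,
    }
    return json.dumps(body).encode("utf-8")

payload = chat_request("deepseek-r1", "How many Rs are in strawberry", think=False)
print(json.loads(payload)["think"])
```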


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 71e1230 to 4bf37c1 Compare August 31, 2024 20:07
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.8 fix(deps): update module github.com/ollama/ollama to v0.3.9 Aug 31, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 4bf37c1 to 3937772 Compare September 8, 2024 09:37
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.9 fix(deps): update module github.com/ollama/ollama to v0.3.10 Sep 8, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 3937772 to 4addd23 Compare September 18, 2024 08:12
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.10 fix(deps): update module github.com/ollama/ollama to v0.3.11 Sep 18, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 4addd23 to 766eb34 Compare September 25, 2024 07:22
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.11 fix(deps): update module github.com/ollama/ollama to v0.3.12 Sep 25, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 766eb34 to fb45a8c Compare October 11, 2024 01:49
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.12 fix(deps): update module github.com/ollama/ollama to v0.3.13 Oct 11, 2024
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.13 fix(deps): update module github.com/ollama/ollama to v0.3.14 Oct 21, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from fb45a8c to c081f43 Compare October 21, 2024 03:49
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from c081f43 to 65d7b07 Compare November 6, 2024 17:22
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.3.14 fix(deps): update module github.com/ollama/ollama to v0.4.0 Nov 6, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 65d7b07 to 01d0591 Compare November 9, 2024 01:14
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.0 fix(deps): update module github.com/ollama/ollama to v0.4.1 Nov 9, 2024
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.1 fix(deps): update module github.com/ollama/ollama to v0.4.2 Nov 15, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 01d0591 to 3104147 Compare November 15, 2024 22:19
@renovate

renovate bot commented Nov 15, 2024

ℹ Artifact update notice

File name: go.mod

In order to perform the update(s) described in the table above, Renovate ran the go get command, which resulted in the following additional change(s):

  • 6 additional dependencies were updated
  • The go directive was updated for compatibility reasons

Details:

Package Change
go 1.23.0 -> 1.24.1
golang.org/x/crypto v0.26.0 -> v0.33.0
golang.org/x/exp v0.0.0-20240808152545-0cdaa3abc0fa -> v0.0.0-20250218142911-aa4b98e5adaa
golang.org/x/sync v0.8.0 -> v0.11.0
golang.org/x/net v0.27.0 -> v0.35.0
golang.org/x/sys v0.23.0 -> v0.30.0
golang.org/x/text v0.17.0 -> v0.22.0

@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 3104147 to c626610 Compare November 21, 2024 19:44
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.2 fix(deps): update module github.com/ollama/ollama to v0.4.3 Nov 21, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from c626610 to ede6e14 Compare November 23, 2024 00:44
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.3 fix(deps): update module github.com/ollama/ollama to v0.4.4 Nov 23, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from ede6e14 to 1d65f7d Compare November 26, 2024 01:26
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.4 fix(deps): update module github.com/ollama/ollama to v0.4.5 Nov 26, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 1d65f7d to 7d6c10a Compare November 28, 2024 01:51
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.5 fix(deps): update module github.com/ollama/ollama to v0.4.6 Nov 28, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 7d6c10a to 73270f8 Compare November 30, 2024 23:02
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.4.6 fix(deps): update module github.com/ollama/ollama to v0.4.7 Nov 30, 2024
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 73270f8 to 53544b1 Compare December 6, 2024 02:29
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.1 fix(deps): update module github.com/ollama/ollama to v0.12.2 Sep 25, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 45f5c3c to f6c7b08 Compare September 26, 2025 02:04
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.2 fix(deps): update module github.com/ollama/ollama to v0.12.3 Sep 26, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from f6c7b08 to 4ca909b Compare October 10, 2025 01:48
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.3 fix(deps): update module github.com/ollama/ollama to v0.12.4 Oct 10, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 4ca909b to 364eca0 Compare October 10, 2025 21:38
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.4 fix(deps): update module github.com/ollama/ollama to v0.12.5 Oct 10, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 364eca0 to ac548bb Compare October 16, 2025 23:06
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.5 fix(deps): update module github.com/ollama/ollama to v0.12.6 Oct 16, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from ac548bb to 43a2475 Compare October 30, 2025 00:11
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.6 fix(deps): update module github.com/ollama/ollama to v0.12.7 Oct 30, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 43a2475 to e77fe56 Compare October 31, 2025 05:42
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.7 fix(deps): update module github.com/ollama/ollama to v0.12.8 Oct 31, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from e77fe56 to 45f39dd Compare November 1, 2025 08:54
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.8 fix(deps): update module github.com/ollama/ollama to v0.12.9 Nov 1, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 45f39dd to 9dabc95 Compare November 6, 2025 21:05
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.9 fix(deps): update module github.com/ollama/ollama to v0.12.10 Nov 6, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 9dabc95 to 87ad9b5 Compare November 14, 2025 02:08
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.10 fix(deps): update module github.com/ollama/ollama to v0.12.11 Nov 14, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 87ad9b5 to 7da1e99 Compare November 20, 2025 00:31
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.12.11 fix(deps): update module github.com/ollama/ollama to v0.13.0 Nov 20, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 7da1e99 to 8e749ce Compare December 2, 2025 21:40
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.13.0 fix(deps): update module github.com/ollama/ollama to v0.13.1 Dec 2, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 8e749ce to af598a0 Compare December 8, 2025 19:11
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.13.1 fix(deps): update module github.com/ollama/ollama to v0.13.2 Dec 8, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from af598a0 to 28a4562 Compare December 12, 2025 03:53
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.13.2 fix(deps): update module github.com/ollama/ollama to v0.13.3 Dec 12, 2025
@renovate renovate bot force-pushed the renovate/github.com-ollama-ollama-0.x branch from 28a4562 to 73e657a Compare December 16, 2025 05:38
@renovate renovate bot changed the title fix(deps): update module github.com/ollama/ollama to v0.13.3 fix(deps): update module github.com/ollama/ollama to v0.13.4 Dec 16, 2025