fix(deps): update module github.com/ollama/ollama to v0.13.4 #66
Open

renovate wants to merge 1 commit into main from renovate/github.com-ollama-ollama-0.x
Conversation
renovate (Contributor, Author)

ℹ Artifact update notice
File name: go.mod
In order to perform the update(s) described in the table above, Renovate ran the
This PR contains the following updates:
github.com/ollama/ollama: v0.3.6 -> v0.13.4

Release Notes

ollama/ollama (github.com/ollama/ollama)

v0.13.4
New Models
What's Changed
New Contributors
Full Changelog: ollama/ollama@v0.13.3...v0.13.4-rc0
v0.13.3
New models
What's Changed
- /api/embed and /v1/embeddings

Full Changelog: ollama/ollama@v0.13.2...v0.13.3

v0.13.2
New models
What's Changed
- mistral-3, gemma3, qwen3-vl and more. This improves memory utilization and performance when providing images as input.
- Fixed issue where deepseek-v3.1 would always think even when thinking is disabled in Ollama's app

New Contributors
Full Changelog: ollama/ollama@v0.13.1...v0.13.2
v0.13.1
New models
What's Changed
- nomic-embed-text will now use Ollama's engine by default
- cogito-v2.1
- Unmarshal: errors

New Contributors
Full Changelog: ollama/ollama@v0.13.0...v0.13.1
v0.13.0
New models
DeepSeek-OCR
DeepSeek-OCR is now available on Ollama.
New bench tool

Ollama's GitHub repo now includes a bench tool that can be used to test model performance. For the time being this is a separate tool that can be built in the Ollama GitHub repository. First, install Go. Then from the root of the Ollama repository run:
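A plausible build invocation, assuming the tool lives in a bench directory of the repository; the path is an assumption, so check the tool's documentation for the actual command:

```bash
# Hypothetical path; the bench tool's location in the repo may differ.
go build -o bench ./bench
```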
For more information see the tool's documentation
What's Changed
New Contributors
Full Changelog: ollama/ollama@v0.12.11...v0.13.0
v0.12.11
Logprobs
Ollama's API and OpenAI-compatible API now support log probabilities. Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. This is useful for a variety of use cases.
To enable Logprobs, provide "logprobs": true to Ollama's API:
"logprobs"field with the token, log probability and raw bytes (for partial unicode).{ "model": "gemma3", "created_at": "2025-11-14T22:17:56.598562Z", "response": "Okay", "done": false, "logprobs": [ { "token": "Okay", "logprob": -1.3434503078460693, "bytes": [ 79, 107, 97, 121 ] } ] }top_logprobsWhen setting
"top_logprobs", a number of most-likely tokens are also provided, making it possible to introspect alternative tokens. Below is an example request.This will generate a stream of response chunks with the following fields:
{ "model": "gemma3", "created_at": "2025-11-14T22:26:10.466324Z", "response": "The", "done": false, "logprobs": [ { "token": "The", "logprob": -0.8361086845397949, "bytes": [ 84, 104, 101 ], "top_logprobs": [ { "token": "The", "logprob": -0.8361086845397949, "bytes": [ 84, 104, 101 ] }, { "token": "Okay", "logprob": -1.2590975761413574, "bytes": [ 79, 107, 97, 121 ] }, { "token": "That", "logprob": -1.2686877250671387, "bytes": [ 84, 104, 97, 116 ] } ] } ] }Special thanks
Thank you @baptistejamin for adding Logprobs to Ollama's API.
Vulkan support (opt-in)
Ollama 0.12.11 includes support for Vulkan acceleration. Vulkan brings support for a broad range of GPUs from AMD, Intel, and iGPUs. Vulkan support is not yet enabled by default, and requires opting in by running Ollama with a custom environment variable:
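```bash
OLLAMA_VULKAN=1 ollama serve
```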
On PowerShell, use:
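```powershell
$env:OLLAMA_VULKAN = "1"
ollama serve
```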
For issues or feedback on using Vulkan with Ollama, create an issue labelled Vulkan and make sure to include server logs where possible to aid in debugging.
What's Changed
"required"field in tool definitions will now be omitted if not specified"tool_call_id"would be omitted when using the OpenAI-compatible API.ollama createwould import data from bothconsolidated.safetensorsand other safetensor files.OLLAMA_VULKAN=1. For example:OLLAMA_VULKAN=1 ollama serveNew Contributors
Full Changelog: ollama/ollama@v0.12.10...v0.12.11
v0.12.10
ollama run now works with embedding models

ollama run can now run embedding models to generate vector embeddings from text. Content can also be provided to ollama run via standard input:
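A short sketch of both invocations, assuming an embedding model such as embeddinggemma has already been pulled:

```bash
# Generate an embedding directly from an argument:
ollama run embeddinggemma "The quick brown fox"

# Or pipe content in via standard input:
echo "The quick brown fox" | ollama run embeddinggemma
```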
What's Changed

- qwen3-vl:235b and qwen3-vl:235b-instruct
- /api/chat API
- ollama run now works with embedding models

New Contributors
Full Changelog: ollama/ollama@v0.12.9...v0.12.10
v0.12.9
What's Changed
Full Changelog: ollama/ollama@v0.12.8...v0.12.9
v0.12.8
What's Changed
- qwen3-vl performance improvements, including flash attention support by default
- qwen3-vl will now output less leading whitespace in the response when thinking
- Fixed issue where deepseek-v3.1 thinking could not be disabled in Ollama's new app
- Fixed issue where qwen3-vl would fail to interpret images with transparent backgrounds
- ollama rm

New Contributors
Full Changelog: ollama/ollama@v0.12.7...v0.12.8
v0.12.7
New models
Add files and adjust thinking levels in Ollama's new app
Ollama's new app now includes a way to add one or many files when prompting the model.

For better responses, thinking levels can now be adjusted for the gpt-oss models.
New API documentation
New API documentation is available for Ollama's API: https://docs.ollama.com/api
What's Changed
- embeddinggemma
- The /v1/embeddings endpoint now supports the encoding_format parameter
- {"name": name, "arguments": args} (thanks @rick-github!)
- Fixed issue where FROM <model> would not inherit RENDERER or PARSER commands

New Contributors
Full Changelog: ollama/ollama@v0.12.6...v0.12.7
v0.12.6
What's Changed
- Fixed issue where qwen3-coder would act in raw mode when using /api/generate or ollama run qwen3-coder <prompt>
- Fixed qwen3-embedding providing invalid results
- num_gpu is set
- Fixed issue where tool_index with a value of 0 would not be sent to the model

Experimental Vulkan Support
Experimental support for Vulkan is now available when you build locally from source. This enables additional GPUs from AMD and Intel that are not currently supported by Ollama. To build locally, install the Vulkan SDK and set VULKAN_SDK in your environment, then follow the developer instructions. In a future release, Vulkan support will be included in the binary release as well. Please file issues if you run into any problems.
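For example, on Linux the variable might be set as follows before building; the SDK path here is an assumption and varies by platform and SDK version:

```bash
export VULKAN_SDK=$HOME/VulkanSDK/1.3.290.0/x86_64
```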
New Contributors
Full Changelog: ollama/ollama@v0.12.5...v0.12.6
v0.12.5
What's Changed
/api/chatAPI"think": falsewould show an error instead of being silently ignoreddeepseek-r1output issuesNew Contributors
Full Changelog: ollama/ollama@v0.12.4...v0.12.5-rc0
v0.12.4
What's Changed
- Fixed issue where keep_alive in the API would accept different values for the /api/chat and /api/generate endpoints
- qwen3-coder
- OLLAMA_FLASH_ATTENTION can now be overridden to 0 for models that have flash attention enabled by default

New Contributors
Full Changelog: ollama/ollama@v0.12.3...v0.12.4-rc3
v0.12.3
New models
DeepSeek-V3.1-Terminus: DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode. It delivers more stable & reliable outputs across benchmarks compared to the previous version.
Run on Ollama's cloud:
Run locally (requires 500GB+ of VRAM)
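The run commands themselves plausibly look like the following, assuming Ollama's usual -cloud tag convention; the exact tags are assumptions:

```bash
# Run on Ollama's cloud:
ollama run deepseek-v3.1:671b-cloud

# Run locally (requires 500GB+ of VRAM):
ollama run deepseek-v3.1:671b
```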
Kimi-K2-Instruct-0905: Kimi K2-Instruct-0905 is the latest, most capable version of Kimi K2. It is a state-of-the-art mixture-of-experts (MoE) language model, featuring 32 billion activated parameters and a total of 1 trillion parameters.
What's Changed
- ollama push will now provide a URL to follow to sign in
- Fixed issue where /load would crash

New Contributors
Full Changelog: ollama/ollama@v0.12.2...v0.12.3
v0.12.2
Web search
A new web search API is now available in Ollama. Ollama provides a generous free tier of web searches for individuals to use, and higher rate limits are available via Ollama’s cloud. This web search capability can augment models with the latest information from the web to reduce hallucinations and improve accuracy.
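A sketch of a web search call; the endpoint path, header, and payload shape are assumptions based on Ollama's hosted API conventions, with a placeholder API key:

```bash
# Hypothetical endpoint and payload; consult Ollama's web search docs for the actual API.
curl https://ollama.com/api/web_search \
  -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -d '{"query": "latest Ollama release"}'
```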
What's Changed
Full Changelog: ollama/ollama@v0.12.1...v0.12.2
v0.12.1
New models
What's Changed
- Fixed issue where & characters in Qwen3-Coder would not be parsed correctly when function calling
- Fixed issue where ollama signin would not work properly on Linux

Full Changelog: ollama/ollama@v0.12.0...v0.12.1
v0.12.0
Cloud models
Cloud models are now available in preview, allowing you to run a group of larger models with fast, datacenter-grade hardware.
To run a cloud model, use:
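A sketch, assuming the -cloud tag naming used for cloud models; the specific model tag is an example:

```bash
ollama run gpt-oss:120b-cloud
```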
What's Changed
- ollama create
- <think> if provided in the prompt for /api/generate by @rick-github

New Contributors
Full Changelog: ollama/ollama@v0.11.11...v0.12.0
v0.11.11
What's Changed
- Added dimensions field to embed requests by @mxyng in #12242 (see the sketch below)

New Contributors
Full Changelog: ollama/ollama@v0.11.10...v0.11.11
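A sketch of an embed request using the new dimensions field from the change above, assuming a local server; the model name and dimension value are examples:

```bash
curl http://localhost:11434/api/embed -d '{
  "model": "embeddinggemma",
  "input": "Why is the sky blue?",
  "dimensions": 256
}'
```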
v0.11.10
New models
What's Changed
Full Changelog: ollama/ollama@v0.11.9...v0.11.10
v0.11.9
What's Changed
New Contributors
Full Changelog: ollama/ollama@v0.11.8...v0.11.9-rc0
v0.11.8
What's Changed
- gpt-oss now has flash attention enabled by default for systems that support it
- gpt-oss

Full Changelog: ollama/ollama@v0.11.7...v0.11.8

v0.11.7
DeepSeek-V3.1
DeepSeek-V3.1 is now available to run via Ollama.
This model supports hybrid thinking, meaning thinking can be enabled or disabled by setting think in Ollama's API:
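A minimal sketch of toggling thinking via the API, assuming a local server; the model tag is an example:

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-v3.1",
  "messages": [{"role": "user", "content": "Why is the sky blue?"}],
  "think": true
}'
```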
In Ollama's CLI, thinking can be enabled or disabled by running the /set think or /set nothink commands.

Turbo (in preview)
DeepSeek-V3.1 has over 671B parameters, and so a large amount of VRAM is required to run it. Ollama's Turbo mode (in preview) provides access to powerful hardware in the cloud you can use to run the model.
Turbo via Ollama's app
- Select deepseek-v3.1:671b from the model selector

Turbo via Ollama's CLI and libraries
For instructions on using Turbo with Ollama's Python and JavaScript library, see the docs
What's Changed
- <think> tag (e.g. DeepSeek-V3.1)
- <think> tag from a model
- Fixed issue where { or } would not be parsed correctly

New Contributors
Full Changelog: ollama/ollama@v0.11.6...v0.11.7
v0.11.6
What's Changed
Full Changelog: ollama/ollama@v0.11.5...v0.11.6
v0.11.5
What's Changed
- gpt-oss models
- New memory estimates can be enabled with OLLAMA_NEW_ESTIMATES=1 ollama serve and will soon be enabled by default
- OLLAMA_FLASH_ATTENTION=1 will also enable flash attention for pure-CPU models
- reasoning_effort

New Contributors
Full Changelog: ollama/ollama@v0.11.4...v0.11.5
v0.11.4
What's Changed
New Contributors
Full Changelog: ollama/ollama@v0.11.3...v0.11.4
v0.11.3
What's Changed
- Fixed issue where gpt-oss would consume too much VRAM when split across GPU & CPU or multiple GPUs

Full Changelog: ollama/ollama@v0.11.2...v0.11.3

v0.11.2
What's Changed
Full Changelog: ollama/ollama@v0.11.1...v0.11.2
v0.11.1

v0.11.0
Welcome OpenAI's gpt-oss models
Ollama partners with OpenAI to bring its latest state-of-the-art open weight models to Ollama. The two models, 20B and 120B, bring a whole new local chat experience, and are designed for powerful reasoning, agentic tasks, and versatile developer use cases.
Feature highlights
Quantization - MXFP4 format
OpenAI utilizes quantization to reduce the memory footprint of the gpt-oss models. The models are post-trained with quantization of the mixture-of-experts (MoE) weights to MXFP4 format, where the weights are quantized to 4.25 bits per parameter. The MoE weights are responsible for 90+% of the total parameter count, and quantizing these to MXFP4 enables the smaller model to run on systems with as little as 16GB memory, and the larger model to fit on a single 80GB GPU.
Ollama supports the MXFP4 format natively, without additional quantization or conversion. New kernels were developed for Ollama's new engine to support the MXFP4 format.
Ollama collaborated with OpenAI to benchmark against their reference implementations to ensure Ollama’s implementations have the same quality.
Get started
You can get started by downloading the latest Ollama version (v0.11)
The model can be downloaded directly in Ollama’s new app or via the terminal:
```bash
ollama run gpt-oss:20b
ollama run gpt-oss:120b
```

What's Changed
Full Changelog: ollama/ollama@v0.10.1...v0.11.0
v0.10.1
What's Changed
- ollama serve

New Contributors
Full Changelog: ollama/ollama@v0.10.0...v0.10.1
v0.10.0
Ollama's new app
Ollama's new app is available for macOS and Windows: Download Ollama
What's Changed
- ollama ps will now show the context length of loaded models
- Improved performance of gemma3n models by 2-3x
- granite3.3 and mistral-nemo models
- add and get_address
- Fixed issue where ollama show would report an error
- ollama run will more gracefully display errors

New Contributors
Full Changelog: ollama/ollama@v0.9.6...v0.10.0
v0.9.6
What's Changed
- tool_name can now be provided in messages with "role": "tool" using the /api/chat endpoint (see the sketch below)

New Contributors
Full Changelog: ollama/ollama@v0.9.5...v0.9.6-rc0
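A sketch of the tool_name field noted above in a /api/chat request; the model, tool name, and content are hypothetical:

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3",
  "messages": [
    {"role": "user", "content": "What is the weather in Toronto?"},
    {"role": "tool", "tool_name": "get_weather", "content": "11 degrees and sunny"}
  ]
}'
```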
v0.9.5
Updates to Ollama for macOS and Windows
New versions of Ollama's macOS and Windows applications are now available. New improvements to the apps will be introduced over the coming releases:
New features
Expose Ollama on the network
Ollama can now be exposed on the network, allowing others to access Ollama on other devices or even over the internet. This is useful for having Ollama running on a powerful Mac, PC or Linux computer while making it accessible to less powerful devices.
Model directory
The directory in which models are stored can now be modified! This allows models to be stored on external hard disks or in directories other than the default.
Smaller footprint and faster starting on macOS
The macOS app is now a native application and starts much faster while requiring a much smaller installation.
Additional changes in 0.9.5
- Fixed issue where the ollama CLI would not be installed by Ollama on macOS on startup
- Fixed issue where binaries in ollama-darwin.tgz were not notarized

New Contributors
v0.9.4
Updates to Ollama for macOS and Windows
New versions of Ollama's macOS and Windows applications are now available. New improvements to the apps will be introduced over the coming releases:
New features
Expose Ollama on the network
Ollama can now be exposed on the network, allowing others to access Ollama on other devices or even over the internet. This is useful for having Ollama running on a powerful Mac, PC or Linux computer while making it accessible to less powerful devices.
Model directory
The directory in which models are stored can now be modified! This allows models to be stored on external hard disks or in directories other than the default.
Smaller footprint and faster starting on macOS
The macOS app is now a native application and starts much faster while requiring a much smaller installation.
What's Changed
Full Changelog: ollama/ollama@v0.9.3...v0.9.4
v0.9.3
Gemma 3n
Ollama now supports Gemma 3n.
Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets or phones. These models were trained with data in over 140 spoken languages.
Effective 2B
Effective 4B
What's Changed
New Contributors
Full Changelog: ollama/ollama@v0.9.2...v0.9.3
v0.9.2
What's Changed
- Fixed "does not support generate" errors

New Contributors
Full Changelog: ollama/ollama@v0.9.1...v0.9.2
v0.9.1
Tool calling improvements
New tool calling support
The following models now support tool calling:
Tool calling reliability has also been improved for the following models:
To re-download the models, use ollama pull.

New Ollama for macOS and Windows preview
New versions of Ollama's macOS and Windows applications are available to test for early feedback. New improvements to the apps will be introduced over the coming releases:
If you have feedback, please create an issue on GitHub with the app label. These apps will automatically update themselves to future versions of Ollama, so you may have to redownload new preview versions in the future.

New features
Expose Ollama on the network
Ollama can now be exposed on the network, allowing others to access Ollama on other devices or even over the internet. This is useful for having Ollama running on a powerful Mac, PC or Linux computer while making it accessible to less powerful devices.
Allow local browser access
Enabling this allows websites to access your local installation of Ollama. This is handy for developing browser-based applications using Ollama's JavaScript library.
Model directory
The directory in which models are stored can now be modified! This allows models to be stored on external hard disks or in directories other than the default.
Smaller footprint and faster starting on macOS
The macOS app is now a native application and starts much faster while requiring a much smaller installation.
What's Changed
- POST predict will now be more informative
- Fixed issue where ollama run would not start Ollama automatically

New Contributors
Full Changelog: ollama/ollama@v0.9.0...v0.9.1
v0.9.0
New models
Thinking
Ollama now has the ability to enable or disable thinking. This gives users the flexibility to choose the model’s thinking behavior for different applications and use cases.
When thinking is enabled, the output will separate the model’s thinking from the model’s output. When thinking is disabled, the model will not think and directly output the content.
Models that support thinking:
When running a model that supports thinking, Ollama will now display the model's thoughts.

In Ollama's API, a model's thinking is now returned as a separate thinking field for easy parsing.

Turning thinking on and off
In the API, thinking can be enabled or disabled by setting the think parameter:
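A minimal sketch, assuming a local server and a thinking-capable model such as deepseek-r1; thinking is toggled with the boolean think field:

```bash
# Enable thinking; the response will include a separate "thinking" field.
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1",
  "messages": [{"role": "user", "content": "Why is the sky blue?"}],
  "think": true
}'
```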
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.