Releases: Thireus/llama.cpp

b9503

17 Apr 11:11
47d8954

Merge branch 'ggml-org:master' into master

⚠️ CUDA Build Notice

CUDA builds have been downgraded to CUDA 13.1 due to an NVIDIA bug that affects certain quantized models when binaries are compiled with CUDA 13.2.

  • This issue only impacts llama.cpp binaries built with CUDA 13.2 (e.g. previous releases).
  • Your installed CUDA 13.2 drivers are not affected — no downgrade is needed.
  • NVIDIA is currently working on a fix.

Recommended workaround: Use binaries compiled with CUDA 12.8 or CUDA 13.1 for now.

Note: ik_llama.cpp is also affected by this issue.
Read more about it here: Thireus/GGUF-Tool-Suite#71

For reference, CUDA 12.8 supports the Maxwell (5.0) to Hopper (9.0) microarchitectures, while CUDA 13.1 supports Turing (7.5) to Blackwell (12.1).
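
As a rough illustration of the ranges quoted above, the following sketch picks which of the two recommended builds could run on a given GPU. The function name and the table are hypothetical helpers, not part of llama.cpp or its release tooling, and the compute-capability bounds are copied from this notice rather than from NVIDIA's documentation.

```python
# Compute-capability ranges as stated in the release notice (assumed, not verified
# against NVIDIA's own support matrix).
SUPPORTED_RANGES = {
    "12.8": ((5, 0), (9, 0)),   # Maxwell to Hopper
    "13.1": ((7, 5), (12, 1)),  # Turing to Blackwell
}


def compatible_cuda_builds(compute_capability):
    """Return the CUDA build versions whose stated range covers the GPU.

    compute_capability is a (major, minor) tuple, e.g. (8, 6).
    """
    return [
        version
        for version, (lowest, highest) in SUPPORTED_RANGES.items()
        if lowest <= compute_capability <= highest
    ]


if __name__ == "__main__":
    for capability in [(6, 1), (8, 6), (12, 0)]:
        print(capability, compatible_cuda_builds(capability))
```

A GPU that falls in both ranges can use either build; one below 7.5 would need the CUDA 12.8 build, and one above 9.0 would need the CUDA 13.1 build.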

macOS/iOS:

Linux:

Windows:

openEuler:

b9496

17 Apr 04:16
ccaae8d

Merge branch 'ggml-org:master' into master

b9483

16 Apr 11:16
a138a5a

Merge branch 'ggml-org:master' into master

b9480

16 Apr 02:14
6e6e53f

Merge branch 'ggml-org:master' into master

b9476

15 Apr 19:08
cf0ee27

Merge branch 'ggml-org:master' into master

b9468

15 Apr 05:48
d6a5b97

Merge branch 'ggml-org:master' into master

b9466

15 Apr 05:14
e48cebf

Merge branch 'ggml-org:master' into master

b9449

14 Apr 07:50
fd60ed1

Merge branch 'ggml-org:master' into master

b9443

13 Apr 23:58
b1d9d2f

Merge branch 'ggml-org:master' into master

b9435

13 Apr 03:05
3122e85

Merge branch 'ggml-org:master' into master
