Pull requests: ggml-org/llama.cpp
Pull requests list
docs : update speculative decoding parameters after refactor (#22397)
documentation
Improvements or additions to documentation
server : fix image blocks in tool_result being dropped during Anthropic→OpenAI conversion
examples
server
#22536
opened Apr 30, 2026 by
quei4r
2 tasks done
ci : bump ty to 0.0.33
devops
improvements to build systems and github actions
python
python script changes
script
Script related
#22535
opened Apr 30, 2026 by
CISC
Member
fix: CUDA device PCI bus ID de-dupe OOMing (ignoring other 3 gpus entirely)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#22533
opened Apr 29, 2026 by
lucyknada
ggml-rpc: serialize send_rpc_cmd per socket to fix concurrent-send race
ggml
changes relating to the ggml tensor library for machine learning
#22530
opened Apr 29, 2026 by
dingleberry61
[Model] Support MiniCPM-V 4.6
documentation
Improvements or additions to documentation
examples
python
python script changes
#22529
opened Apr 29, 2026 by
tc-mb
Contributor
[mtmd] Add PaliGemma2 support (SigLIP + Gemma2 backbone)
examples
python
python script changes
#22528
opened Apr 29, 2026 by
shichiachi3-cyber
sycl: Add optional USM system allocations
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22526
opened Apr 29, 2026 by
ifdu
ggml-cpu: optimize ggml_gemm_q4_K_8x8_q8_K interleaving/staging for AVX-512 (and AVX2)
ggml
changes relating to the ggml tensor library for machine learning
#22525
opened Apr 29, 2026 by
HyeongiJeon
Programmatic Dependent Launch (PDL) for more performance on newer NVIDIA GPUs (Hopper+)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
Update build.md with commands for nvidia-smi in Override Compute Capability Specifications
documentation
Improvements or additions to documentation
#22519
opened Apr 29, 2026 by
DoctorD90
ggml-metal: implement async 2D tensor copy functions
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
vulkan: add get/set tensor 2d functions
AMD ZenDNN
Issues related to the AMD ZenDNN backend
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
IBM zDNN
issues specific to IBM zDNN Accelerator
Nvidia GPU
Issues specific to Nvidia GPUs
OpenCL
Issues specific to the OpenCL backend
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
Vulkan
Issues specific to the Vulkan backend
WebGPU
#22514
opened Apr 29, 2026 by
0cc4m
Contributor
nix: added dev shells for more backends and updated flake.lock
devops
improvements to build systems and github actions
nix
Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
#22509
opened Apr 29, 2026 by
gpayer
Stop qwen3.6 from outputting empty <think> blocks
#22507
opened Apr 29, 2026 by
michaelw9999
Contributor
ggml-cpu: add RVV implementation for q1_0 x q8_0 vec dot
ggml
changes relating to the ggml tensor library for machine learning
#22500
opened Apr 29, 2026 by
velonica0
Update llama-mmap to work with 32-bit emscripten
#22497
opened Apr 29, 2026 by
reeselevine
Contributor
examples : add llama-profiler-cpu/gpu for op roofline measurement
examples
#22495
opened Apr 29, 2026 by
aukarande
feat: Add Mimo v2.5 model support
model
Model specific
python
python script changes
#22493
opened Apr 29, 2026 by
AesSedai
Contributor
gguf-py: shrink layers or embedding vectors for reducing model size.
python
python script changes
#22485
opened Apr 28, 2026 by
tiehexue
cmake: Assign the include path for ggml.h to the ggml::ggml target
ggml
changes relating to the ggml tensor library for machine learning
#22482
opened Apr 28, 2026 by
SchaichAlonso
Draft
server: include api_prefix in public_endpoints set
examples
server
#22475
opened Apr 28, 2026 by
AlexAlDantas