Pull requests: ggml-org/llama.cpp

Pull requests list

docs : update speculative decoding parameters after refactor (#22397) documentation Improvements or additions to documentation
#22539 opened Apr 30, 2026 by ggerganov Member Draft
server : validate --tools CLI argument against known tool names
#22538 opened Apr 30, 2026 by ggerganov Member Draft
1 task done
ci : bump ty to 0.0.33 devops improvements to build systems and github actions python python script changes script Script related
#22535 opened Apr 30, 2026 by CISC Member
fix: CUDA device PCI bus ID de-dupe OOMing (ignoring other 3 gpus entirely) ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs
#22533 opened Apr 29, 2026 by lucyknada
ggml-rpc: serialize send_rpc_cmd per socket to fix concurrent-send race ggml changes relating to the ggml tensor library for machine learning
#22530 opened Apr 29, 2026 by dingleberry61
[Model] Support MiniCPM-V 4.6 documentation Improvements or additions to documentation examples python python script changes
#22529 opened Apr 29, 2026 by tc-mb Contributor
[mtmd] Add PaliGemma2 support (SigLIP + Gemma2 backbone) examples python python script changes
#22528 opened Apr 29, 2026 by shichiachi3-cyber
sycl: Add optional USM system allocations documentation Improvements or additions to documentation ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22526 opened Apr 29, 2026 by ifdu
ggml-cpu: optimize ggml_gemm_q4_K_8x8_q8_K interleaving/staging for AVX-512 (and AVX2) ggml changes relating to the ggml tensor library for machine learning
#22525 opened Apr 29, 2026 by HyeongiJeon
Programmatic Dependent Launch (PDL) for more performance on newer NVIDIA GPUs (Hopper+) ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs
#22522 opened Apr 29, 2026 by aendk Contributor Draft
mtmd : add Nemotron 3 Nano Omni support (parakeet) examples python python script changes
#22520 opened Apr 29, 2026 by danbev Member Draft
Update build.md with commands for nvidia-smi in Override Compute Capability Specifications documentation Improvements or additions to documentation
#22519 opened Apr 29, 2026 by DoctorD90
ggml-metal: implement async 2D tensor copy functions Apple Metal https://en.wikipedia.org/wiki/Metal_(API) ggml changes relating to the ggml tensor library for machine learning
#22515 opened Apr 29, 2026 by ggerganov Member Draft
1 task done
vulkan: add get/set tensor 2d functions AMD ZenDNN Issues related to the AMD ZenDNN backend Apple Metal https://en.wikipedia.org/wiki/Metal_(API) Ascend NPU issues specific to Ascend NPUs ggml changes relating to the ggml tensor library for machine learning Hexagon IBM zDNN issues specific to IBM zDNN Accelerator Nvidia GPU Issues specific to Nvidia GPUs OpenCL Issues specific to the OpenCL backend SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Vulkan Issues specific to the Vulkan backend WebGPU
#22514 opened Apr 29, 2026 by 0cc4m Contributor
nix: added dev shells for more backends and updated flake.lock devops improvements to build systems and github actions nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
#22509 opened Apr 29, 2026 by gpayer
Stop qwen3.6 from outputting empty <think> blocks
#22507 opened Apr 29, 2026 by michaelw9999 Contributor
ggml-cpu: add RVV implementation for q1_0 x q8_0 vec dot ggml changes relating to the ggml tensor library for machine learning
#22500 opened Apr 29, 2026 by velonica0
Update llama-mmap to work with 32-bit emscripten
#22497 opened Apr 29, 2026 by reeselevine Contributor
feat: Add Mimo v2.5 model support model Model specific python python script changes
#22493 opened Apr 29, 2026 by AesSedai Contributor
gguf-py: shrink layers or embedding vectors for reducing model size. python python script changes
#22485 opened Apr 28, 2026 by tiehexue
cmake: Assign the include path for ggml.h to the ggml::ggml target ggml changes relating to the ggml tensor library for machine learning
#22482 opened Apr 28, 2026 by SchaichAlonso Draft