Pull requests: ggml-org/llama.cpp

Pull requests list

server: fix SWA prompt reuse boundary condition
#21695 opened Apr 9, 2026 by 1oridevs Draft
3 tasks done
debug: functionality to dump full tensors and compare
Labels: examples, python
#21691 opened Apr 9, 2026 by pwilkin Member
common: mark --split-mode tensor as experimental
#21684 opened Apr 9, 2026 by JohannesGaessler Contributor
Bug-Fix sets an upper VRAM limit for cached ggml_cuda graphs to prevent VRAM memory leaks
Labels: ggml, Nvidia GPU
#21673 opened Apr 9, 2026 by kmorennv
common : add fluidity to the progress bar
#21671 opened Apr 9, 2026 by angt Member
common : fix when loading a cached HF models with unavailable API
#21670 opened Apr 9, 2026 by angt Member
ggml-webgpu: support non-square subgroup matrix configs for Intel GPUs
Labels: ggml, WebGPU
#21669 opened Apr 9, 2026 by SharmaRithik
CUDA: fuse muls
Labels: ggml, Nvidia GPU
#21665 opened Apr 9, 2026 by am17an Contributor
Convert: Fix NemotronH Config Parsing
Labels: python
#21664 opened Apr 9, 2026 by anavp-nvidia Contributor
common: load parser in common_chat_parser_params constructor
#21659 opened Apr 9, 2026 by sacredvoid Contributor
Fix WebUI thinking mode request handling
#21657 opened Apr 9, 2026 by redyuan43
docker: add OCI image labels for version and build date
Labels: devops
#21653 opened Apr 9, 2026 by ssam18 Contributor
ci: add android arm64 build and release
#21647 opened Apr 8, 2026 by ykhrustalev Contributor
convert : force f16 or f32 on step3-vl conv weights
#21646 opened Apr 8, 2026 by CISC Member
ggml-webgpu: Update register tiling matmul to use f32 accumulation
#21644 opened Apr 8, 2026 by reeselevine Contributor
[SYCL] Fix Q8_0 reorder: garbage on 2nd prompt + crash on full VRAM
Labels: ggml, SYCL
#21638 opened Apr 8, 2026 by PMZFX Contributor
fix(openvino): define PartialShape bounds for tensors
#21637 opened Apr 8, 2026 by thedanhoffman Contributor
Enable ccache on riscv64
#21632 opened Apr 8, 2026 by luhenry Contributor Draft
cmake: fix CMP0194 warning on Windows with MSVC
#21630 opened Apr 8, 2026 by texasich