Pull requests: ggml-org/llama.cpp
mtmd: add pos_0 to mtmd_image_tokens_get_decoder_pos (breaking change)
examples
#22082
opened Apr 18, 2026 by
ngxson
Contributor
server: always include usage in streaming responses
examples
server
#22081
opened Apr 18, 2026 by
brywil
model : refactor bias tensor names
model
Model specific
refactoring
Refactoring
#22079
opened Apr 18, 2026 by
CISC
Member
[SYCL] Update oneapi 2025.3.3, Separate SYCL build, release Ubuntu 24 package.
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22078
opened Apr 18, 2026 by
NeoZhangJianyu
Contributor
common/autoparser : allow space after tool call
testing
Everything test related
#22073
opened Apr 18, 2026 by
aldehir
Contributor
sycl: Battlemage (BMG) optimizations — AOT, Q5_K reorder, PAD stride fix, new ops, oneMKL routing
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22066
opened Apr 17, 2026 by
aicss-genai
GGML: Allow static build with dynamic loaded backends
ggml
changes relating to the ggml tensor library for machine learning
#22059
opened Apr 17, 2026 by
ervanalb
2 tasks done
quant: handle shared-KV layer tensors in imatrix-dependent quantization
testing
Everything test related
#22054
opened Apr 17, 2026 by
ajfonthemove
3 tasks
CUDA: refactor mma data loading for AMD
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#22051
opened Apr 17, 2026 by
JohannesGaessler
Contributor
Reduce CPU overhead in meta backend: cache subgraph splits when cgraph is unchanged
ggml
changes relating to the ggml tensor library for machine learning
#22041
opened Apr 17, 2026 by
gaugarg-nv
Contributor
server: Skip API key verification for static files
examples
server
#22038
opened Apr 17, 2026 by
roj234
Contributor
mtmd, llama : Update HunyuanVL vision-language model support
examples
model
Model specific
python
python script changes
#22037
opened Apr 17, 2026 by
ManaEstras
3 tasks done
server: log prompts to directory
examples
server
#22031
opened Apr 17, 2026 by
jacekpoplawski
Contributor
Draft
mtmd, llama, ggml : Update HunyuanVL support
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
examples
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
Nvidia GPU
Issues specific to Nvidia GPUs
OpenCL
Issues specific to the OpenCL backend
python
python script changes
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#22029
opened Apr 17, 2026 by
ManaEstras
3 tasks done
server: Expose media_marker on /props endpoint.
examples
server
#22028
opened Apr 17, 2026 by
cetarthoriphros
llama-mmap: add MADV_HUGEPAGE hint for THP on Linux
#22022
opened Apr 16, 2026 by
Marxist-Leninist
Contributor
ggml-vulkan/CMakeLists: add a check for SPIRV-Headers
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
sampling: add segment-level repetition loop detection
examples
server
#22007
opened Apr 16, 2026 by
Frank-Schruefer
opencl: workaround Adreno LLVM compiler SIGSEGV in subgroup arithmetic ops
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#22006
opened Apr 16, 2026 by
RokketCrypto