Pull requests: auroralabs-loci/llama.cpp
UPSTREAM PR #21405: vendor : update cpp-httplib to 0.40.1
#1331
opened Apr 4, 2026 by
loci-dev
UPSTREAM PR #21331: docs: build.md / HSA_OVERRIDE_GFX_VERSION does not exist on Windows
#1330
opened Apr 3, 2026 by
loci-dev
UPSTREAM PR #21315: ggml-zendnn : add MUL_MAT_ID op support for MoE models
#1329
opened Apr 3, 2026 by
loci-dev
UPSTREAM PR #21245: model : refactor QKV into common build_qkv and create_tensor_qkv helpers
#1328
opened Apr 2, 2026 by
loci-dev
UPSTREAM PR #20831: cuda : dynamic MMVQ nwarps for narrow matrices
#1327
opened Apr 2, 2026 by
loci-dev
UPSTREAM PR #21283: [SYCL] fix llama_kv_cache hang when kv_cache is huge: 5GB
#1326
opened Apr 2, 2026 by
loci-dev
UPSTREAM PR #21242: fix: tool call parsing for LFM2 and LFM2.5 models
#1325
opened Apr 1, 2026 by
loci-dev
UPSTREAM PR #21240: Relax prefill parser to allow space.
#1324
opened Apr 1, 2026 by
loci-dev
UPSTREAM PR #21051: Add the tests that we want to run on external CI
#1323
opened Apr 1, 2026 by
loci-dev
UPSTREAM PR #21046: ggml webgpu: move quantized buffers to u32 types and some other changes for wider browser/device support
#1322
opened Apr 1, 2026 by
loci-dev
UPSTREAM PR #21203: server: respect the ignore eos flag
#1320
opened Mar 31, 2026 by
loci-dev
UPSTREAM PR #21168: ggml-cuda: ds_read_b128 for q4_0 and q4_1 mmq kernels
#1319
opened Mar 30, 2026 by
loci-dev
UPSTREAM PR #21122: CI: Enable CUDA and Vulkan ARM64 runners and fix CI/CD
#1318
opened Mar 30, 2026 by
loci-dev
UPSTREAM PR #21095: convert: Add compressed-tensors NVFP4 conversion
#1317
opened Mar 30, 2026 by
loci-dev
UPSTREAM PR #20275: model: add sarvam_moe architecture support
#1316
opened Mar 30, 2026 by
loci-dev
UPSTREAM PR #21139: grammar: make MAX_REPETITION_THRESHOLD configurable via env var
#1315
opened Mar 29, 2026 by
loci-dev
UPSTREAM PR #21003: grammar: increase MAX_REPETITION_THRESHOLD + make it configurable via envvar
#1314
opened Mar 29, 2026 by
loci-dev
UPSTREAM PR #21082: common: add bounds check in common_init_result::sampler to prevent segfault on failed model load
#1313
opened Mar 29, 2026 by
loci-dev
UPSTREAM PR #21066: [HIP] Bump ROCm version to 7.2.1
#1312
opened Mar 29, 2026 by
loci-dev
3 tasks
UPSTREAM PR #21089: ggml : add CPU TurboQuant KV cache types (TBQ3_0 / TBQ4_0)
#1311
opened Mar 28, 2026 by
loci-dev
UPSTREAM PR #21085: common/parser: fix reasoning whitespace bugs + extra parser tests
#1310
opened Mar 28, 2026 by
loci-dev
UPSTREAM PR #21075: fix cmake problem to exclude CCAN
#1309
opened Mar 28, 2026 by
loci-dev
UPSTREAM PR #21074: ggml-cuda: Add generic NVFP4 MMQ kernel
#1308
opened Mar 28, 2026 by
loci-dev
UPSTREAM PR #20991: ci: add riscv64 to release binaries
#1307
opened Mar 28, 2026 by
loci-dev