Releases: huydt84/llama.cpp
b5994
ggml-cpu : disable GGML_NNPA by default due to instability (#14880)
* docs: update s390x document for sentencepiece
  Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
  (cherry picked from commit e086c5e3a7ab3463d8e0906efcfa39352db0a48d)
* docs: update huggingface links + reword
  Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
  (cherry picked from commit 8410b085ea8c46e22be38266147a1e94757ef108)
* ggml-cpu: disable ggml-nnpa compile flag by default
  fixes #14877
  Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
  (cherry picked from commit 412f4c7c88894b8f55846b4719c76892a23cfe09)
* docs: update s390x build docs to reflect nnpa disable
  Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
  (cherry picked from commit c1eeae1d0c2edc74ab9fbeff2707b0d357cf0b4d)
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
b5930
CUDA: set_rows + cpy.cu refactor (#14712)
b5921
llama : fix parallel processing for lfm2 (#14705)
b5797
ci : disable fast-math for Metal GHA CI (#14478)
* ci : disable fast-math for Metal GHA CI
  ggml-ci
* cont : remove -g flag
  ggml-ci
b5787
Add Conv2d for CPU (#14388)
* Conv2D: Add CPU version
* Half decent
* Tiled approach for F32
* remove file
* Fix tests
* Support F16 operations
* add assert about size
* Review: further formatting fixes, add assert and use CPU version of fp32->fp16
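For readers unfamiliar with the operation this release adds, the sketch below shows a direct (untiled) 2D convolution over a single-channel F32 image, with stride 1 and no padding. It is an illustration only, not the ggml kernel: the function name `conv2d_valid`, its signature, and the row-major layout are assumptions made for this example. Like most ML frameworks, it computes a cross-correlation (the kernel is not flipped).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not a ggml API): direct 2D convolution of a
// single-channel, row-major F32 image with stride 1 and no padding.
std::vector<float> conv2d_valid(const std::vector<float>& src, std::size_t h, std::size_t w,
                                const std::vector<float>& kernel, std::size_t kh, std::size_t kw) {
    // Size checks, in the spirit of the "add assert about size" commit.
    assert(src.size() == h * w);
    assert(kernel.size() == kh * kw);
    assert(kh >= 1 && kw >= 1 && kh <= h && kw <= w);

    const std::size_t oh = h - kh + 1;  // output height
    const std::size_t ow = w - kw + 1;  // output width
    std::vector<float> dst(oh * ow, 0.0f);

    for (std::size_t y = 0; y < oh; ++y) {
        for (std::size_t x = 0; x < ow; ++x) {
            float acc = 0.0f;
            // Accumulate the kernel window anchored at (y, x).
            for (std::size_t ky = 0; ky < kh; ++ky) {
                for (std::size_t kx = 0; kx < kw; ++kx) {
                    acc += src[(y + ky) * w + (x + kx)] * kernel[ky * kw + kx];
                }
            }
            dst[y * ow + x] = acc;
        }
    }
    return dst;
}
```

A tiled variant, as mentioned in the commit list, would typically process the output in blocks so that the input window stays in cache; the arithmetic per output element is the same.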
b5682
model : add NeoBERT (#14164)
* convert neobert model to gguf
* add inference graph
* fix flake8 lint
* followed reviewer suggestions
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* follow reviewers suggestions
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* override NeoBERT feed-forward length
---------
Co-authored-by: dinhhuy <huy.dinh@brains-tech.co.jp>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
b5679
ggml: Add Android support for GGML_CPU_ALL_VARIANTS (#14206)
b5674
model : Add support for Arcee AI's upcoming AFM model (#14185)
* Add Arcee AFM support
* Add draft update code
* Fix linter and update URL, may still not be final
* Update src/llama-model.cpp
  Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* Remove accidental blank line
---------
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
b5669
kv-cache : fix use-after-move of defrag info (#14189)
ggml-ci
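The class of bug named in this release title can be illustrated with a small, self-contained sketch. The types `DefragInfo` and `Task` and the function `schedule_defrag` are hypothetical stand-ins invented for this example; they are not the llama.cpp kv-cache types, and the actual change in #14189 may differ. The point is only the general pattern: read what you need from an object before handing it off with `std::move`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not llama.cpp types.
struct DefragInfo {
    std::vector<int> moves;  // planned cell moves
};

struct Task {
    DefragInfo info;
};

// Buggy pattern (shown only as a comment):
//     queue.push_back(Task{std::move(info)});
//     return info.moves.size();  // use-after-move: the moved-from vector is in a
//                                // valid but unspecified state, usually empty
//
// Safer pattern: capture everything needed before the move.
std::size_t schedule_defrag(DefragInfo info, std::vector<Task>& queue) {
    const std::size_t n_moves = info.moves.size();  // read before moving
    queue.push_back(Task{std::move(info)});         // info must not be read after this line
    return n_moves;
}
```

Static analyzers such as clang-tidy's `bugprone-use-after-move` check can flag the buggy form automatically.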
b5650
cmake : Improve build-info.cpp generation (#14156)
* cmake: Simplify build-info.cpp generation
  The rebuild of build-info.cpp still gets triggered when .git/index changes.
* cmake: generate build-info.cpp in build dir