Is your feature request related to a problem? Please describe.
AMD APU platforms benefit from caching optimizations, which this fork of llama.cpp provides.
Github.com/fewtarius/CachyLLama
Describe the solution you'd like
CachyLLama implemented as a backend option.
Describe alternatives you've considered
Additional context