
Commit 798d4fc
Clear kv cache and reset tokens after chat completion
1 parent: c37132b

File tree: 1 file changed (+2, -0)

llama_cpp/llama_chat_format.py (2 additions, 0 deletions)

@@ -696,6 +696,8 @@ def chat_completion_handler(
             return _convert_completion_to_chat_function(
                 tool_name, completion_or_chunks, stream
             )
+        llama.reset()
+        llama._ctx.kv_cache_clear()
         return _convert_completion_to_chat(completion_or_chunks, stream=stream)

     return chat_completion_handler
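The two added lines reset the model's evaluated-token count and clear the context's KV cache, so each chat completion starts from a clean state instead of inheriting stale tokens from the previous request. A toy sketch of that pattern follows; `FakeLlama` and `FakeContext` are hypothetical stand-ins for illustration, not the real `llama_cpp` classes:

```python
# Toy illustration of the state-reset pattern from commit 798d4fc
# (assumed stand-ins, NOT the real llama_cpp internals): if per-request
# state (evaluated tokens, KV cache) is not cleared after a chat
# completion, the next request starts from stale state.

class FakeContext:
    """Hypothetical stand-in for the low-level llama context object."""
    def __init__(self):
        self.kv_cache = []  # entries left behind by eval()

    def kv_cache_clear(self):
        self.kv_cache.clear()


class FakeLlama:
    """Hypothetical stand-in for the model wrapper, keeping only the
    state that matters for this commit."""
    def __init__(self):
        self._ctx = FakeContext()
        self.n_tokens = 0  # number of tokens currently evaluated

    def eval(self, tokens):
        self._ctx.kv_cache.extend(tokens)
        self.n_tokens += len(tokens)

    def reset(self):
        # Mirrors the spirit of reset(): forget evaluated tokens.
        self.n_tokens = 0


def chat_completion(llama, prompt_tokens):
    llama.eval(prompt_tokens)
    result = list(prompt_tokens)  # pretend this is the completion
    # The two lines the commit adds, in spirit: clear per-request state
    # before handing the result back.
    llama.reset()
    llama._ctx.kv_cache_clear()
    return result


llama = FakeLlama()
chat_completion(llama, [1, 2, 3])
# After the handler returns, no state from the request remains.
assert llama.n_tokens == 0
assert llama._ctx.kv_cache == []
```

Without the two cleanup calls, `n_tokens` and the cache would keep growing across requests, which is the leak this commit addresses.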
