Closed as not planned
Description
When using llama.cpp via LocalAI, after a while the log shows "failed to find free space in the KV cache", the responses it can produce gradually become shorter, and eventually it stops responding altogether.
I looked at past issues such as #4185, but I still don't understand how to resolve this.
I would like to clear the KV cache or increase its capacity, but how should I do that?
The llama.cpp version appears to be commit 8228b66.
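On the second point, my current guess is that the KV cache is sized from the model's context window, and that LocalAI exposes this through the `context_size` field of a model's YAML config, so raising it should give the cache more room (at the cost of more memory). Is that the right knob? A minimal sketch of what I mean, assuming the standard LocalAI model config schema; the file name, model name, and backend value here are illustrative:

```yaml
# Hypothetical model definition, e.g. models/my-model.yaml.
# context_size bounds the context window, and hence the KV cache.
name: my-model
backend: llama-cpp   # backend name may differ depending on the LocalAI version
context_size: 8192   # raised from the default in the hope of a larger KV cache
parameters:
  model: my-model.gguf
```

If that is not the intended way to enlarge (or clear) the KV cache when running llama.cpp under LocalAI, I would appreciate a pointer to the correct setting.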