Closed as not planned
Description
When using llama.cpp via LocalAI, after a while the log shows "failed to find free space in the KV cache", the responses it can produce gradually become shorter, and eventually it stops responding altogether.
I looked at past issues such as #4185, but I still don't understand how to resolve this.
I would like to clear the KV cache or increase its capacity, but how should I do that?
The llama.cpp version appears to be commit 8228b66.
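On the second point, my current guess is that the KV cache is sized from the model's context window, and that LocalAI exposes this through the `context_size` field of a model's YAML config, so raising it should give the cache more room (at the cost of more memory). Is that the right knob? A minimal sketch of what I mean, assuming the standard LocalAI model config schema; the file name, model name, and backend value here are illustrative:

```yaml
# Hypothetical model definition, e.g. models/my-model.yaml.
# context_size bounds the context window, and hence the KV cache.
name: my-model
backend: llama-cpp   # backend name may differ depending on the LocalAI version
context_size: 8192   # raised from the default in the hope of a larger KV cache
parameters:
  model: my-model.gguf
```

If that is not the intended way to enlarge (or clear) the KV cache when running llama.cpp under LocalAI, I would appreciate a pointer to the correct setting.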