Description
Hello,
Since updating llama.cpp via llama_cpp_python 0.2.14, I've noticed that processing a new prompt takes much longer: a context of ~50 tokens used to take under 2 seconds, and now it takes about 40 seconds! Subsequent outputs with the same prompt are fine, though.
I also noticed that when I have the model start its output right after "ASSISTANT:" (the Xwin prompt format), the output just breaks. This never happened before.
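For reference, here is roughly how I'm calling it. This is only a minimal sketch: the model path, n_ctx, and the prompt text are placeholders, but the real prompt ends with "ASSISTANT:" exactly like this.

```python
from llama_cpp import Llama

# Placeholder path and context size; the actual model is a Xwin GGUF.
llm = Llama(model_path="./models/xwin-lm-13b.Q4_K_M.gguf", n_ctx=2048)

# Xwin-style prompt that ends right after "ASSISTANT:".
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "USER: Hello, who are you? ASSISTANT:"
)

# This first call is where prompt processing now takes ~40 s instead of <2 s,
# and the generated text comes out broken.
out = llm(prompt, max_tokens=128, stop=["USER:"])
print(out["choices"][0]["text"])
```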