
The prompt processing takes ages now #893

Closed
@BadisG

Description


Hello,
Now that I've updated to llama-cpp-python 0.2.14, I've noticed that processing a new prompt takes a long time: a ~50-token context used to take less than 2 seconds, but now it takes 40 seconds! The following outputs (with the same prompts) are fine, though.
[screenshot: prompt processing timings]
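For what it's worth, here is roughly how I'm timing it (the model path, prompt, and parameters below are placeholders, not my exact setup). The first call is the slow one; a second call with the identical prompt is fast again:

```python
import time
from llama_cpp import Llama

# Placeholder model path; adjust to your own GGUF file.
llm = Llama(model_path="./models/xwin-lm-13b.Q4_K_M.gguf", n_ctx=2048)

# A short Vicuna-style prompt, roughly 50 tokens of context.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant.\n"
    "USER: Hello, how are you today?\n"
    "ASSISTANT:"
)

for run in range(2):
    start = time.perf_counter()
    llm(prompt, max_tokens=32)  # __call__ wraps create_completion()
    print(f"run {run}: {time.perf_counter() - start:.1f}s")
```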
I also noticed that when I start the output just after "ASSISTANT:" (the Xwin prompt format), the output just breaks; that never happened before.
[screenshot: broken output]
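To clarify what I mean by starting the output just after "ASSISTANT:" (the strings below are illustrative, assuming Xwin's Vicuna-style template, not my exact prompts):

```python
SYSTEM = "A chat between a curious user and an artificial intelligence assistant."

# The completion is asked to begin directly after the role tag, with no
# trailing space or newline -- this is the case that now produces broken output.
prompt = f"{SYSTEM}\nUSER: Write a haiku about autumn.\nASSISTANT:"
```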

Labels: bug (Something isn't working)
