
server: receiving <|im_end|> in all responses of llama 3 #6873

Closed
@infozzdatalabs

Description


I have been experiencing this problem with several Llama 3 models, for example:
https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF
https://huggingface.co/QuantFactory/dolphin-2.9-llama3-8b-GGUF

Every response from "/chat/completions" ends with the '<|im_end|>' token.

I'm using the latest Docker image for CUDA: 'ghcr.io/ggerganov/llama.cpp:server-cuda'

Thanks in advance.
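As a possible workaround until the model's end-of-turn token is handled correctly, you can pass the Llama 3 / ChatML end markers in the request's "stop" list (llama.cpp's OpenAI-compatible server accepts a "stop" parameter) and strip any stray markers client-side. A minimal sketch; the exact token set and the helper function are assumptions for illustration:

```python
# Special end-of-turn markers emitted by Llama 3 / ChatML-style chat
# templates (assumed set; adjust to the tokens your model actually emits).
STOP_TOKENS = ["<|im_end|>", "<|eot_id|>", "<|end_of_text|>"]

# Example /chat/completions payload: the "stop" list asks the server to
# halt generation before emitting any of these markers.
payload = {
    "messages": [{"role": "user", "content": "Hello"}],
    "stop": STOP_TOKENS,
}

def strip_stop_tokens(text: str) -> str:
    """Client-side fallback: remove any stray markers from a response."""
    for tok in STOP_TOKENS:
        text = text.replace(tok, "")
    return text.strip()

print(strip_stop_tokens("Hi there!<|im_end|>"))  # -> Hi there!
```

This only masks the symptom; the underlying fix is for the server to recognize the model's end-of-generation token from its chat template.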
