The CodeQwen 1.5 model supports fill-in-the-middle (https://github.com/QwenLM/CodeQwen1.5?tab=readme-ov-file#2-file-level-code-completion-fill-in-the-middle), so I was hoping to use the /infill
API to leverage it.
After #6689 was merged I expected it to work out of the box, but I guess the FIM tokens are only set correctly in the GGUF model files for CodeLlama and CodeGemma, not for CodeQwen?
I tested it with codeqwen-1_5-7b-chat-q3_k_m.gguf:
curl --location 'http://localhost:9090/infill' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --data '{
    "prompt": "",
    "input_prefix": "public int gcd(int x, int y) {",
    "input_suffix": "\n}",
    "n_predict": 100,
    "stream": false
  }'
Which gave the following response:
{
  "content": "WriteLine (\n '\n{\n \"id\": \"x\",\n \"name\": \"x\",\n \"description\": \"x\",\n \"version\": \"x\",\n \"author\": \"x\",\n \"license\": \"x\",\n \"type\": \"x\",\n \"main\": \"x\",\n \"dependencies\": [],\n \"devDependencies\": [],\n \"scripts\": {\n \"start\": \"node x",
  "id_slot": 0,
  "stop": true,
  "model": "/home/user/Downloads/codeqwen-1_5-7b-chat-q3_k_m.gguf",
  //...
}
This looks like gibberish. I suppose llama.cpp can't find the FIM prefix, suffix, and middle tokens, so the assembled prompt doesn't make any sense to the model?
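For reference, here is a minimal sketch of how a FIM prompt would be assembled manually when the sentinel tokens are known. The `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` token names follow the CodeQwen 1.5 README linked above, but they are an assumption here; verify them against the model's actual tokenizer config before relying on this as a workaround (e.g. sending the assembled string to the plain /completion endpoint instead of /infill).

```python
# Hedged sketch: manually building a fill-in-the-middle prompt.
# The sentinel token spellings below are assumed from the CodeQwen 1.5
# README and may differ per model; check the tokenizer metadata.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the cursor in FIM sentinels;
    the model then generates the missing middle after <fim_middle>."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

prompt = build_fim_prompt("public int gcd(int x, int y) {", "\n}")
print(prompt)
```

If /infill were reading the right token IDs from the GGUF metadata, it would be producing an equivalent prompt internally from `input_prefix` and `input_suffix`.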
The same request, but with CodeLlama, produces a much more expected answer:
{
  "content": "\n return (x % y == 0) ? y : gcd(y, x % y);\n }\n\n public static void main(String[] args) {\n int x = 30, y = 20;\n GCD gcd = new GCD();\n System.out.println(gcd.gcd(x, y));\n }\n}\n\n// 30\n\n// 20",
  "id_slot": 0,
  "stop": true,
  "model": "/home/user/.codegpt/models/gguf/codellama-7b-instruct.Q4_K_M.gguf",
  "tokens_predicted": 100,
  "tokens_evaluated": 18,
  //...
}