Bug: Llama 3.1 might not be fully supported yet #8730

### What happened?

Llama 3.1 8B, quantized after https://github.com/ggerganov/llama.cpp/pull/8676, fails the "wicks" problem that Llama 3 8B answers correctly.

Prompt: `Making one candle requires 125 grams of wax and 1 wick. How many candles can I make with 500 grams of wax and 3 wicks? Be concise.`
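For reference, the expected answer is 3: the wax allows 4 candles, but the wicks allow only 3, so the smaller count wins. A minimal sketch of that arithmetic (the function name `max_candles` and its parameters are hypothetical, purely for illustration):

```python
def max_candles(wax_g, wicks, wax_per_candle_g=125, wicks_per_candle=1):
    """Return how many whole candles can be made from the given supplies."""
    # Each resource independently caps the number of candles.
    by_wax = wax_g // wax_per_candle_g      # 500 // 125 = 4
    by_wicks = wicks // wicks_per_candle    # 3 // 1 = 3
    # The scarcer resource is the limiting factor.
    return min(by_wax, by_wicks)


print(max_candles(500, 3))  # 3
```

The correct reasoning takes the minimum of the two limits; dividing the wax-limited count by the number of wicks, as in the Llama 3.1 output, has no physical meaning here.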

I tested three of the newest Q8_0 quants; all of them gave the same wrong answer:
https://huggingface.co/legraphista/Meta-Llama-3.1-8B-Instruct-IMat-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct.Q8_0.gguf
https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q8_0.gguf
https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-Q8_0.gguf

### Name and Version

version: 3482 (e54c35e4)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.5.0

### What operating system are you seeing the problem on?

Mac

### Relevant log output

```shell
# Llama 3 8B: correct answer (3 candles)
./llama-cli -m Meta-Llama-3-8B-Instruct-Q8_0.gguf --no-mmap -fa -c 8192 --temp 0 -if --in-prefix "<|start_header_id|>user<|end_header_id|>\n\n" --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

With 500 grams of wax, you can make 500 / 125 = 4 candles. With 3 wicks, you can make 3 candles. The limiting factor is the wicks, so you can make 3 candles.

# Llama 3.1 8B: wrong answer (1 candle)
./llama-cli -m Meta-Llama-3.1-8B-Instruct-Q8_0.gguf --no-mmap -fa -c 32768 --temp 0 -if --in-prefix "<|start_header_id|>user<|end_header_id|>\n\n" --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

To find the number of candles, divide the total wax (500g) by the wax per candle (125g). Then, divide the result by the number of wicks (3) to account for the wick limitation.

500g / 125g = 4 candles
4 candles / 3 wicks = 1.33 candles (round down to 1, as you can't make a fraction of a candle)

You can make 1 candle with 500 grams of wax and 3 wicks.
```

