Misc. bug: llama fails to run on older x86 hardware. #12866

### Name and Version

Using the latest Docker image:
build: 5097 (fe5b78c8) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

llama-cli

### Command line

```shell
ramalama pull tiny
podman run --rm --pull=newer --device nvidia.com/gpu=all --volume /home/kraxel/.local/share/ramalama:/models ghcr.io/ggml-org/llama.cpp:full-cuda --run -m /models/models/ollama/tinyllama:latest -p hello
```

### Problem description & steps to reproduce

```
podman run --rm --pull=newer --device nvidia.com/gpu=all --volume /home/kraxel/.local/share/ramalama:/models ghcr.io/ggml-org/llama.cpp:full-cuda --run -m /models/models/ollama/tinyllama:latest -p hello
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices: 
  Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes
load_backend: loaded CUDA backend from /app/libggml-cuda.so
[ ... ]
load_tensors: loading model tensors, this can take a while... (mmap = true)
```
llama-cli finds the GPU, starts loading the model, and then stops without any further output.

The Linux kernel logs a segfault:
```
[ 2408.935610] llama-cli[3154]: segfault at 78 ip 00007f92766fe4d4 sp 00007fff1bbe0f78 error 4 in libggml-base.so[284d4,7f92766e7000+63000] likely on CPU 3 (core 3, socket 0)
[ 2408.935673] Code: 84 00 00 00 00 00 f3 0f 1e fa 66 0f ef c0 48 c7 46 20 00 00 00 00 0f 11 06 0f 11 46 10 ff 67 20 66 0f 1f 44 00 00 f3 0f 1e fa <48> 8b 47 78 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa ff 67 28 66 0f 1f
```
The CPU is an older model without AVX vector instructions. Apparently llama.cpp uses AVX without first checking whether the CPU actually supports it.
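To confirm what the CPU supports, the flags line in `/proc/cpuinfo` can be checked directly. The helper below is a minimal sketch: `has_cpu_flag` is a hypothetical name, and which flags a given llama.cpp build actually requires is an assumption that depends on how the image was compiled.

```shell
# Return 0 if the given CPU flag is listed in a cpuinfo-style file.
# Usage: has_cpu_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
# has_cpu_flag is a hypothetical helper name, not part of llama.cpp.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Take the first "flags" line, then match FLAG as a whole word,
    # so that "avx" does not match "avx2" or "avx512f".
    grep -m1 '^flags' "$file" | grep -qw -- "$flag"
}
```

For example, `has_cpu_flag avx && echo "AVX present" || echo "AVX missing"` reports whether the host advertises AVX. The optional second argument lets the same check run against a saved copy of another machine's `/proc/cpuinfo`.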

I first ran into this with ramalama (see https://github.com/containers/ramalama/issues/1145), where I see the same behavior but a slightly different kernel error message:
```
[ 1767.875857] traps: llama-run[2356] trap invalid opcode ip:7ff5c06812ac sp:7ffc2d06c4e0 error:0 in libggml-cpu.so[3a2ac,7ff5c064f000+60000]
```
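When comparing kernel messages like the ones quoted in this report, it can help to turn the faulting instruction pointer into an offset inside the library mapping. The sketch below subtracts the mapping start address from `ip`. `mapping_offset` is a hypothetical helper, and it assumes the `name[...,start+size]` layout seen in the log lines in this report.

```shell
# Print (in hex) the offset of the faulting ip inside the mapped library,
# given one kernel "segfault" or "traps" log line as the only argument.
# mapping_offset is a hypothetical helper name.
mapping_offset() {
    line=$1
    # Instruction pointer: appears as "ip <hex>" or "ip:<hex>".
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip[: ]\([0-9a-f]*\) .*/\1/p')
    # Mapping start: the hex value right before "+size]" in the brackets.
    base=$(printf '%s\n' "$line" |
        sed -n 's/.*\[\([0-9a-f]*,\)\{0,1\}\([0-9a-f]*\)+[0-9a-f]*].*/\2/p')
    # Bail out if the line did not have the expected shape.
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    printf '%x\n' "$((0x$ip - 0x$base))"
}
```

Applied to the `llama-run` line above, this prints `322ac`. Note that this differs from the first value the kernel prints inside the brackets (`3a2ac`), so check which of the two your disassembler expects before looking up the instruction.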

### First Bad Commit

_No response_

### Relevant log output

_No response_
