Misc. bug: llama fails to run on older x86 hardware. #12866

@kraxel

Name and Version

using latest docker image
build: 5097 (fe5b78c) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

ramalama pull tiny
podman run --rm --pull=newer --device nvidia.com/gpu=all --volume /home/kraxel/.local/share/ramalama:/models ghcr.io/ggml-org/llama.cpp:full-cuda --run -m /models/models/ollama/tinyllama:latest -p hello

Problem description & steps to reproduce

podman run --rm --pull=newer --device nvidia.com/gpu=all --volume /home/kraxel/.local/share/ramalama:/models ghcr.io/ggml-org/llama.cpp:full-cuda --run -m /models/models/ollama/tinyllama:latest -p hello
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices: 
  Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes
load_backend: loaded CUDA backend from /app/libggml-cuda.so
[ ... ]
load_tensors: loading model tensors, this can take a while... (mmap = true)

Finds the GPU, loads the model, then just stops.

The Linux kernel logs a segfault:

[ 2408.935610] llama-cli[3154]: segfault at 78 ip 00007f92766fe4d4 sp 00007fff1bbe0f78 error 4 in libggml-base.so[284d4,7f92766e7000+63000] likely on CPU 3 (core 3, socket 0)
[ 2408.935673] Code: 84 00 00 00 00 00 f3 0f 1e fa 66 0f ef c0 48 c7 46 20 00 00 00 00 0f 11 06 0f 11 46 10 ff 67 20 66 0f 1f 44 00 00 f3 0f 1e fa <48> 8b 47 78 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa ff 67 28 66 0f 1f

The CPU is an older model without AVX vector instructions. Apparently llama uses AVX without first checking whether the CPU actually supports these instructions.
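To confirm which vector extensions the host CPU advertises, something like the following sketch can be run on the affected machine. It only parses the `flags` line of `/proc/cpuinfo` (Linux on x86). The `REQUIRED` list and the helper names `cpu_flags` and `missing_features` are assumptions made for illustration; this is not what llama.cpp itself checks.

```python
# Hypothetical list of x86 feature flags a prebuilt binary might assume,
# spelled as they appear in /proc/cpuinfo. Adjust to the build in question.
REQUIRED = ("avx", "avx2", "fma", "f16c")


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required=REQUIRED):
    """Return the required flags that the CPU does not advertise, in order."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name in required if name not in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        text = None

    if text is None or not cpu_flags(text):
        print("could not read x86 flags from /proc/cpuinfo")
    else:
        missing = missing_features(text)
        print("missing:", ", ".join(missing) if missing else "none")
```

On a CPU without AVX, the script should list `avx` among the missing flags, which would be consistent with the invalid opcode trap reported by the kernel.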

I first ran into this with ramalama (see containers/ramalama#1145), where I see the same behavior but a slightly different kernel error message:

[ 1767.875857] traps: llama-run[2356] trap invalid opcode ip:7ff5c06812ac sp:7ffc2d06c4e0 error:0 in libggml-cpu.so[3a2ac,7ff5c064f000+60000]

First Bad Commit

No response

Relevant log output
