Closed
Labels: bug (Something isn't working)
Description
llama-cpp detects CPU features like AVX, AVX2, FMA3, and F16C at build time. If the container is built on a machine that supports these instruction sets, then the binary won't work on CPUs without these instructions.
The relevant lines from the current container build:

RUN pip install --upgrade pip
ENV CMAKE_ARGS="-DLLAMA_CUBLAS=on"
ENV FORCE_CMAKE=1
RUN pip install --no-cache-dir --upgrade -r /locallm/requirements.txt
References:
- https://github.com/ggerganov/llama.cpp/blob/d26e8b669dbf1f5f5a0afe4d2d885e86cf566302/CMakeLists.txt#L73-L78
- abetlen/llama-cpp-python#412: Core dumped on trying to import from llama_cpp module when built with CUBLAS=on
Credits to @bbrowning for figuring this out. He suggested building llama-cpp-python with CMAKE_ARGS="-DLLAMA_CUBLAS=on -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF".
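A minimal sketch of how that suggestion could look in the container build, reusing the same requirements.txt install step shown above (the flag set is the one suggested; it may need adjusting for other target CPUs):

ENV FORCE_CMAKE=1
# Keep CUDA acceleration via cuBLAS, but turn off the ISA extensions that
# may be missing on older or heterogeneous host CPUs (AVX2, FMA, F16C).
ENV CMAKE_ARGS="-DLLAMA_CUBLAS=on -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF"
RUN pip install --no-cache-dir --upgrade -r /locallm/requirements.txt

The trade-off is slower CPU-side inference on machines that do support those extensions, in exchange for a binary that runs on any x86-64 host.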