Crash on x86 with llama-cpp-python with docker or on host directly #753

Open
4 tasks done
chenqiny opened this issue Sep 26, 2023 · 3 comments
Labels
bug Something isn't working build

Comments

@chenqiny

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Load the model

Current Behavior

docker run --rm -it -p 9996:8000 -v /data/gguf/:/models -e MODEL=/models/llama-2-13b-chat.Q4_0.gguf ghcr.io/abetlen/llama-cpp-python:latest

python3 -m pip install -e .
Obtaining file:///app
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Installing backend dependencies ... done
Preparing editable metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.11/site-packages (from llama_cpp_python==0.2.7) (4.8.0)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.11/site-packages (from llama_cpp_python==0.2.7) (1.26.0)
Requirement already satisfied: diskcache>=5.6.1 in /usr/local/lib/python3.11/site-packages (from llama_cpp_python==0.2.7) (5.6.3)
Building wheels for collected packages: llama_cpp_python
Building editable for llama_cpp_python (pyproject.toml) ... done
Created wheel for llama_cpp_python: filename=llama_cpp_python-0.2.7-cp311-cp311-manylinux_2_31_x86_64.whl size=911317 sha256=b77877c90bdba00e257432c49978a075519f5818f17e14ecc00db21c1fd6998c
Stored in directory: /tmp/pip-ephem-wheel-cache-ivqpfggy/wheels/57/0f/98/bb57b2b57b95807699b822a35c022f139d38a02c27922f27ce
Successfully built llama_cpp_python
Installing collected packages: llama_cpp_python
Attempting uninstall: llama_cpp_python
Found existing installation: llama_cpp_python 0.2.7
Uninstalling llama_cpp_python-0.2.7:
Successfully uninstalled llama_cpp_python-0.2.7
Successfully installed llama_cpp_python-0.2.7
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Illegal instruction (core dumped)

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu

aiu-test:/data/gguf # lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  24
  On-line CPU(s) list:   0-23
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU E5-2440 0 @ 2.40GHz
    CPU family:          6
    Model:               45
    Thread(s) per core:  2
    Core(s) per socket:  6
    Socket(s):           2
    Stepping:            7
    CPU max MHz:         2900.0000
    CPU min MHz:         1200.0000
    BogoMIPS:            4799.98
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    3 MiB (12 instances)
  L3:                    30 MiB (2 instances)
NUMA:
  NUMA node(s):          2
  NUMA node0 CPU(s):     0-5,12-17
  NUMA node1 CPU(s):     6-11,18-23
Vulnerabilities:
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
  • Operating System, e.g. for Linux:

$ uname -a

Linux aiu-test 5.14.21-150400.24.63-default #1 SMP PREEMPT_DYNAMIC Tue May 2 15:49:04 UTC 2023 (fd0cc4f) x86_64 x86_64 x86_64 GNU/Linux
  • SDK version, e.g. for Linux:
$ python3 --version
$ make --version
$ g++ --version

Python 3.11.5 (main, Sep 20 2023, 11:03:59) [GCC 10.2.1 20210110] on linux
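
A note on the CPU flags reported above: the list includes avx but not avx2, fma, or f16c. One possible explanation for the crash, which is an assumption and not something confirmed in this report, is that the library was compiled with instruction sets this CPU does not support. The sketch below checks a flags line for a set of instruction-set names. The `ASSUMED_BUILD_FLAGS` tuple and the function name are hypothetical and chosen only for illustration.

```python
# Instruction sets the build is ASSUMED to require (illustrative only;
# this issue does not state which flags the binary was compiled with).
ASSUMED_BUILD_FLAGS = ("avx", "avx2", "fma", "f16c")


def missing_cpu_flags(flags_line, required=ASSUMED_BUILD_FLAGS):
    """Return the required flags absent from an lscpu or /proc/cpuinfo flags line."""
    # Drop an optional "Flags:" label, then split the rest on whitespace.
    _, _, tail = flags_line.rpartition(":")
    present = set(tail.split())
    return [flag for flag in required if flag not in present]


# Abbreviated from the lscpu output in this report.
reported = "Flags: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm"
print(missing_cpu_flags(reported))
```

On a Linux host the same function could be given the `flags` line from `/proc/cpuinfo`. An empty result would suggest looking for the cause elsewhere.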

Failure Information (for bugs)

Illegal instruction (core dumped)

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. docker run --rm -it -p 9996:8000 -v /data/gguf/:/models -e MODEL=/models/llama-2-13b-chat.Q4_0.gguf ghcr.io/abetlen/llama-cpp-python:latest

Note: Many issues seem to be regarding functional or performance issues / differences with llama.cpp. In these cases we need to confirm that you're comparing against the version of llama.cpp that was built with your python package, and which parameters you're passing to the context.

Try the following:

  1. git clone https://github.com/abetlen/llama-cpp-python
  2. cd llama-cpp-python
  3. rm -rf _skbuild/ # delete any old builds
  4. python setup.py develop
  5. cd ./vendor/llama.cpp
  6. Follow llama.cpp's instructions to cmake llama.cpp
  7. Run llama.cpp's ./main with the same arguments you previously passed to llama-cpp-python and see if you can reproduce the issue. If you can, log an issue with llama.cpp

I tried this and got the same crash:

root@51b054c89440:/work/llama-cpp-python/vendor/llama.cpp/build/bin# ./main
Log start
main: warning: changing RoPE frequency base to 0 (default 10000.0)
main: warning: scaling RoPE frequency by 0 (default 1.0)
main: build = 1271 (a98b163)
main: built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu
main: seed = 1695717956
Illegal instruction (core dumped)
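
The "Illegal instruction (core dumped)" line is what the shell prints when a process is killed by SIGILL. When reproducing this from a script rather than an interactive shell, the same condition can be detected from the return code. This is a minimal sketch using only the Python standard library; the helper names are hypothetical.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_and_describe(argv):
    """Run a command and describe how it ended."""
    completed = subprocess.run(argv, check=False)
    return describe_exit(completed.returncode)
```

Calling `run_and_describe(["./main"])` from the build directory should report `killed by SIGILL` if the binary crashes the same way as in the log above.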

Failure Logs

Please include any relevant log snippets or files. If it works under one configuration but not under another, please provide logs for both configurations and their corresponding outputs so it is easy to see where behavior changes.

Also, please avoid screenshots if at all possible. Instead, copy/paste the console output and use GitHub's markdown to cleanly format your logs for easy readability.

/work/llama-cpp-python/vendor/llama.cpp/build/bin# git log | head -1
commit a98b1633d5a94d0aa84c7c16e1f8df5ac21fc850

@chenqiny
Author

I opened an issue against llama.cpp. If llama.cpp is built with cmake, I get the same crash.

ggml-org/llama.cpp#3339

@chenqiny
Author

I used a workaround:

  1. Download the llama.cpp source code.
  2. Run make.
  3. Run make libllama.so.
  4. Overwrite libllama.so in llama-cpp-python with the newly built one.
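
The last step of the workaround can be sketched in Python as follows. This assumes the installed package keeps `libllama.so` directly inside its `llama_cpp` directory, which may differ between versions, and that the library has already been built with make. Both helper names are hypothetical. The package is located with `find_spec` so that it is not imported, since importing it would load the library that crashes.

```python
import importlib.util
import shutil
from pathlib import Path


def installed_package_dir(package="llama_cpp"):
    """Locate a package directory without importing (and so without loading) it."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise FileNotFoundError("package %r is not installed" % package)
    return Path(spec.origin).parent


def replace_shared_lib(built_lib, package_dir, name="libllama.so"):
    """Back up the bundled library, then overwrite it with a locally built one."""
    built_lib = Path(built_lib)
    target = Path(package_dir) / name
    if not built_lib.is_file():
        raise FileNotFoundError(built_lib)
    if target.exists():
        # Keep the original as libllama.so.bak (assumed layout, see above).
        shutil.copy2(target, target.with_name(name + ".bak"))
    shutil.copy2(built_lib, target)
    return target
```

A typical call would be `replace_shared_lib("llama.cpp/libllama.so", installed_package_dir())`, where the first path is wherever make placed the built library on your machine.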

@Saivignesh-05

Thanks, @chenqiny. I also ran into the illegal instruction crash. Your solution works!

@abetlen abetlen added bug Something isn't working build labels Dec 22, 2023