Getting SIGSEGV with llama backend #973

@iamjackg

Description

LocalAI version:
v1.25.0

Environment, CPU architecture, OS, and Version:
Linux hostname 5.15.0-78-generic #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  • 13th Gen Intel(R) Core(TM) i7-13700KF
  • NVIDIA GeForce RTX 3070 Ti

Describe the bug
Trying to run any GGUF model with the llama backend results in a SIGSEGV as soon as the model starts to load (output in the Logs section below).
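
The actual model config isn't included in this report, but a YAML along these lines would produce the load options visible in the log below; the filename and field names are assumptions based on LocalAI's model config format, with values taken from the GRPC load log:

# Sketch only: the real config isn't in the report; fields reconstructed
# from the "GRPC: Loading model with options" line in the Logs section.
cat > models/codellama-13b-python.yaml <<'EOF'
name: codellama-13b-python.Q4_K_S.gguf
backend: llama              # force the llama (go-llama) backend
context_size: 4096          # ContextSize:4096 in the load log
gpu_layers: 14              # NGPULayers:14
rope_freq_base: 1000000     # RopeFreqBase:1e+06
parameters:
  model: codellama-13b-python.Q4_K_S.gguf
EOF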

Note that running the main binary of llama.cpp from LocalAI/go-llama/build/bin/ runs totally fine, e.g.

./main -t 6 --low-vram -m ~/gits/llama.cpp/models/phind-codellama-34b-v1.Q4_K_M.gguf --temp 0 -ngl 14 --color --rope-freq-base 1e6 -p $'# this python function determines whether an object is JSON-serializable or not, without using json.dumps\ndef is_json_serializable(thing):'

To Reproduce

Any request seems to do this. I tried with both codellama-13b-python.Q4_K_S.gguf and phind-codellama-34b-v1.Q4_K_M.gguf for good measure. Both work when running llama.cpp directly.
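
The exact request payload isn't captured below; it was a plain OpenAI-style completion call along these lines (the port and endpoint are LocalAI defaults, so treat the specifics as assumptions):

# port 8080 and the completions endpoint are LocalAI defaults, not taken from the logs
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "codellama-13b-python.Q4_K_S.gguf",
        "prompt": "def is_json_serializable(thing):",
        "temperature": 0
      }'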

Expected behavior

The model loads and the request completes without crashing, just as it does when running llama.cpp's main binary directly.

Logs

11:50PM DBG Loading GRPC Model llama: {backendString:llama model:codellama-13b-python.Q4_K_S.gguf threads:6 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002dc000 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false}
11:50PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
11:50PM DBG GRPC Service for codellama-13b-python.Q4_K_S.gguf will be running at: '127.0.0.1:38915'
11:50PM DBG GRPC Service state dir: /tmp/go-processmanager2204529450
11:50PM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:38915: connect: connection refused"
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr 2023/08/28 23:50:18 gRPC Server listening at 127.0.0.1:38915
11:50PM DBG GRPC Service Ready
11:50PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:codellama-13b-python.Q4_K_S.gguf ContextSize:4096 Seed:0 NBatch:512 F16Memory:false MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:14 MainGPU: TensorSplit: Threads:6 LibrarySearchPath: RopeFreqBase:1e+06 RopeFreqScale:1 RMSNormEps:0 NGQA:0 ModelFile:/home/jack/gits/llama.cpp/models/codellama-13b-python.Q4_K_S.gguf Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 Tokenizer: LoraBase: LoraAdapter: NoMulMatQ:false}
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr create_gpt_params: loading model /home/jack/gits/llama.cpp/models/codellama-13b-python.Q4_K_S.gguf
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr SIGSEGV: segmentation violation
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr PC=0x7f5937be9fbd m=5 sigcode=1
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr signal arrived during cgo execution
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr 
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr goroutine 34 [syscall]:
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr runtime.cgocall(0x81cbd0, 0xc0001815f0)
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr 	/usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc0001815c8 sp=0xc000181590 pc=0x4161cb
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_load_model(0x7f58b4000b70, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xe, 0x200, ...)
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr 	_cgo_gotypes.go:266 +0x4c fp=0xc0001815f0 sp=0xc0001815c8 pc=0x8131ac
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr github.com/go-skynet/go-llama%2ecpp.New({0xc0002a8000, 0x41}, {0xc000110700, 0x8, 0x926a60?})
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr 	/home/jack/gits/LocalAI/go-llama/llama.go:39 +0x3aa fp=0xc0001817f0 sp=0xc0001815f0 pc=0x813a6a
[...]
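
The trace ends in the cgo call _Cfunc_load_model made from go-llama's llama.go:39, i.e. the crash happens while the binding loads the model, before any llama.cpp inference runs. Since the standalone main binary works, the go-llama layer can be exercised on its own; a sketch based on go-llama.cpp's README example runner (the exact flags the runner accepts are an assumption):

cd ~/gits/LocalAI/go-llama
# example runner from the go-llama.cpp README; -m is the model path, -t sets threads
LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run ./examples -m ~/gits/llama.cpp/models/codellama-13b-python.Q4_K_S.gguf -t 6

If this load also segfaults, the problem sits in the go-llama binding or its build rather than in llama.cpp itself; if it loads cleanly, the difference is in how the LocalAI backend builds or invokes the binding.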
