**LocalAI version:**
v1.25.0
**Environment, CPU architecture, OS, and Version:**
Linux hostname 5.15.0-78-generic #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
- CPU: 13th Gen Intel(R) Core(TM) i7-13700KF
- GPU: NVIDIA GeForce RTX 3070 Ti
**Describe the bug**

Trying to run any GGUF model with the `llama` backend results in SIGSEGV as soon as the model tries to load (output in the Logs section below).
Note that running the `main` binary of llama.cpp from `LocalAI/go-llama/build/bin/` runs totally fine, e.g.:

```sh
./main -t 6 --low-vram -m ~/gits/llama.cpp/models/phind-codellama-34b-v1.Q4_K_M.gguf --temp 0 -ngl 14 --color --rope-freq-base 1e6 -p $'# this python function determines whether an object is JSON-serializable or not, without using json.dumps\ndef is_json_serializable(thing):'
```
**To Reproduce**

Any request seems to trigger it. I tried with both `codellama-13b-python.Q4_K_S.gguf` and `phind-codellama-34b-v1.Q4_K_M.gguf` for good measure; both work when running llama.cpp directly.
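As a concrete example, a minimal request of the kind that crashes the backend (a sketch: the port assumes LocalAI's default 8080, and the model name assumes the file is addressed directly by file name, matching the logs below):

```sh
# Hypothetical minimal repro against LocalAI's OpenAI-compatible API;
# port and model name are assumptions based on defaults and the log output.
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama-13b-python.Q4_K_S.gguf",
    "prompt": "def is_json_serializable(thing):",
    "temperature": 0
  }'
```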
**Expected behavior**

The model loads and the request completes, as it does when running llama.cpp directly.
**Logs**

```
11:50PM DBG Loading GRPC Model llama: {backendString:llama model:codellama-13b-python.Q4_K_S.gguf threads:6 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002dc000 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false}
11:50PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
11:50PM DBG GRPC Service for codellama-13b-python.Q4_K_S.gguf will be running at: '127.0.0.1:38915'
11:50PM DBG GRPC Service state dir: /tmp/go-processmanager2204529450
11:50PM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:38915: connect: connection refused"
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr 2023/08/28 23:50:18 gRPC Server listening at 127.0.0.1:38915
11:50PM DBG GRPC Service Ready
11:50PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:codellama-13b-python.Q4_K_S.gguf ContextSize:4096 Seed:0 NBatch:512 F16Memory:false MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:14 MainGPU: TensorSplit: Threads:6 LibrarySearchPath: RopeFreqBase:1e+06 RopeFreqScale:1 RMSNormEps:0 NGQA:0 ModelFile:/home/jack/gits/llama.cpp/models/codellama-13b-python.Q4_K_S.gguf Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 Tokenizer: LoraBase: LoraAdapter: NoMulMatQ:false}
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr create_gpt_params: loading model /home/jack/gits/llama.cpp/models/codellama-13b-python.Q4_K_S.gguf
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr SIGSEGV: segmentation violation
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr PC=0x7f5937be9fbd m=5 sigcode=1
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr signal arrived during cgo execution
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr goroutine 34 [syscall]:
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr runtime.cgocall(0x81cbd0, 0xc0001815f0)
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc0001815c8 sp=0xc000181590 pc=0x4161cb
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr github.com/go-skynet/go-llama.cpp._Cfunc_load_model(0x7f58b4000b70, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xe, 0x200, ...)
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr _cgo_gotypes.go:266 +0x4c fp=0xc0001815f0 sp=0xc0001815c8 pc=0x8131ac
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr github.com/go-skynet/go-llama.cpp.New({0xc0002a8000, 0x41}, {0xc000110700, 0x8, 0x926a60?})
11:50PM DBG GRPC(codellama-13b-python.Q4_K_S.gguf-127.0.0.1:38915): stderr /home/jack/gits/LocalAI/go-llama/llama.go:39 +0x3aa fp=0xc0001817f0 sp=0xc0001815f0 pc=0x813a6a
[...]
```
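The backtrace shows the fault inside the cgo call `_Cfunc_load_model`, reached from go-llama's `New` at `llama.go:39`, so the crash happens while the bundled go-llama bindings load the model, before any inference runs. As a sanity check that the parameters LocalAI passes are not themselves the trigger, the standalone binary can be re-run with the options reported in the "Loading model with options" line above (context 4096, 14 GPU layers, rope freq base 1e6, 6 threads); a sketch, not something I have run in this exact form:

```sh
# Same llama.cpp main binary as above, but with the parameters LocalAI
# reports passing to the backend (hypothetical sanity check):
./main -t 6 -c 4096 -ngl 14 --rope-freq-base 1e6 --temp 0 \
  -m ~/gits/llama.cpp/models/codellama-13b-python.Q4_K_S.gguf \
  -p 'def is_json_serializable(thing):'
```

If that also runs fine, the parameter marshalling in `create_gpt_params` / the go-llama `New` call looks like a more likely culprit than llama.cpp itself.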