Clean install fails to run any model #5225
I seem to be having the same problem. No models, not even the included ones, are working. I'm using v2.28.0 with an NVIDIA GPU (12 GB VRAM), running the Unraid Docker all-in-one NVIDIA CUDA 12 image. Here's quite a lot of log, all that showed in the log window. It seems it failed with the internal loader, so it tried to use a backend instead and still failed.
6:52PM DBG GRPC(gpt-4-127.0.0.1:40951): stderr runtime.netpollblock(0x4dd3d8?, 0x41dce6?, 0x0?)
6:54PM DBG Prompt (before templating): <|im_start|>user
6:54PM DBG Template found, input modified to: <|im_start|>user
6:54PM DBG Prompt (after templating): <|im_start|>user
6:54PM DBG Stream request received
6:54PM DBG Loading from the following backends (in order): [llama-cpp llama-cpp-fallback piper silero-vad stablediffusion-ggml whisper bark-cpp huggingface /build/backend/python/exllama2/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/kokoro/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/coqui/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/bark/run.sh /build/backend/python/faster-whisper/run.sh /build/backend/python/vllm/run.sh /build/backend/python/transformers/run.sh]
Does the previous version work for you? Did you install from Docker or somewhere else? If you are using NVIDIA you could try v2.27.0-cublas-cuda12-ffmpeg to see if this is a new bug. I'm not sure that in either case the log contains the root cause. Could you upload the full log as an attachment?
I'm not sure who you were replying to, but in my original message I stated how I was running it, which image, and that it was under NVIDIA, so I'll assume it was to the other guy. I gave as much log as I could recover from the virtual machine; it ran beyond the log buffer, so I'll need to obtain more another way if you were asking me for more.
I found part of the problem on my end: it's attempting to load /build/models/localai-functioncall-qwen2.5-7b-v0.5-q4_k_m.gguf, which doesn't exist, even though it's one of the default bundled models. Is the auto download failing? Perhaps the URL it's looking for is broken?
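For anyone hitting the same symptom, a quick way to narrow it down is to check whether every .gguf file referenced by the YAML configs in the models directory actually exists on disk. Below is a rough standalone sketch of such a check in Go; the /build/models path comes from the logs in this thread, while the "scan YAML files for .gguf tokens" heuristic is only an assumption for illustration, not how LocalAI itself resolves model paths.

```go
// check-models.go: scan a LocalAI models directory and report any .gguf file
// that is referenced by a .yaml config but missing on disk.
// The directory path and the "grep the YAML for *.gguf tokens" heuristic are
// assumptions for illustration, not LocalAI's own loading logic.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

func main() {
	modelsDir := "/build/models" // path used inside the container in the logs above

	ggufRef := regexp.MustCompile(`[\w./-]+\.gguf`)

	entries, err := os.ReadDir(modelsDir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read models dir:", err)
		os.Exit(1)
	}

	for _, e := range entries {
		if filepath.Ext(e.Name()) != ".yaml" {
			continue
		}
		data, err := os.ReadFile(filepath.Join(modelsDir, e.Name()))
		if err != nil {
			fmt.Fprintln(os.Stderr, "cannot read", e.Name(), ":", err)
			continue
		}
		for _, ref := range ggufRef.FindAllString(string(data), -1) {
			full := ref
			if !filepath.IsAbs(full) {
				// relative references are assumed to live next to the configs
				full = filepath.Join(modelsDir, filepath.Base(ref))
			}
			if _, err := os.Stat(full); err != nil {
				fmt.Printf("%s references %s: %v\n", e.Name(), ref, err)
			}
		}
	}
}
```

Running this against the models directory (for example from inside the container) should print any config that points at a file that was never downloaded, which matches the symptom above.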
@bstone108 Yes, more logs are needed, and I think you most likely have a different issue. You can also try the previous version to see if it is a regression.
As the OP: the second poster doesn't have the same problem as me. I have a simple install on an x86 PC with no acceleration hardware, and one downloaded model which fails to load, so any suggestions? Attached is the debug log from startup, then opening the local web interface, selecting chat, and asking a question.
I'm facing the same issue after updating.
Switching to a clean install didn't resolve it either. Here is the full stack trace for the "gpt-4" model, starting from when I made the call via the web UI:
It appears the exact point where it goes wrong is that gguf_init_from_file calls ggml_fopen, which calls fopen to open the file read-only. Unfortunately llama.cpp doesn't log which error it failed with, but it is probably the same thing @bstone108 determined: the file path doesn't exist. I have created PR #5276 that may fix this, because there is a typo in the "gpt-4" model definition for the AIO CUDA images. If this is the cause, it should also be possible to avoid it by selecting a different model from the gallery.
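To make that failure mode concrete: since the fopen error is never logged, the only visible symptom is the gRPC EOF, but opening the same path by hand surfaces the underlying cause. Here is a minimal sketch in Go (deliberately not llama.cpp's C code); the path is the one @bstone108 reported and will differ on other installs.

```go
// A small standalone check (not part of LocalAI or llama.cpp) illustrating the
// failure mode described above: opening a model path that does not exist fails
// with ENOENT, which is the error fopen silently swallows in the loader.
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

func main() {
	// Hypothetical path matching the one reported above; adjust to your setup.
	path := "/build/models/localai-functioncall-qwen2.5-7b-v0.5-q4_k_m.gguf"

	f, err := os.Open(path)
	if err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			fmt.Println("model file is missing:", err) // "no such file or directory"
		} else {
			fmt.Println("open failed for another reason:", err)
		}
		return
	}
	defer f.Close()
	fmt.Println("model file is present and readable")
}
```

If this prints "no such file or directory", the problem is the missing or misnamed model file rather than anything in the backend itself.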
LocalAI version:
Latest version, installed yesterday; no idea how to get the tag/commit.
Environment, CPU architecture, OS, and Version:
Linux desktop-garage 4.19.0-12-amd64 #1 SMP Debian 4.19.152-1 (2020-10-18) x86_64 GNU/Linux
Describe the bug
Clean install and download of several models; all of them fail to load.
To Reproduce
Install and attempt to run a model
Expected behavior
I expect a model to load
Logs
10:14AM INF Trying to load the model 'minicpm-v-2_6' with the backend '[llama-cpp llama-cpp-fallback piper silero-vad stablediffusion-ggml whisper bark-cpp huggingface]'
10:14AM INF [llama-cpp] Attempting to load
10:14AM INF BackendLoader starting backend=llama-cpp modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:14AM INF [llama-cpp] attempting to load with AVX variant
10:14AM INF Success ip=10.8.1.10 latency=767.980683ms method=POST status=200 url=/v1/chat/completions
10:14AM INF Success ip=10.8.1.10 latency="29.688µs" method=GET status=200 url=/static/favicon.svg
10:15AM INF Trying to load the model 'minicpm-v-2_6' with the backend '[llama-cpp llama-cpp-fallback whisper bark-cpp piper silero-vad stablediffusion-ggml huggingface]'
10:15AM INF [llama-cpp] Attempting to load
10:15AM INF BackendLoader starting backend=llama-cpp modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:15AM ERR [llama-cpp] Failed loading model, trying with fallback 'llama-cpp-fallback', error: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:15AM INF [llama-cpp] attempting to load with AVX variant
10:15AM ERR [llama-cpp] Failed loading model, trying with fallback 'llama-cpp-fallback', error: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:15AM INF [llama-cpp] Fails: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:15AM INF [llama-cpp-fallback] Attempting to load
10:15AM INF BackendLoader starting backend=llama-cpp-fallback modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:16AM INF [llama-cpp] Fails: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:16AM INF [llama-cpp-fallback] Attempting to load
10:16AM INF BackendLoader starting backend=llama-cpp-fallback modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:16AM INF [llama-cpp-fallback] Fails: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:16AM INF [piper] Attempting to load
10:16AM INF BackendLoader starting backend=piper modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:16AM INF [llama-cpp-fallback] Fails: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:16AM INF [whisper] Attempting to load
10:16AM INF BackendLoader starting backend=whisper modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:17AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:36117: connect: connection refused""
10:17AM INF [piper] Fails: failed to load model with internal loader: grpc service not ready
10:17AM INF [silero-vad] Attempting to load
10:17AM INF BackendLoader starting backend=silero-vad modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:17AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:42421: connect: connection refused""
10:17AM INF [whisper] Fails: failed to load model with internal loader: grpc service not ready
10:17AM INF [bark-cpp] Attempting to load
10:17AM INF BackendLoader starting backend=bark-cpp modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:18AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39127: connect: connection refused""
10:18AM INF [silero-vad] Fails: failed to load model with internal loader: grpc service not ready
10:18AM INF [stablediffusion-ggml] Attempting to load
10:18AM INF BackendLoader starting backend=stablediffusion-ggml modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:19AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:45537: connect: connection refused""
10:19AM INF [bark-cpp] Fails: failed to load model with internal loader: grpc service not ready
10:19AM INF [piper] Attempting to load
10:19AM INF BackendLoader starting backend=piper modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:19AM INF [stablediffusion-ggml] Fails: failed to load model with internal loader: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
10:19AM INF [whisper] Attempting to load
10:19AM INF BackendLoader starting backend=whisper modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
10:19AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39659: connect: connection refused""
10:19AM INF [piper] Fails: failed to load model with internal loader: grpc service not ready
10:19AM INF [silero-vad] Attempting to load
10:19AM INF BackendLoader starting backend=silero-vad modelID=minicpm-v-2_6 o.model=minicpm-v-2_6-Q4_K_M.gguf
Additional context
The basic documentation is very sparse, and the localai.env installed by default does not match the default on GitHub master: all the variables are prefixed with LOCALAI_ in the documentation, but the file installed by default does not use the prefix, e.g. LOCALAI_THREADS vs. THREADS. Also, the installer runs the system as a service, but the docs don't mention this anywhere; instead they say to run 'local-ai run' from the command line, which fails because the service is already running.
I think this may be the same issue as #5216