LocalAI version:
2.8.2
Environment, CPU architecture, OS, and Version:
» uname -a
Darwin Dougs-MacBook-Air.local 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:33:31 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T8112 arm64
Describe the bug
ERR Failed starting/connecting to the gRPC service: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:35053: connect: connection refused"
To Reproduce
» docker run -ti --platform linux/amd64 -p 8080:8080 localai/localai:v2.8.2-ffmpeg-core codellama-7b-gguf
» curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "codellama-7b-gguf",
"messages": [{"role": "user", "content": "Please write a function that calculates the first n prime numbers."}],
"temperature": 0.9
}'
Expected behavior
The request should return a chat completion containing the generated code snippet.
Logs
» docker run -ti --platform linux/amd64 -p 8080:8080 localai/localai:v2.8.2-ffmpeg-core codellama-7b-gguf --debug
@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
CPU: no AVX found
CPU: no AVX2 found
CPU: no AVX512 found
@@@@@
6:55PM DBG no galleries to load
6:55PM INF Starting LocalAI using 4 threads, with models path: /build/models
6:55PM INF LocalAI version: v2.8.2 (e690bf387a27de277368e2f742a616e1b2600d5b)
6:55PM WRN [startup] failed resolving model '--debug'
6:55PM INF Preloading models from /build/models
6:55PM INF Downloading "https://huggingface.co/TheBloke/CodeLlama-7B-GGUF/resolve/main/codellama-7b.Q4_K_M.gguf"
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 321.4 MiB/3.8 GiB (8.26%) ETA: 55.553743882s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 711.8 MiB/3.8 GiB (18.29%) ETA: 44.680439228s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 1.0 GiB/3.8 GiB (27.58%) ETA: 39.397551322s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 1.4 GiB/3.8 GiB (37.36%) ETA: 33.538813589s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 1.8 GiB/3.8 GiB (47.02%) ETA: 28.171281943s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 2.2 GiB/3.8 GiB (56.74%) ETA: 22.877781731s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 2.5 GiB/3.8 GiB (66.86%) ETA: 17.34760848s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 2.9 GiB/3.8 GiB (76.16%) ETA: 12.522402138s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 3.3 GiB/3.8 GiB (86.14%) ETA: 7.240267439s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 3.6 GiB/3.8 GiB (95.45%) ETA: 2.38336908s
6:55PM INF File "/build/models/232692e1614183192beee756c58afefc" downloaded and verified
6:55PM INF Model name: codellama-7b-gguf
6:55PM INF Model usage:
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
"model": "codellama-7b-gguf",
"prompt": "import socket\n\ndef ping_exponential_backoff(host: str):"
}'
┌───────────────────────────────────────────────────┐
│ Fiber v2.50.0 │
│ http://127.0.0.1:8080 │
│ (bound on host 0.0.0.0 and port 8080) │
│ │
│ Handlers ............ 73 Processes ........... 1 │
│ Prefork ....... Disabled PID ................ 53 │
└───────────────────────────────────────────────────┘
6:55PM INF Loading model '232692e1614183192beee756c58afefc' with backend transformers
6:56PM ERR Failed starting/connecting to the gRPC service: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:32949: connect: connection refused"
Additional context
This is my first time trying to start the project.
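For anyone triaging: `connection refused` on the gRPC dial means the llama backend process never bound its port, i.e. it most likely crashed on startup. That would fit running an amd64 image under emulation on an M-series Mac (note the `no AVX` lines in the log above). A small sketch for confirming this inside the container — extract the port from the error line and check for a listener (the `ss` command is an assumption; availability varies by image):

```shell
# Pull the gRPC port out of the error message with sed.
err='ERR Failed starting/connecting to the gRPC service: rpc error: dial tcp 127.0.0.1:32949: connect: connection refused'
port=$(printf '%s\n' "$err" | sed -n 's/.*127\.0\.0\.1:\([0-9][0-9]*\).*/\1/p')
echo "$port"   # 32949

# Then, inside the container, check whether anything listens on that port:
#   ss -ltn "sport = :$port"
# No listener while the log shows "Loading model ... with backend" would
# confirm the backend died before binding its socket.
```

If the backend is crashing, the rebuild hint printed in the banner (`REBUILD=true` plus `CMAKE_ARGS` to disable instruction sets) would be the next thing to try, per the linked build docs.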
virdb