ERR Failed starting/connecting to the gRPC service #1721

@doug-wade

Description

LocalAI version:
2.8.2

Environment, CPU architecture, OS, and Version:

» uname -a
Darwin Dougs-MacBook-Air.local 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:33:31 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T8112 arm64

Describe the bug

ERR Failed starting/connecting to the gRPC service: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:35053: connect: connection refused"

To Reproduce

» docker run -ti --platform linux/amd64 -p 8080:8080 localai/localai:v2.8.2-ffmpeg-core codellama-7b-gguf
» curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
     "model": "codellama-7b-gguf",
     "messages": [{"role": "user", "content": "Please write a function that calculates the first n prime numbers."}],
     "temperature": 0.9
   }'
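
Before sending the chat request, it can help to confirm the API is up and the model finished downloading; a quick sanity check using the OpenAI-compatible models endpoint, assuming the default port mapping above:

» curl http://localhost:8080/v1/models

If the model shows up in the returned list, the HTTP server itself is healthy and the failure is isolated to the backend gRPC process.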

Expected behavior
The request should return a chat completion containing the requested code snippet.

Logs

» docker run -ti --platform linux/amd64 -p 8080:8080 localai/localai:v2.8.2-ffmpeg-core codellama-7b-gguf --debug
@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
CPU: no AVX    found
CPU: no AVX2   found
CPU: no AVX512 found
@@@@@
6:55PM DBG no galleries to load
6:55PM INF Starting LocalAI using 4 threads, with models path: /build/models
6:55PM INF LocalAI version: v2.8.2 (e690bf387a27de277368e2f742a616e1b2600d5b)
6:55PM WRN [startup] failed resolving model '--debug'
6:55PM INF Preloading models from /build/models
6:55PM INF Downloading "https://huggingface.co/TheBloke/CodeLlama-7B-GGUF/resolve/main/codellama-7b.Q4_K_M.gguf"
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 321.4 MiB/3.8 GiB (8.26%) ETA: 55.553743882s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 711.8 MiB/3.8 GiB (18.29%) ETA: 44.680439228s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 1.0 GiB/3.8 GiB (27.58%) ETA: 39.397551322s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 1.4 GiB/3.8 GiB (37.36%) ETA: 33.538813589s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 1.8 GiB/3.8 GiB (47.02%) ETA: 28.171281943s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 2.2 GiB/3.8 GiB (56.74%) ETA: 22.877781731s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 2.5 GiB/3.8 GiB (66.86%) ETA: 17.34760848s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 2.9 GiB/3.8 GiB (76.16%) ETA: 12.522402138s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 3.3 GiB/3.8 GiB (86.14%) ETA: 7.240267439s
6:55PM INF Downloading /build/models/232692e1614183192beee756c58afefc.partial: 3.6 GiB/3.8 GiB (95.45%) ETA: 2.38336908s
6:55PM INF File "/build/models/232692e1614183192beee756c58afefc" downloaded and verified
6:55PM INF Model name: codellama-7b-gguf
6:55PM INF Model usage:
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
    "model": "codellama-7b-gguf",
    "prompt": "import socket\n\ndef ping_exponential_backoff(host: str):"
}'

 ┌───────────────────────────────────────────────────┐
 │                   Fiber v2.50.0                   │
 │               http://127.0.0.1:8080               │
 │       (bound on host 0.0.0.0 and port 8080)       │
 │                                                   │
 │ Handlers ............ 73  Processes ........... 1 │
 │ Prefork ....... Disabled  PID ................ 53 │
 └───────────────────────────────────────────────────┘

6:55PM INF Loading model '232692e1614183192beee756c58afefc' with backend transformers
6:56PM ERR Failed starting/connecting to the gRPC service: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:32949: connect: connection refused"
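
The startup banner above suggests setting REBUILD=true when the pre-compiled backends misbehave, and the warning shows that '--debug' was parsed as a model name rather than a flag. A sketch of a retry that follows those hints, assuming LocalAI honors the REBUILD and DEBUG environment variables as described in its build docs (https://localai.io/basics/build/index.html):

» docker run -ti --platform linux/amd64 -p 8080:8080 \
    -e DEBUG=true -e REBUILD=true \
    localai/localai:v2.8.2-ffmpeg-core codellama-7b-gguf

Rebuilding inside the container is slow but produces binaries matched to the emulated CPU, which may matter here since the log reports no AVX/AVX2/AVX512 support under linux/amd64 emulation on this arm64 host.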

Additional context
This is my first time trying to start the project.
