Description
LocalAI version:
v1.40.0-cublas-cuda11
Environment, CPU architecture, OS, and Version:
Linux pop-os 6.0.12-76060012-generic #202212290932167406645920.04~3cd2bf3-Ubuntu SMP PREEMPT_DYNAMI x86_64 x86_64 x86_64 GNU/Linux
Describe the bug
I successfully installed a model by running:
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
"id": "huggingface@TheBloke/Yarn-Mistral-7B-128k-GGUF/yarn-mistral-7b-128k.Q5_K_M.gguf"
}'
After that, the following request just hangs and no response ever comes back:
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "thebloke__yarn-mistral-7b-128k-gguf__yarn-mistral-7b-128k.q5_k_m.gguf",
"messages": [{"role": "user", "content": "Say this is a test!"}],
"temperature": 0.1
}'
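To make the hang easier to reproduce and report, the same request can be wrapped with a time limit so that it fails with an exit code instead of blocking forever. This is only a minimal sketch: `build_payload` and `chat_with_timeout` are hypothetical helper names (not part of LocalAI), and the 60-second limit in the usage line is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: builds the same JSON body as the request above.
# $1 = model name, $2 = user message
# Assumption: neither argument contains quotes or backslashes (no JSON escaping is done).
build_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "temperature": 0.1}' "$1" "$2"
}

# Hypothetical helper: sends the chat completion request, giving up after $3 seconds.
# curl exits with code 28 when --max-time is exceeded.
chat_with_timeout() {
  curl --silent --show-error --max-time "$3" \
    http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "$2")"
}

# Usage (not executed here):
# chat_with_timeout "thebloke__yarn-mistral-7b-128k-gguf__yarn-mistral-7b-128k.q5_k_m.gguf" \
#   "Say this is a test!" 60 || echo "curl exit code: $?"
```

If the server never answers, an exit code of 28 would confirm that the request timed out rather than being refused or rejected.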
To Reproduce
Run the Docker container as follows:
docker run -d --rm \
  --name api \
  --gpus '"device=1"' \
  -p 8080:8080 \
  --env-file .env \
  -v $(pwd)/models:/models \
  -v $(pwd)/images:/tmp/generated/images \
  quay.io/go-skynet/local-ai:v1.40.0-cublas-cuda11 \
  /usr/bin/local-ai
Expected behavior
I'm expecting to get a response, but nothing is happening.
Logs
logs.txt
Additional context
Here is my .env:
GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]
MODELS_PATH=/models
DEBUG=true
BUILD_TYPE=cublas
I have also tried running without a GPU, but the same thing keeps happening. Any ideas?