
AI Labs Service doesn't use GPU #3431

@gastoner

Description

Bug description

Hi.

I started to use Podman as a replacement for Docker, and I really love it.
I also discovered that there is AI Lab for hosting models, so I started using it. Everything works except the GPU support. When I create a service from a model, it pulls the image and starts a container. But when I then use the container to chat with a model, it only uses the CPU.

(screenshot)

I followed the steps on https://podman-desktop.io/docs/podman/gpu and got this output:

(screenshot of the command output)

But when I create a service from a model, it doesn't use the GPU.
The same happens when I start the Llama Stack and use it: it doesn't use the GPU either.

Operating system

Windows 11

Installation Method

From the Podman Desktop extension page

Version

1.2.x

Steps to reproduce

Just create a service from a model in AI Lab and try to use it.
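A possible sanity check before reproducing (a sketch, assuming an NVIDIA GPU and the CDI setup from the linked GPU docs; the image tag is illustrative):

```shell
# Hypothetical check: confirm the Podman machine can pass the GPU through at all,
# independent of AI Lab. Assumes the NVIDIA Container Toolkit CDI spec has been
# generated as described at https://podman-desktop.io/docs/podman/gpu
podman run --rm --device nvidia.com/gpu=all nvidia/cuda:12.4.1-base-ubi9 nvidia-smi
```

If this prints the GPU table, passthrough works at the machine level and the problem is likely in how AI Lab launches the inference container; if it fails, the CDI setup itself is the issue.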

Relevant log output

ggml_cuda_init: failed to initialize CUDA: no CUDA-capable device is detected
build: 5985 (3f4fc97f) with cc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-7) for x86_64-redhat-linux
system info: n_threads = 8, n_threads_batch = 8, total_threads = 16

system_info: n_threads = 8 (n_threads_batch = 8) / 16 | CUDA : ARCHS = 500,610,700,750,800,860,890 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 | 

main: binding port with default address family
main: HTTP server is listening, hostname: 0.0.0.0, port: 8000, http threads: 15
main: loading m

Additional context

No response
