Closed
Description
Windows 11 fully updated.
WSL2 updated.
Docker Desktop for Windows (latest); GPU works in Docker. nvidia-smi reports the GPU fine.
NVIDIA CUDA Toolkit 12.2.
Newest Nvidia drivers.
RTX 3080 10GB VRAM.
AMD Ryzen 7 5800X3D, 32GB RAM.
No other GPU software running.
At first all looks good: the model loads and is serving. After some time, memory utilization grows to 10GB, GPU load then stays at 100% for prolonged periods, and the model times out. The only fix I have found is restarting the Docker container.
Actually, VRAM use rises to 10GB pretty quickly. This is with the 1.6B Refact.ai model.
The container runs 'thenlper/gte-base' as well. When I delete it to free a little VRAM, responsiveness comes back, but only for a couple more queries.
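To put numbers on the VRAM growth, something like the following could be left running next to the container. It is only a rough sketch: it assumes `nvidia-smi` is on the PATH where the script runs (WSL2 or inside the container), and the function names `parse_gpu_line`, `sample`, and `monitor` are made up for illustration, not part of Refact or any NVIDIA library.

```python
import subprocess
import time

# Standard nvidia-smi query flags; "nounits" gives plain numbers (MiB and %).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line):
    """Parse one CSV line such as '9876, 10240, 100' into a dict.

    Assumes all three fields are numeric; a field reported as '[N/A]'
    would raise ValueError here.
    """
    used, total, util = (int(field.strip()) for field in line.split(","))
    return {"used_mib": used, "total_mib": total, "util_pct": util}


def sample():
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


def monitor(interval_s=5.0, samples=12):
    """Print VRAM use and GPU load every interval_s seconds, samples times."""
    for _ in range(samples):
        stamp = time.strftime("%H:%M:%S")
        for idx, gpu in enumerate(sample()):
            print(
                f"{stamp} gpu{idx}: "
                f"{gpu['used_mib']}/{gpu['total_mib']} MiB, "
                f"{gpu['util_pct']}% load"
            )
        time.sleep(interval_s)
```

Calling `monitor()` while sending completions from the IDE should show whether VRAM climbs steadily with each query or jumps to the 10GB limit in one step.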
Client: Refact.ai plugin for JetBrains IDEA.