VRAM memory leak for Refact.AI 1.6B #332

@tawek

Windows 11 fully updated.
WSL2 updated.
Docker Desktop for Windows, latest version. GPU works in Docker; nvidia-smi reports the GPU fine.
NVIDIA CUDA Toolkit 12.2.
Newest Nvidia drivers.
RTX 3080 10GB VRAM.
AMD Ryzen 7 5800X3D, 32GB RAM.
No other GPU software running.

At first everything looks fine: the model loads and serves requests. VRAM usage then climbs to 10GB fairly quickly, after which GPU load stays at 100% for prolonged periods and the model times out. The only fix I have found is restarting the Docker container.
This is with the 1.6B Refact.ai model.

The Docker container also runs 'thenlper/gte-base'. When I delete it to free a little VRAM, responsiveness comes back, but only for a couple more queries.
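
To put numbers on the growth, VRAM use could be logged over time while sending completions. This is only a rough sketch, not something from the Refact server itself: it assumes `nvidia-smi` is on the PATH of wherever it runs (WSL2 or inside the container), and the helper names `parse_sample`, `read_gpu` and `monitor` are made up for illustration.

```python
import subprocess
import time

# Fields requested from nvidia-smi, reported in MiB, MiB and percent.
QUERY = "memory.used,memory.total,utilization.gpu"


def parse_sample(line):
    """Parse one CSV line such as '9876, 10240, 100' into a dict of ints."""
    used, total, util = (int(field.strip()) for field in line.split(","))
    return {"used_mib": used, "total_mib": total, "util_pct": util}


def read_gpu(index=0):
    """Query a single GPU through nvidia-smi (assumed to be installed)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            f"--id={index}",
            f"--query-gpu={QUERY}",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sample(result.stdout.strip().splitlines()[0])


def monitor(samples=60, interval_s=5.0):
    """Print a timestamped reading every interval_s seconds, samples times."""
    for _ in range(samples):
        s = read_gpu()
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {s['used_mib']}/{s['total_mib']} MiB, {s['util_pct']}% util")
        time.sleep(interval_s)
```

Calling `monitor()` in one terminal while using the plugin in the IDE should show whether memory climbs with every query or jumps at a particular point.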

The client is the Refact.AI plugin for JetBrains IDEA.
