
llama-cpp-python not using GPU on M1 #756

Open
@agunapal

Description


Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I am using llama-cpp-python on an M1 Mac.
I installed it with the CMake flag mentioned in the README.

I am able to run inference, but I notice that it mostly uses the CPU.
I expected it to use the GPU.
[Screenshot: 2023-09-26 at 11:59 AM]

How do I make sure llama-cpp-python uses the GPU on an M1 Mac?
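For reference, this is the kind of install command I followed from the README (a sketch for llama-cpp-python 0.2.x; the `LLAMA_METAL` flag name is from that era of the README, and newer releases may use a different flag):

```shell
# Force a clean rebuild with the Metal backend enabled so inference
# can be offloaded to the Apple GPU instead of running on the CPU.
CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```
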

Current Behavior

As above: inference runs correctly, but it mostly uses the CPU rather than the GPU.

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

M1 Mac
llama_cpp_python          0.2.6
Python 3.9.0
macOS 13.5.2

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. Load a llama2-7b 4-bit quantized model
  2. Run inference
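The two steps above can be sketched roughly like this (the model path is hypothetical; on a Metal build, passing `n_gpu_layers` with any value >= 1 is what offloads the layers to the GPU, which is easy to miss since it defaults to 0):

```python
# Arguments for loading the model (the path is a hypothetical example).
params = {
    "model_path": "./llama-2-7b.Q4_K_M.gguf",  # local 4-bit GGUF model file
    "n_gpu_layers": 1,  # on Metal, any value >= 1 offloads layers to the GPU
}

try:
    from llama_cpp import Llama

    llm = Llama(**params)  # step 1: load the quantized model
    out = llm("Q: Name a planet. A:", max_tokens=8)  # step 2: run inference
    print(out["choices"][0]["text"])
except ImportError:
    pass  # llama-cpp-python not installed here; shown for illustration
```

If the Metal backend is active, the load log should mention `ggml_metal_init`; if it doesn't, the wheel was likely built without Metal and needs to be reinstalled with the CMake flag.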


Labels: bug (Something isn't working), build
