violent crash on Mac Mini M2 8GB RAM when trying to use GPU #2141


Closed
siddhsql opened this issue Jul 7, 2023 · 6 comments

siddhsql commented Jul 7, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I have an M2 Mac Mini with 8 GB unified memory. I tried to run llama.cpp as explained here: #1642

Current Behavior

My computer froze and rebooted after some time. I got a brief flash of a pink screen of death. I retried several times and got the same behavior. Once, instead of crashing, I got an assert in the following code:

for (int i = 0; i < n_cb; i++) {
    MTLCommandBufferStatus status = (MTLCommandBufferStatus) [command_buffers[i] status];
    if (status != MTLCommandBufferStatusCompleted) {
        fprintf(stderr, "%s: command buffer %d failed with status %lu\n", __func__, i, status);
        GGML_ASSERT(false);
    }
}

Each time I was able to see console output saying it's trying to load GPU buffers, similar to what we see in the video on #1642.

The model I was trying is gpt4-x-vicuna-13B.ggmlv3.q5_K_M.bin

MODEL=gpt4-x-vicuna-13B.ggmlv3.q5_K_M.bin
CONTEXT_SIZE=2048
PROMPT="$SCRIPT_DIR/../prompts/chat-with-bob.txt"

cd "$SCRIPT_DIR/../build/release/bin"
set -x
./main \
  -m "$MODEL" \
  -c $CONTEXT_SIZE \
  --repeat_penalty 1.0 \
  --color \
  -i \
  -r "User:" \
  --in-prefix " " \
  -f "$PROMPT"

Over here I see a funny comment:

yes - this is fixed now that this crashes instead of giving bad output

How is crashing acceptable behaviour?

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

M2 Mac Mini w/ 8 GB memory

  • Operating System, e.g. for Linux:
± uname -a
Darwin 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:21:34 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T8112 arm64
  • SDK version, e.g. for Linux:
$ python3 --version
$ make --version
$ g++ --version

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

Steps to Reproduce

See above.

Failure Logs

@siddhsql siddhsql changed the title violent crash on Mac Mini M2 8GB RAM violent crash on Mac Mini M2 8GB RAM when trying to use GPU Jul 7, 2023
@BarfingLemurs
Contributor

@siddhsql you have a simple problem: you don't have enough RAM (only 8 GB) to run 13B models. See the sizes in the README. You can run a 7B model fine.

@siddhsql
Author

siddhsql commented Jul 8, 2023 via email

@philipturner

This behavior is typical of Apple Mac GPUs whenever there's an infinite loop: the GPU freezes and won't exit the command. After you reset the computer nothing is broken; it looks much scarier than it is.

@siddhsql
Author

siddhsql commented Aug 2, 2023 via email

@philipturner

It doesn't have to be an infinite loop. Sometimes giving the GPU too heavy a compute workload causes it to go rogue as well. Often some kind of fault happened internally, for example an out-of-bounds memory access.

On iOS, the GPU goes rogue less often, because a watchdog aborts very long command buffers. On Mac, you have to restart the entire computer. Mac is probably this way to give more flexibility (you don't have to actively check whether all your command buffers will fall under 100 ms).

@github-actions github-actions bot added the stale label Mar 25, 2024

github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 9, 2024