Name and Version
When running tests against the latest llama.cpp, I noticed crashes on both macOS 12 (Monterey) and 13 (Ventura).
The problem seems to be triggered by the older macOS version, the small amount of VRAM, or possibly both. To pinpoint the exact location of the failure and the resulting crash, I added the following diff:
diff --git a/ggml/src/ggml-metal/ggml-metal-context.m b/ggml/src/ggml-metal/ggml-metal-context.m
index af9ff2143..e327fc152 100644
--- a/ggml/src/ggml-metal/ggml-metal-context.m
+++ b/ggml/src/ggml-metal/ggml-metal-context.m
@@ -294,10 +294,12 @@ void ggml_metal_set_tensor_async(ggml_metal_t ctx, struct ggml_tensor * tensor,
void ggml_metal_get_tensor_async(ggml_metal_t ctx, const struct ggml_tensor * tensor, void * data, size_t offset, size_t size) {
@autoreleasepool {
+ GGML_LOG_INFO("%s XXX calling newBufferWithBytesNoCopy data:%p size:%llu\n", __func__, data, size);
id<MTLBuffer> buf_dst = [ctx->device newBufferWithBytesNoCopy:data
length:size
options:MTLResourceStorageModeShared
deallocator:nil];
+ GGML_ASSERT(buf_dst != nil && "newBufferWithBytesNoCopy failed");
struct ggml_metal_buffer_id bid_src = ggml_metal_get_buffer_id(tensor);
if (bid_src.metal == nil) {
To build a version compatible with older macOS, set
export SDKROOT=/Applications/Xcode_14.1.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
export DEVELOPER_DIR=/Applications/Xcode_14.1.0.app/Contents/Developer
then build with
cmake -B build -DCMAKE_OSX_DEPLOYMENT_TARGET=12.0
cmake --build build --parallel 8
Copy the binaries to a macOS 12 or 13 system with 16 GB (or 8 GB) of RAM, then run:
./llama-cli -m <path to llama3.2 or qwen3>
...
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
ggml_metal_get_tensor_async XXX calling newBufferWithBytesNoCopy data:0x10a910000 size:607744
/Users/ollama/code/llama.cpp/ggml/src/ggml-metal/ggml-metal-context.m:302: GGML_ASSERT(buf_dst != nil && "newBufferWithBytesNoCopy failed") failed
If you add --gpu-layers XX with one layer fewer than a full offload, the ggml_metal_get_tensor_async code path is never exercised, the crash does not occur, and the model works properly.
Operating systems
Mac
GGML backends
Metal
Hardware
tested on Apple M1 Mac mini
Models
tested on llama3.2, qwen3
Problem description & steps to reproduce
./llama-cli -m <path to llama3.2 or qwen3>
...
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
ggml_metal_get_tensor_async XXX calling newBufferWithBytesNoCopy data:0x10a910000 size:607744
/Users/ollama/code/llama.cpp/ggml/src/ggml-metal/ggml-metal-context.m:302: GGML_ASSERT(buf_dst != nil && "newBufferWithBytesNoCopy failed") failed
First Bad Commit
Relevant log output
/Users/ollama/code/llama.cpp/ggml/src/ggml-metal/ggml-metal-context.m:302: GGML_ASSERT(buf_dst != nil && "newBufferWithBytesNoCopy failed") failed