Description
Filing this as a bug report because I'm not sure the PR is still watched after the merge.
Since PR #4205 there is a segfault on Windows and on WSL when using llava-cli (in clip.cpp).
It looks like heap corruption: the crash is triggered as soon as ctx0 in the build_graph is free()'d, but the corruption has already happened before that point (removing the free only shifts the segfault location).
- I tried two different standard llava models, both worked with the previous clip.
- I tried the precompiled release exe as well as compilations on WSL and on Windows
- I re-converted a llava model from start - just to make sure
It happens in both CPU and GPU mode. Maybe the cause is in the way memory buffer sizes are measured, since that has changed significantly.
It seems related to the new backend buffer.
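To make the "buffer sizes are measured" suspicion concrete, here is a minimal sketch of a generic measure-then-allocate scheme. It is a toy, not the ggml-alloc API: `ToyAllocr`, `measure_sizes` and `fits` are hypothetical names invented for illustration. The point is only that the real pass stays inside the buffer when the measuring pass used the same sizes and the same alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy allocator for illustration only; NOT the ggml-alloc API.

// Round n up to the next multiple of align (align must be a power of two).
static size_t align_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

struct ToyAllocr {
    size_t alignment;
    size_t capacity; // ignored in measure mode
    size_t offset;   // end of the last reservation
    bool   measure;  // true: only count bytes, never fail

    // Reserve `size` bytes at the next aligned offset.
    // Returns the start offset, or (size_t)-1 if a real buffer would overflow.
    size_t alloc(size_t size) {
        const size_t start = align_up(offset, alignment);
        const size_t end   = start + size;
        if (!measure && end > capacity) {
            return static_cast<size_t>(-1);
        }
        offset = end;
        return start;
    }
};

// Pass 1: walk the tensor sizes in measure mode to get the required buffer size.
static size_t measure_sizes(const std::vector<size_t> & sizes, size_t alignment) {
    ToyAllocr m{alignment, 0, 0, true};
    for (size_t s : sizes) {
        m.alloc(s);
    }
    return align_up(m.offset, alignment);
}

// Pass 2: replay the same sizes against a buffer of fixed capacity.
static bool fits(const std::vector<size_t> & sizes, size_t alignment, size_t capacity) {
    ToyAllocr a{alignment, capacity, 0, false};
    for (size_t s : sizes) {
        if (a.alloc(s) == static_cast<size_t>(-1)) {
            return false;
        }
    }
    return true;
}
```

If the measuring pass sees a different graph, or a different alignment, than the real pass, the measured size is too small and `fits` returns false. In a real allocator without that check, the same mismatch would show up as writes past the end of the buffer.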
Example command:
./bin/llava-cli -m /mnt/q/models/llava/liuhaotianllava-v1.5-7b/ggml-model-q3_k --mmproj /mnt/q/models/llava/liuhaotianllava-v1.5-7b/mmproj-model-f16.gguf --image /mnt/c/temp/tmp.png
clip_model_load: model name: openai/clip-vit-large-patch14-336
clip_model_load: description: image encoder for LLaVA
clip_model_load: GGUF version: 2
clip_model_load: alignment: 32
clip_model_load: n_tensors: 377
clip_model_load: n_kv: 18
clip_model_load: ftype: f16
clip_model_load: CLIP using CPU backend
clip_model_load: text_encoder: 0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector: 1
clip_model_load: model size: 595.53 MB
clip_model_load: metadata size: 0.14 MB
clip_model_load: params backend buffer size = 595.53 MB (377 tensors)
Segmentation fault
Valgrind didn't show more than what I had already seen: the free is causing an issue.
==4434== Warning: set address range perms: large range [0x5184040, 0x2a50c500) (undefined)
==4434== Invalid read of size 8
==4434== at 0x217C15: ggml_allocr_alloc_graph (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x14CA6A: clip_model_load (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x11F39A: main (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== Address 0x4ec98b8 is 72 bytes inside a block of size 884,880 free'd
==4434== at 0x48399AB: free (vg_replace_malloc.c:538)
==4434== by 0x1F2DC8: ggml_free (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x126928: clip_image_build_graph(clip_ctx const*, clip_image_f32_batch const*) (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x14CA5B: clip_model_load (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x11F39A: main (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== Block was alloc'd at
==4434== at 0x483AEB8: memalign (vg_replace_malloc.c:906)
==4434== by 0x483AFCE: posix_memalign (vg_replace_malloc.c:1070)
==4434== by 0x1F29FB: ggml_init (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x125D8C: clip_image_build_graph(clip_ctx const*, clip_image_f32_batch const*) (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x14CA5B: clip_model_load (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
==4434== by 0x11F39A: main (in /mnt/q/vanilla/llama.cpp/build_linux/bin/llava-cli)
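One possible reading of the trace above (an assumption, not a confirmed diagnosis) is a lifetime problem: the block that ggml_allocr_alloc_graph reads was allocated by ggml_init and released by ggml_free inside clip_image_build_graph before the caller used it. The sketch below shows that pattern with toy types. `ToyContext`, `ToyGraph` and both builder functions are hypothetical and do not exist in ggml or clip.cpp. A `std::weak_ptr` stands in for the raw pointer so the stale access can be detected instead of being undefined behavior.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a context that owns the memory of everything built in it.
struct ToyContext {
    std::vector<int> nodes;
};

// Hypothetical stand-in for a graph that lives inside a context's memory.
struct ToyGraph {
    std::weak_ptr<ToyContext> owner; // non-owning view of the context

    // Number of nodes, or -1 if the owning context has already been released.
    int n_nodes() const {
        std::shared_ptr<ToyContext> ctx = owner.lock();
        return ctx ? static_cast<int>(ctx->nodes.size()) : -1;
    }
};

// Shape suggested by the trace: build in a temporary context, release it, return the graph.
static ToyGraph build_graph_then_free() {
    std::shared_ptr<ToyContext> ctx = std::make_shared<ToyContext>();
    ctx->nodes = {1, 2, 3};
    ToyGraph g{ctx};
    ctx.reset(); // plays the role of freeing the context before returning
    return g;    // the graph now refers to released memory
}

// Safer shape: the caller keeps the context alive for as long as it uses the graph.
static ToyGraph build_graph_keep_alive(std::shared_ptr<ToyContext> & keep) {
    keep = std::make_shared<ToyContext>();
    keep->nodes = {1, 2, 3};
    return ToyGraph{keep};
}
```

With raw pointers the first variant does not fail reliably. Whether it crashes depends on what the allocator does with the released block, which might explain a bug that reproduces on some machines and not on others.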
I'm a bit puzzled: I can reproduce it on two "platforms" and with a clean rebuild, but the change has clearly been in the tree for a week or two, so it apparently does not happen for everyone.