Fix MacOS Sonoma model quantization #4052

TortoiseHam · 2023-11-12T23:14:26Z

A PR to resolve issue #3983.

I'm not sure why unrolling this particular for-loop under -O3 causes problems, but marking the iterator index as volatile prevents that unroll, and I can now successfully quantize the 70B Llama 2 model again on my M1 MBP.
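For readers unfamiliar with the trick, here is a minimal sketch of a loop whose index is declared volatile. The function name `block_abs_sum` and the loop body are hypothetical stand-ins, not the actual loop in ggml-quants.c. A volatile index has to be re-read from memory on every access, which in practice discourages the optimizer from unrolling or vectorizing the loop, though the language standard does not strictly guarantee that.

```c
/* Hypothetical stand-in for a hot quantization loop; not code from the PR. */
float block_abs_sum(const float *x, int n) {
    float sum = 0.0f;
    /* volatile: every read and write of i must really happen, so the
       compiler tends to leave the loop in its rolled form. */
    for (volatile int i = 0; i < n; ++i) {
        float v = x[i];
        sum += v < 0.0f ? -v : v;
    }
    return sum;
}
```

The result is the same as with a plain `int` index; only the generated code differs.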

cebtenzzre · 2023-11-12T23:35:05Z

Could we do this in a way that doesn't affect other platforms?

TortoiseHam · 2023-11-12T23:38:25Z

I don't think this will have any tangible impact on the performance of other platforms. Whether or not the compiler unrolls it, this loop should be fast compared to the total time taken to quantize a model. If anyone has other platforms handy to run a speed test, that would help verify this.

TortoiseHam · 2023-11-12T23:44:05Z

I suppose it would be possible to keep two copies of the loop and put one of them inside an #ifdef for macOS or similar, but that seems like it would be ugly and harder to maintain.

cebtenzzre · 2023-11-13T00:10:55Z

Does it work if you do something like this?

#if ... // clang on macOS Sonoma
#pragma clang loop unroll(disable)
#endif
for (...)

I think it's better to be explicit about what problem we are working around so we aren't afraid of breaking anyone else when it comes time to remove it.

Has anyone tried gcc to see if the same issue happens?

edit: It seems that gcc's Objective-C compiler is not compatible with the macOS system headers.

TortoiseHam · 2023-11-13T00:20:47Z

I tried #pragma loop unroll 1 originally but the 'suggestion' wasn't strong enough to convince the compiler not to unroll.

Edit: Also just tried #pragma clang loop unroll(disable) to be sure, and it likewise doesn't do the job

Co-authored-by: Jared Van Bortel <[email protected]>

TortoiseHam · 2023-11-13T00:56:22Z

Sure, I've updated with the suggested guards. That works on my end too.
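One way to combine the volatile workaround with a platform guard, without duplicating the loop, is to hide the qualifier behind a macro. This is only a sketch: the macro name `LOOP_INDEX_QUALIFIER`, the function `count_nonzero`, and the `__APPLE__` plus `__clang__` condition are all assumptions for illustration, not the guard that was actually committed.

```c
/* Assumed condition for illustration only; the real check in the PR may differ. */
#if defined(__APPLE__) && defined(__clang__)
#define LOOP_INDEX_QUALIFIER volatile
#else
#define LOOP_INDEX_QUALIFIER /* expands to nothing on unaffected platforms */
#endif

/* Hypothetical loop body; counts nonzero entries in a quantized block. */
int count_nonzero(const signed char *q, int n) {
    int count = 0;
    for (LOOP_INDEX_QUALIFIER int i = 0; i < n; ++i) {
        if (q[i] != 0) {
            ++count;
        }
    }
    return count;
}
```

Other platforms compile the ordinary loop, and the workaround can later be removed by deleting the guard alone.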

ggerganov

Thanks for looking into this! So it does seem like an issue with clang.
Would be nice if we can make a minimal standalone repro and submit it.
At least we have a workaround for ggml now which is nice

Co-authored-by: Georgi Gerganov <[email protected]>

cebtenzzre · 2023-11-13T17:47:25Z

I was able to build quantize using gcc on master:

$ gcc-13 --version
gcc-13 (Homebrew GCC 13.2.0) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ export CC=gcc-13 CXX=gcc-13
$ cmake -B build -DCMAKE_BUILD_TYPE=Release -DLLAMA_METAL=OFF -DLLAMA_ACCELERATE=OFF
$ make -C build quantize

And it does in fact quantize without crashing.

cebtenzzre

As mentioned in #3983, this appears to be an issue with the Apple linker (dyld). I want to see if I can reproduce this issue using clang from Homebrew. The preprocessor check we're using may not be accurate.

cebtenzzre · 2023-11-14T04:21:31Z

Homebrew clang uses /usr/bin/ld by default, but with -fuse-ld=lld it will use the LLVM linker from Homebrew which produced a functioning binary for me.
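Because the same compiler can be paired with either linker, compiler-predefined macros cannot tell the two cases apart, so the decision has to come from the build system. The sketch below assumes the configure step detects the affected linker and passes a definition such as `-DHYPOTHETICAL_BUGGY_LINKER`. That macro name, `QUANT_LOOP_INDEX`, and both functions are made up for illustration and are not taken from the PR.

```c
/* HYPOTHETICAL_BUGGY_LINKER is assumed to be defined by the build system
   (for example via -DHYPOTHETICAL_BUGGY_LINKER) when it detects the
   affected linker. The name is an illustration, not the project's macro. */
#ifdef HYPOTHETICAL_BUGGY_LINKER
#define QUANT_LOOP_INDEX volatile
#define WORKAROUND_ENABLED 1
#else
#define QUANT_LOOP_INDEX
#define WORKAROUND_ENABLED 0
#endif

/* Hypothetical loop body; returns the largest absolute value in x. */
float max_abs(const float *x, int n) {
    float best = 0.0f;
    for (QUANT_LOOP_INDEX int i = 0; i < n; ++i) {
        float v = x[i] < 0.0f ? -x[i] : x[i];
        if (v > best) {
            best = v;
        }
    }
    return best;
}

/* Lets callers or tests report which variant was compiled in. */
int workaround_enabled(void) {
    return WORKAROUND_ENABLED;
}
```

Built without the definition, this is the plain loop; a build linked with a working linker such as lld would simply never set the macro.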

also fix an issue with CMake <3.26 compatibility

Co-authored-by: Jared Van Bortel <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]>

Update ggml-quants.c

bf1c8d7

TortoiseHam mentioned this pull request Nov 12, 2023

Can't Quantize gguf files: zsh: illegal hardware instruction on M1 MacBook Pro #3983

Closed

cebtenzzre reviewed Nov 13, 2023

TortoiseHam and others added 2 commits November 12, 2023 16:52

Update ggml-quants.c

7a5e92e

Co-authored-by: Jared Van Bortel <[email protected]>

Update ggml-quants.c

287bc68

cebtenzzre reviewed Nov 13, 2023

increase indentation per 4-spaces rule

5b0d76f

ggerganov approved these changes Nov 13, 2023

Update ggml-quants.c

63c950c

Co-authored-by: Georgi Gerganov <[email protected]>

cebtenzzre requested changes Nov 13, 2023

detect linker version instead of compiler version

7962d0a

cebtenzzre approved these changes Nov 14, 2023

cebtenzzre requested a review from ggerganov November 14, 2023 05:05

use a more conventional macro name

512d974

also fix an issue with CMake <3.26 compatibility

ggerganov approved these changes Nov 14, 2023

cebtenzzre merged commit 6bb4908 into ggml-org:master Nov 14, 2023

KerfuffleV2 pushed a commit to KerfuffleV2/llama.cpp that referenced this pull request Nov 17, 2023

Fix MacOS Sonoma model quantization (ggml-org#4052)

affa88b

Co-authored-by: Jared Van Bortel <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]>

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023

Fix MacOS Sonoma model quantization (ggml-org#4052)

2c48b64

Co-authored-by: Jared Van Bortel <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]>

ggerganov mentioned this pull request Jan 16, 2024

ggml : add IQ2 to test-backend-ops + refactoring #4990

Merged
