Fix usage of F16C intrinsics in AVX code #563

slaren · 2023-03-27T21:38:28Z

Sandy Bridge supports AVX but not F16C. This should fix #562.

@kaufmannr can you verify if this fixes your problem?

ggml.c

kaufmannr

I can confirm that the following compiles on the i5-2400, but I was not able to test the program itself due to other issues:

#if defined(__F16C__)
// the _mm256_cvt intrinsics require F16C
#define GGML_F32Cx8_LOAD(x)     _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
#define GGML_F32Cx8_STORE(x, y) _mm_storeu_si128((__m128i *)(x), _mm256_cvtps_ph(y, 0))
#else
// scalar fallback for CPUs that have AVX but not F16C
static inline __m256 __sse_f16x8_load(ggml_fp16_t *x) {
    float tmp[8];

    // widen each half-precision value to float, then load all 8 at once
    for (int i = 0; i < 8; i++)
        tmp[i] = GGML_FP16_TO_FP32(x[i]);

    return _mm256_loadu_ps(tmp);
}

static inline void __sse_f16x8_store(ggml_fp16_t *x, __m256 y) {
    float arr[8];

    _mm256_storeu_ps(arr, y);

    // narrow each float back to half precision
    for (int i = 0; i < 8; i++)
        x[i] = GGML_FP32_TO_FP16(arr[i]);
}
#define GGML_F32Cx8_LOAD(x)     __sse_f16x8_load(x)
#define GGML_F32Cx8_STORE(x, y) __sse_f16x8_store(x, y)
#endif
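
For anyone who wants to try the fallback idea outside of ggml, here is a minimal standalone sketch of the load direction only. It assumes GCC or Clang (which define `__F16C__` when F16C is enabled) and IEEE 754 binary16 input. The names `half_to_float_scalar` and `half8_to_float8` are made up for this example and are not part of ggml.

```c
#include <stdint.h>
#include <string.h>

#if defined(__F16C__)
#include <immintrin.h>
#endif

/* Hypothetical helper: decode one IEEE 754 binary16 value without F16C. */
static float half_to_float_scalar(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float    out;

    if (exp == 0) {
        /* Zero or subnormal: the value is mant * 2^-24, exact in float. */
        out = (float)mant * 0x1p-24f;
        return sign ? -out : out;
    }
    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, keep the payload bits. */
        bits = sign | 0x7F800000u | (mant << 13);
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    }
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Hypothetical helper: convert 8 halves to 8 floats. Uses the F16C
 * instruction when the compiler was told it is available, and the
 * scalar loop otherwise. */
static void half8_to_float8(const uint16_t *src, float *dst) {
#if defined(__F16C__)
    __m128i packed = _mm_loadu_si128((const __m128i *)src);
    _mm256_storeu_ps(dst, _mm256_cvtph_ps(packed));
#else
    for (int i = 0; i < 8; i++)
        dst[i] = half_to_float_scalar(src[i]);
#endif
}
```

Built with `-mavx` alone this takes the scalar path, and with `-mf16c` it takes the intrinsic path. Both should produce the same floats, since widening half to float is exact.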

anzz1

This should be resolved before merging. I couldn't edit the suggestion above, so the post above is what I am referring to.

anzz1 · 2023-03-28T14:25:19Z

There is one more problem related to this, though: the Makefile is configured to check for processor features, but CMake always enables F16C no matter what:
https://github.com/ggerganov/llama.cpp/blob/7e5395575a3360598f2565c73c8a2ec0c0abbdb8/CMakeLists.txt#L200

I'm not necessarily saying that CMake should be changed to check for the features, but it should print the enabled features or a disclaimer of some sort to reduce the amount of trouble people are having.

Adding an F16C option to CMakeLists for non-Windows-MSVC builds, like the ones that already exist for FMA/AVX/etc., would probably be the right choice.

Anyway, that's a problem for another day. Good that this one is fixed now. For now, we can simply instruct people to use make as the primary option.

edit: #576 should fix this.
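
To illustrate the "print the enabled features" idea, here is a rough sketch of a compile-time report. It assumes GCC or Clang, which define `__AVX__`, `__AVX2__`, `__FMA__` and `__F16C__` when the matching `-m` flags are active; other compilers may not define all of them. The `BUILD_HAS_*` macros and `describe_build_features` are hypothetical names, not something the project provides.

```c
#include <stddef.h>
#include <stdio.h>

/* Map each compiler-defined feature macro to a plain 0/1 constant. */
#if defined(__AVX__)
#define BUILD_HAS_AVX 1
#else
#define BUILD_HAS_AVX 0
#endif

#if defined(__AVX2__)
#define BUILD_HAS_AVX2 1
#else
#define BUILD_HAS_AVX2 0
#endif

#if defined(__FMA__)
#define BUILD_HAS_FMA 1
#else
#define BUILD_HAS_FMA 0
#endif

#if defined(__F16C__)
#define BUILD_HAS_F16C 1
#else
#define BUILD_HAS_F16C 0
#endif

/* Hypothetical helper: writes a one-line summary into buf and returns
 * snprintf's result, i.e. the length the full summary needs. */
static int describe_build_features(char *buf, size_t size) {
    return snprintf(buf, size,
                    "AVX = %d | AVX2 = %d | FMA = %d | F16C = %d",
                    BUILD_HAS_AVX, BUILD_HAS_AVX2,
                    BUILD_HAS_FMA, BUILD_HAS_F16C);
}
```

Printing this line at startup shows which instruction sets the binary was built for. Note that it reports what the compiler was told, not what the CPU supports, so on its own it does not prevent an illegal-instruction crash on older hardware.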

ggml.c

Fix usage of F16C intrinsics in AVX code

ab6ac3d

slaren mentioned this pull request Mar 27, 2023

avx on Core i5-2400 #562

Closed

anzz1 reviewed Mar 28, 2023

ggml.c (resolved)

kaufmannr approved these changes Mar 28, 2023

anzz1 suggested changes Mar 28, 2023

Use more accurate function names

911782c

anzz1 approved these changes Mar 28, 2023

anzz1 merged commit a6bdc47 into ggml-org:master Mar 28, 2023

anzz1 mentioned this pull request Mar 28, 2023

CMake: Add explicit F16C option (x86) #576

Merged

slaren deleted the fix-f16c branch March 28, 2023 17:05

slaren mentioned this pull request Mar 29, 2023

Error: inlining failed in call to ‘always_inline’ ‘_mm256_cvtph_ps’ on x86_64 - better support for different x86_64 CPU instruction extensions #196

Closed

polkovnikov reviewed Mar 30, 2023

ggml.c (resolved)

RiccaDS mentioned this pull request Mar 30, 2023

Make issue. Maybe flags? cocktailpeanut/dalai#302

Open

sw mentioned this pull request May 6, 2023

making on linuxmint 21 #208

Closed
