Commit 5a5f8b1

Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)
* Enable Fused-Multiply-Add (FMA) instructions on MSVC

  __FMA__ macro does not exist in MSVC

* Enable F16C/CVT16 vector extensions on MSVC

  __F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512

* MSVC cvt intrinsics

* Add __SSE3__ macro for MSVC too because why not

  even though it's not currently used for anything when AVX is defined
1 parent f121705 commit 5a5f8b1
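
To make the effect of the macro shim described above easier to see, here is a minimal sketch that reports which of the three feature macros are visible at compile time. The helper names (`have_fma`, `have_f16c`, `have_sse3`, `print_simd_config`) are hypothetical and are not part of ggml.c; the `#if` block at the top mirrors the one this commit adds. On GCC or Clang the results depend on flags such as `-mfma` and `-mf16c`; on MSVC they depend on `/arch:AVX2` or `/arch:AVX512`.

```c
#include <stdio.h>

/* Same idea as the commit: MSVC does not define these macros itself,
 * so derive them from the AVX2/AVX512 macros it does define. */
#if defined(_MSC_VER) && (defined(__AVX2__) || defined(__AVX512F__))
#ifndef __FMA__
#define __FMA__
#endif
#ifndef __F16C__
#define __F16C__
#endif
#ifndef __SSE3__
#define __SSE3__
#endif
#endif

/* Hypothetical helpers (not in ggml.c): 1 if the macro is defined, else 0. */
static int have_fma(void) {
#ifdef __FMA__
    return 1;
#else
    return 0;
#endif
}

static int have_f16c(void) {
#ifdef __F16C__
    return 1;
#else
    return 0;
#endif
}

static int have_sse3(void) {
#ifdef __SSE3__
    return 1;
#else
    return 0;
#endif
}

/* Print the compile-time configuration on one line. */
static void print_simd_config(FILE *out) {
    fprintf(out, "FMA=%d F16C=%d SSE3=%d\n",
            have_fma(), have_f16c(), have_sse3());
}
```

Calling `print_simd_config(stdout)` from a small test program before and after adding the shim is one way to check that an MSVC build picks up the FMA and F16C code paths.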

1 file changed: +18 -0 lines changed

ggml.c

Lines changed: 18 additions & 0 deletions
@@ -79,6 +79,19 @@ static int sched_yield (void) {
 typedef void* thread_ret_t;
 #endif
 
+// __FMA__ and __F16C__ are not defined in MSVC, however they are implied with AVX2/AVX512
+#if defined(_MSC_VER) && (defined(__AVX2__) || defined(__AVX512F__))
+#ifndef __FMA__
+#define __FMA__
+#endif
+#ifndef __F16C__
+#define __F16C__
+#endif
+#ifndef __SSE3__
+#define __SSE3__
+#endif
+#endif
+
 #ifdef __HAIKU__
 #define static_assert(cond, msg) _Static_assert(cond, msg)
 #endif
@@ -172,8 +185,13 @@ typedef double ggml_float;
 
 #ifdef __F16C__
 
+#ifdef _MSC_VER
+#define GGML_COMPUTE_FP16_TO_FP32(x) _mm_cvtss_f32(_mm_cvtph_ps(_mm_cvtsi32_si128(x)))
+#define GGML_COMPUTE_FP32_TO_FP16(x) _mm_extract_epi16(_mm_cvtps_ph(_mm_set_ss(x), 0), 0)
+#else
 #define GGML_COMPUTE_FP16_TO_FP32(x) _cvtsh_ss(x)
 #define GGML_COMPUTE_FP32_TO_FP16(x) _cvtss_sh(x, 0)
+#endif
 
 #elif defined(__POWER9_VECTOR__)
 