Commit e767397

ggml : rewrite silu and softmax for cpu

This change upstreams llamafile's vectorized expf() functions. This lets us compute softmax and silu more accurately than the short[65536] lookup table that GGML previously used to make these operations go faster. We can support aarch64 and sse2+ with a worst-case rounding error of 2 ulp. I wrote avx2 and avx512 implementations as well, but they didn't offer enough advantage over sse2+fma to be worth the code complexity.

1 parent f98eb31, commit e767397

1 file changed, +157 -193 lines changed

ggml.c
