Commit 789945c

ggml-cuda : perform cublas mat mul of quantized types as f16 (ggml-org#3412)

* ggml-cuda : perform cublas matrix multiplication of quantized types as fp16
* rename CC_TURING to CC_VOLTA
* disable fp16 mat mul completely with multi GPU
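As a rough illustration of how the three points above could combine into one dispatch decision, here is a minimal host-side sketch. It is not the code from this commit. The names `use_fp16_mat_mul`, `TensorType`, and `CC_VOLTA_SKETCH` are hypothetical, and the encoding of compute capability as `100 * major + 10 * minor` (so Volta 7.0 becomes 700) is an assumption.

```cpp
// Hypothetical sketch: decide whether a matrix multiplication should take
// the fp16 cuBLAS path. None of these names come from the commit itself.

// Assumption: compute capability is encoded as 100 * major + 10 * minor,
// so Volta (7.0) maps to 700.
constexpr int CC_VOLTA_SKETCH = 700;

// Hypothetical classification of the weight tensor (src0) type.
enum class TensorType { F32, F16, Quantized };

// Returns true when this sketch would route the multiplication through an
// fp16 GEMM, false when it would fall back to another path.
bool use_fp16_mat_mul(TensorType src0_type, int compute_capability, int device_count) {
    // "disable fp16 mat mul completely with multi GPU"
    if (device_count > 1) {
        return false;
    }
    // Assumed threshold: fp16 path only on Volta or newer.
    if (compute_capability < CC_VOLTA_SKETCH) {
        return false;
    }
    // Quantized types are treated like f16: in the real code they would be
    // converted to fp16 before the GEMM call.
    return src0_type == TensorType::F16 || src0_type == TensorType::Quantized;
}
```

Under these assumptions, `use_fp16_mat_mul(TensorType::Quantized, 700, 1)` selects the fp16 path, while the same call with two devices does not. The actual conditions used in `ggml-cuda.cu` may differ.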

1 parent 0d04abb commit 789945c

1 file changed, +122 -72 lines changed

ggml-cuda.cu


Comments (0)
