Commit 789945c

slaren authored and yusiwen committed
ggml-cuda : perform cublas mat mul of quantized types as f16 (ggml-org#3412)
* ggml-cuda : perform cublas matrix multiplication of quantized types as fp16
* rename CC_TURING to CC_VOLTA
* disable fp16 mat mul completely with multi GPU
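
The three points above can be read as one dispatch rule: take the fp16 cuBLAS path only for f16 or quantized weights, only on Volta-class or newer hardware, and only when a single GPU is in use. The following is a minimal host-side sketch of that rule, not the code from this commit. The `TensorType` enum, the `use_fp16_cublas` function, and the `major*100 + minor*10` encoding of compute capability are assumptions made for illustration; only the name `CC_VOLTA` comes from the commit message.

```cpp
// Hypothetical sketch of the fp16 dispatch rule described in the commit message.
// Every name except CC_VOLTA is illustrative and not taken from ggml-cuda.

// Assumed encoding: compute capability 7.0 (Volta) stored as 700.
constexpr int CC_VOLTA = 700;

// Simplified stand-in for the tensor types involved in the mat mul.
enum class TensorType { F32, F16, Quantized };

// Returns true when the fp16 cuBLAS path would be chosen under the
// assumptions stated above.
constexpr bool use_fp16_cublas(TensorType src0_type, int compute_capability, int n_gpus) {
    // Quantized weights are treated like F16 (dequantized to fp16 first).
    const bool type_ok = src0_type == TensorType::F16 ||
                         src0_type == TensorType::Quantized;
    // fp16 tensor-core GEMM is assumed to need Volta or newer.
    const bool hw_ok = compute_capability >= CC_VOLTA;
    // The commit disables the fp16 path entirely with multiple GPUs.
    const bool single_gpu = n_gpus == 1;
    return type_ok && hw_ok && single_gpu;
}

// Compile-time checks of the rule.
static_assert(use_fp16_cublas(TensorType::Quantized, 700, 1), "quantized on Volta, one GPU");
static_assert(!use_fp16_cublas(TensorType::Quantized, 610, 1), "pre-Volta falls back");
static_assert(!use_fp16_cublas(TensorType::F16, 800, 2), "multi GPU falls back");
```

In this sketch a caller would fall back to the existing f32 path whenever the function returns false. The actual condition in the changed file may include further checks, such as tensor contiguity, that are not visible from the commit message.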
1 parent 0d04abb commit 789945c

1 file changed: +122 -72 lines changed