Commit e4ff3dc

iq1s_blocks16: uint32_t codebook is also better in CUDA
TG-128 is now 204 t/s, up from 194 t/s. PP-512 is 5890 t/s, significantly better than other quants.
1 parent d090b0c commit e4ff3dc

1 file changed: +278 -525 lines changed
