ggml : revert CUDA broadcast changes from #2183 #2191

ggerganov · 2023-07-12T07:51:52Z

The recent changes from ggml-org/ggml#359 and ggml-org/ggml#373 break inference in llama.cpp. We should reimplement them here and make sure everything works. After that, we can upstream them back to ggml.

li-plus · 2023-07-13T03:32:16Z

@ggerganov Fixed in #2192. Would you take a look at it? Thanks.

ggml : revert CUDA broadcast changes from #2183

95fff69

ggerganov merged commit f7d278f into master Jul 12, 2023

ggerganov deleted the fix-cuda-bcast branch July 12, 2023 07:54

ggerganov mentioned this pull request Jul 12, 2023

Hotfix for the prompt being ignored with CUDA #2190

Closed

li-plus mentioned this pull request Jul 12, 2023

Support broadcast add & mul on CUDA (fixed) #2192

Merged
