vulkan: Additional type support for unary, binary, and copy #13266

jeffbolznv · 2025-05-02T15:07:25Z

Support f16->f32 copy.
Support f16->f16 and f32->f32 unary ops.
Support all combinations of f16/f32 for src0/src1/dst for add/sub/mul/div.
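
To make the size of that last item concrete, here is a minimal sketch that enumerates every f16/f32 combination of src0/src1/dst for the four binary ops. All names in it (`ElemType`, `BinaryVariant`, `variant_index`, `variant_label`, `enumerate_binary_variants`) are hypothetical and chosen for illustration. They are not identifiers from the Vulkan backend, and the sketch does not show how the backend actually selects its pipelines.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names for illustration only; not the Vulkan backend's identifiers.
enum class ElemType { F32 = 0, F16 = 1 };

struct BinaryVariant {
    std::string op;
    ElemType src0;
    ElemType src1;
    ElemType dst;
};

inline const char * type_name(ElemType t) {
    return t == ElemType::F16 ? "f16" : "f32";
}

// Map a (src0, src1, dst) type triple to a slot in a flattened 2x2x2 table.
inline int variant_index(ElemType src0, ElemType src1, ElemType dst) {
    const int idx = static_cast<int>(src0) * 4 + static_cast<int>(src1) * 2 + static_cast<int>(dst);
    assert(idx >= 0 && idx < 8);
    return idx;
}

// Build a readable label of the form op_src0_src1_dst.
inline std::string variant_label(const BinaryVariant & v) {
    return v.op + "_" + type_name(v.src0) + "_" + type_name(v.src1) + "_" + type_name(v.dst);
}

// Enumerate every src0/src1/dst combination of f16/f32 for each binary op.
inline std::vector<BinaryVariant> enumerate_binary_variants() {
    const std::array<const char *, 4> ops = {"add", "sub", "mul", "div"};
    const std::array<ElemType, 2> types = {ElemType::F32, ElemType::F16};
    std::vector<BinaryVariant> out;
    for (const char * op : ops) {
        for (ElemType s0 : types) {
            for (ElemType s1 : types) {
                for (ElemType d : types) {
                    out.push_back(BinaryVariant{op, s0, s1, d});
                }
            }
        }
    }
    return out;
}
```

Under this labeling, an add with an f32 src0, an f16 src1, and an f32 dst is `add_f32_f16_f32`. Two types in three positions give 8 combinations per op, so the four ops together cover 32 variants.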

Another engineer at NVIDIA was writing some code using the ggml API and quickly ran into cases we didn't support in the Vulkan backend. He wanted to do an add with types f32=f32+f16. When that didn't work, he tried converting f16->f32 first, which also failed because of a missing copy variant. relu with f16 was unsupported as well.

0cc4m

LGTM

Commit 2749ef5: vulkan: Additional type support for unary, binary, and copy

jeffbolznv requested a review from 0cc4m May 2, 2025 15:07

github-actions bot added the labels Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) on May 2, 2025

0cc4m approved these changes May 4, 2025

0cc4m merged commit 8ae5ebc into ggml-org:master May 4, 2025
51 checks passed

joesixpaq mentioned this pull request May 4, 2025

Eval bug: Can't run Qwen3-32B Q4_K_XL #13298

Closed
