CUDA: add conv_2d_transpose #14287


Merged

3 commits merged into ggml-org:master from add_conv2d_transpose on Jun 20, 2025
Conversation

am17an
Collaborator

@am17an am17an commented Jun 19, 2025

This adds a conv2d_transpose CUDA kernel with feature parity with the CPU implementation, plus support for batches. Padding should be trivial to add, but I didn't add it since the CPU version doesn't have it. I also added correctness and performance test cases.

| Backend | Device | µs/run | Bandwidth | Speedup |
| --- | --- | --- | --- | --- |
| CPU | Ryzen 3800XT 8-core | 144 491.81 | 0.46 GB/s | 1.00 |
| GPU | RTX 3090 | 11 759.66 | 5.67 GB/s | 12.28 |

@github-actions github-actions bot added testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs ggml changes relating to the ggml tensor library for machine learning labels Jun 19, 2025
@am17an am17an force-pushed the add_conv2d_transpose branch from f9d7ccd to da2f437 on June 20, 2025 01:52
@am17an am17an force-pushed the add_conv2d_transpose branch from da2f437 to b80dd1d on June 20, 2025 01:58
@am17an am17an requested a review from JohannesGaessler June 20, 2025 05:10
@am17an am17an requested a review from JohannesGaessler June 20, 2025 12:07
@am17an am17an merged commit c959f46 into ggml-org:master Jun 20, 2025
47 checks passed
@am17an am17an deleted the add_conv2d_transpose branch June 20, 2025 14:49
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jun 20, 2025
* mamba2-sync: (24 commits)
sync : ggml
Add `ggml_roll` (ggml/1274)
docs : fix the link to llama.h (ggml-org#14293)
CUDA: add conv_2d_transpose (ggml-org#14287)
lint : remove trailing whitepace (ggml-org#14304)
vocab : prevent tokenizer overflow (ggml-org#14301)
sycl: add usage of enqueue_functions extension (ggml-org#14244)
Implement GGML_CPU_ALL_VARIANTS for PowerPC (ggml-org#14286)
llama : improve sep token handling (ggml-org#14272)
cuda : synchronize graph capture and cublas handle destruction (ggml-org#14288)
ggml : fix repack work size for mul_mat_id (ggml-org#14292)
ggml: Update KleidiAI to v1.9.0 (ggml-org#14277)
model : more uniform output id handling (ggml-org#14275)
ubatch : new splitting logic (ggml-org#14217)
CUDA: add conv_2d_dw (ggml-org#14265)
ggml-cpu : remove unnecesary arm feature detection (ggml-org#14281)
gguf-py : make sentencepiece optional (ggml-org#14200)
server : add server parameters for draft model cache type (ggml-org#13782)
build : suppress gcc15 compile warnings (ggml-org#14261)
sycl: Cleanup codepaths in Get Rows in sycl backend (ggml-org#14215)
...