Commit dc68f00

cuda : fix vmm pool with multi GPU (ggml-org#4620)

* cuda : fix vmm pool with multi GPU
* hip
* use recommended granularity instead of minimum
* better error checking
* fix mixtral
* use cudaMemcpy3DPeerAsync
* use cuda_pool_alloc in ggml_cuda_op_mul_mat
* consolidate error checking in ggml_cuda_set_device
* remove unnecessary inlines

ggml-ci

* style fixes
* only use vmm for the main device
* fix scratch buffer size, re-enable vmm pool for all devices
* remove unnecessary check id != g_main_device

1 parent de8e496 commit dc68f00

3 files changed, +243 -246 lines changed

ggml-cuda.cu
ggml.c
llama.cpp
