Vulkan implementation (via Kompute) #2039


Closed
wants to merge 43 commits
Commits
4f598dd Initial working stuff (niansa, Jun 22, 2023)
2f3fe0c Updated gitignore (niansa, Jun 22, 2023)
3b3d30e Cleanups (niansa, Jun 22, 2023)
b0f11fa More code cleanups (niansa, Jun 22, 2023)
9cdaea9 Implemented dequantize_row_q4_1 (niansa, Jun 22, 2023)
339bc36 Added more functions from Metal (niansa, Jun 23, 2023)
9d64375 Fixed compile error (niansa, Jun 23, 2023)
b8a4594 More fixes... (niansa, Jun 23, 2023)
d539247 Began implementing ggml_graph_compute (niansa, Jun 23, 2023)
18d6f7f More progress... (niansa, Jun 23, 2023)
b626454 Added vk_mul to ggml_vk_graph_compute (niansa, Jun 23, 2023)
5e94033 Minor fixes (niansa, Jun 23, 2023)
e830264 Share sequence to functions and add scale() (niansa, Jun 23, 2023)
5c0d8dd Specify program output size (niansa, Jun 23, 2023)
2589cb0 Prevent compileSource race (niansa, Jun 23, 2023)
09b0b3a Wait for all threads to finish (niansa, Jun 23, 2023)
98e588c Fix ggml_vk_h2d_tensor throwing on second call (niansa, Jun 23, 2023)
46f577b h2d tensors during loadup (niansa, Jun 23, 2023)
1a68195 Add mutexes for gpu tensors (niansa, Jun 23, 2023)
e6da9bd Added ggml_vk_mem_used() (niansa, Jun 23, 2023)
40621ea Added more debugging (niansa, Jun 23, 2023)
4b267e8 Temporarily care for all layers (niansa, Jun 23, 2023)
55815b6 Improved memory safety (niansa, Jun 23, 2023)
e0814f8 Free vk context (niansa, Jun 23, 2023)
5d5f66d More little fixes and stuff (niansa, Jun 23, 2023)
acb7d90 Reenabled unknown op message (niansa, Jun 23, 2023)
072007b Add buffer qualifiers (niansa, Jun 23, 2023)
ed14f07 Fixed ggml_vk_abmath row argument (niansa, Jun 28, 2023)
e2b721d Allow vk add row (niansa, Jun 28, 2023)
de7d182 Implemented ggml_vk_soft_max (niansa, Jun 28, 2023)
5ac68cc Cleanups (niansa, Jun 29, 2023)
749d617 Snake case all functions (niansa, Jun 29, 2023)
964fe8c Added mul_mat (needs fixes) (niansa, Jun 30, 2023)
f093bf2 Minor MUL_MAT fix and implemented DIAG_MASK_INF (niansa, Jun 30, 2023)
0dc5f2f Fixed mul mat dispatch size (niansa, Jun 30, 2023)
8fa6013 Added missing break to mul_mat_f16 case (niansa, Jun 30, 2023)
d1f84db Implemented GGML_OP_NORM (niansa, Jun 30, 2023)
f0e1429 Implemented RMS_NORM (niansa, Jun 30, 2023)
2fc8249 Simple mul_mat_f16 for speed and removal of unused mul_mat_f32 (niansa, Jul 5, 2023)
6be93e6 Ported mat mul from Metal (niansa, Jul 5, 2023)
856b758 Optimized ggml_vk_mul_mat_f16 argument count (niansa, Jul 5, 2023)
77ebe46 Fixed case order in ggml_vk_graph_compute (niansa, Jul 5, 2023)
44d214c Only warn if __STDC_IEC_559__ isn't defined (niansa, Jul 5, 2023)
2 changes: 2 additions & 0 deletions .gitignore
@@ -56,3 +56,5 @@ qnt-*.txt
perf-*.txt

examples/jeopardy/results.txt

CMakeLists.txt.user*
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "kompute"]
path = kompute
url = https://github.com/KomputeProject/kompute.git
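
Because Kompute is vendored as a git submodule here, a fresh checkout presumably needs the submodule fetched first (e.g. `git submodule update --init` or a recursive clone) so that the existence check for kompute/CMakeLists.txt in the CMake change below can succeed.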
18 changes: 18 additions & 0 deletions CMakeLists.txt
@@ -73,6 +73,7 @@ set(LLAMA_CUDA_DMMV_Y "1" CACHE STRING "llama: y block size for dmmv CUDA
option(LLAMA_CUDA_DMMV_F16 "llama: use 16 bit floats for dmmv CUDA kernels" OFF)
set(LLAMA_CUDA_KQUANTS_ITER "2" CACHE STRING "llama: iters./thread per block for Q2_K/Q6_K")
option(LLAMA_CLBLAST "llama: use CLBlast" OFF)
option(LLAMA_KOMPUTE "llama: use Kompute" OFF)
option(LLAMA_METAL "llama: use Metal" OFF)
option(LLAMA_K_QUANTS "llama: use k-quants" ON)

@@ -309,6 +310,22 @@ if (LLAMA_CLBLAST)
endif()
endif()

if (LLAMA_KOMPUTE)
if (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/kompute/CMakeLists.txt")
message(STATUS "Kompute found")

add_subdirectory(kompute)

set(GGML_SOURCES_KOMPUTE ggml-vulkan.cpp ggml-vulkan.h)

add_compile_definitions(GGML_USE_KOMPUTE)

set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} kompute)
else()
message(WARNING "Kompute not found")
endif()
endif()

if (LLAMA_ALL_WARNINGS)
if (NOT MSVC)
set(c_flags
@@ -466,6 +483,7 @@ add_library(ggml OBJECT
ggml.h
${GGML_SOURCES_CUDA}
${GGML_SOURCES_OPENCL}
${GGML_SOURCES_KOMPUTE}
${GGML_SOURCES_METAL}
${GGML_SOURCES_EXTRA}
)
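
This CMake change defines GGML_USE_KOMPUTE whenever the backend is enabled (presumably via `cmake -DLLAMA_KOMPUTE=ON` with the submodule present) and adds ggml-vulkan.cpp/ggml-vulkan.h to the ggml object library. As a hedged illustration of how client code could gate on that define, the sketch below reuses the assumed interface from earlier; it is not code from this PR.

```c
// Illustrative only: gating on the GGML_USE_KOMPUTE define that this
// CMake change adds. The calls mirror the *assumed* interface sketched
// after the commit list, not the PR's actual code.
struct ggml_tensor;
struct ggml_cgraph;

#ifdef GGML_USE_KOMPUTE
#include "ggml-vulkan.h"   // built via GGML_SOURCES_KOMPUTE
#endif

static void example_offload(struct ggml_cgraph * gf,
                            struct ggml_tensor ** weights, int n_weights) {
#ifdef GGML_USE_KOMPUTE
    struct ggml_vk_context * vk_ctx = ggml_vk_init();
    for (int i = 0; i < n_weights; ++i) {
        ggml_vk_h2d_tensor(vk_ctx, weights[i]);   // upload tensors at load time
    }
    ggml_vk_graph_compute(vk_ctx, gf);            // run the graph on the GPU
    ggml_vk_free(vk_ctx);
#else
    (void) gf; (void) weights; (void) n_weights;  // CPU path handled elsewhere
#endif
}
```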