Skip to content

Conversation

@arthw
Copy link
Contributor

@arthw arthw commented May 21, 2024

Fix the ggml_sycl_mul_mat_id() which is impacted by the api parameters changed by #6505.
Now, only 1 mul_mat_id UT case is fault: type_a=iq4_nl. It's known issue.

@github-actions github-actions bot added the SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language label May 21, 2024
@arthw arthw requested a review from airMeng May 21, 2024 13:59
@mofosyne mofosyne added the Review Complexity : High Generally require indepth knowledge of LLMs or GPUs label May 21, 2024
Copy link
Contributor

@AidanBeltonS AidanBeltonS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things which I think should be changed.

I have also tested this branch on an Arc A770 and A100 GPU and can confirm the MUL_MAT_ID implementation is working

@AidanBeltonS
Copy link
Contributor

@arthw could you rebase this PR? I believe it will resolve the failing CI checks.

@arthw arthw force-pushed the fix_mul_mat_id branch from 32ba8d3 to e0a686b Compare May 28, 2024 02:51
@NeoZhangJianyu
Copy link
Collaborator

@arthw could you rebase this PR? I believe it will resolve the failing CI checks.

yes, rebase it.

@AidanBeltonS AidanBeltonS merged commit e2b0650 into ggml-org:master May 28, 2024
@slaren
Copy link
Member

slaren commented Jun 19, 2024

In llm_load_tensors, there is still this code that will prevent using MoE models with SYCL:

https://github.com/ggerganov/llama.cpp/blob/9c77ec1d74874ee22bdef8f110e8e8d41389abf2/llama.cpp#L5283-L5288

It was added in #6505 as a workaround to avoid crashing SYCL with MoE models, but if mul_mat_id was fixed here, it should be removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Review Complexity : High Generally require indepth knowledge of LLMs or GPUs SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants