musa: workaround for Guilty Lockup in cleaning src0 in #10032 #10042
Signed-off-by: Xiaodong Ye <[email protected]>
1001967 to 862b959
Could you please review this PR? I know the code looks ugly, but it works for now.
I assume you're aware that this results in broken K cache quantization.
Yes. After reviewing all the context, this approach appears to be the only viable way to avoid the crash. Thanks for approving this!
* musa: Update MUSA SDK version to rc3.1.1
  Signed-off-by: Xiaodong Ye <[email protected]>
* musa: Remove workaround in PR #10042
  Signed-off-by: Xiaodong Ye <[email protected]>
---------
Signed-off-by: Xiaodong Ye <[email protected]>
Since #10032 was merged, we have been hitting an MTGPU Guilty Lockup during the model warm-up stage. This PR reverts that change for MUSA builds only.
I've raised an internal issue and will remove this workaround once it has been resolved.
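For readers unfamiliar with this kind of backend-only revert, the general shape is a compile-time guard around the affected step. The sketch below is illustrative only and is not the code in this PR: `SKETCH_USE_MUSA` and `clear_src0_padding` are hypothetical names, and zeroing the tail of a buffer is an assumed stand-in for the src0 cleaning step.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the src0 cleaning step; NOT the actual llama.cpp code.
// Zeroes the bytes of `buf` after the first `used` bytes and reports whether
// the cleaning step ran.
static bool clear_src0_padding(std::vector<char> & buf, std::size_t used) {
    if (used > buf.size()) {
        return false; // nothing sensible to clear
    }
#if defined(SKETCH_USE_MUSA)
    // Workaround path (assumed macro name): skip the step on MUSA builds only.
    (void) buf;
    (void) used;
    return false;
#else
    // Default path: every other backend still performs the cleaning step.
    std::memset(buf.data() + used, 0, buf.size() - used);
    return true;
#endif
}
```

The point of guarding at compile time is that other backends keep the upstream behavior unchanged, and the workaround can be dropped later by deleting the guarded branch.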