Mixtral 8x7B support #2011
LGTM! Thank you for your contribution and the official support of vLLM from Mistral AI!!
        hidden_states: Optional[torch.Tensor],
        sampling_metadata: SamplingMetadata,
    ) -> SamplerOutput:
        hidden_states = self.norm(hidden_states)
nit: Can we do this in forward?
Yes. I merged the PR ASAP from my phone so that people could use the model. One other small thing is to add Mixtral to the supported model list. I am AFK now; can you help fix this if possible?
I installed from latest main, installed stk, megablocks, latest flash_attn, transformers etc...
@draganjovanovich Thanks for reporting the error! Please re-install megablocks with
Np, I tried installing
Failed to build grouped_gemm
Now I have created a new env and am starting from scratch. I will comment if it succeeds.
Hi, it looks like the errors you're getting are from megablocks. Can you share more details on your environment? The latest issue looks like it may be caused by the CUDA toolkit version you're using (maybe let's start a separate issue?).
I tried to install Megablocks and got this error: CalledProcessError: Command 'pip --disable-pip-version-check install git+https://github.com/stanford-futuredata/[email protected]' returned non-zero exit status 1.
Co-authored-by: Pierre Stock <[email protected]>
Co-authored-by: Zhuohan Li <[email protected]>
Adding support for mistralai/Mixtral-8x7B-v0.1 and mistralai/Mixtral-8x7B-Instruct-v0.1 models as described in our blog post. This is joint work between @zhuohan123, @WoosukKwon from the vLLM project and Mistral AI.
It integrates fast sparse mixture of experts kernels from the Megablocks project.
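For readers unfamiliar with the term, here is a rough pure-Python sketch of what "sparse mixture of experts" means for a single token: a router scores every expert, only the top k experts are run, and their outputs are combined with softmax weights. This is an illustration only, not the Megablocks kernels or the code in this PR. The function names (top_k_route, moe_forward), the toy experts, and the logits are all hypothetical; k=2 with 8 experts is assumed from the model's public description.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs. Hypothetical helper.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, router_logits, experts, k=2):
    """Run only the selected experts and sum their weighted outputs."""
    out = [0.0] * len(x)
    for idx, w in top_k_route(router_logits, k):
        y = experts[idx](x)  # the other experts are never evaluated
        out = [o + w * yi for o, yi in zip(out, y)]
    return out


# Toy setup: 8 "experts", where expert i just scales its input by (i + 1).
experts = [lambda x, s=i + 1: [s * v for v in x] for i in range(8)]
logits = [0.1, 2.0, -1.0, 0.3, 1.0, 0.0, -0.5, 0.2]  # made-up router scores

routes = top_k_route(logits)
result = moe_forward([1.0, 2.0], logits, experts)
```

With these made-up scores, only experts 1 and 4 run for this token, which is where the compute savings over a dense layer come from. A real kernel does this for many tokens at once and groups tokens by expert, which this sketch does not attempt.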