Because we are switching away from the naive MoE implementation (`nn.ModuleList` for the experts), we currently have an issue with MoEs that have adapters. For more details, see https://github.com/huggingface/transformers/issues/42491#issuecomment-3591485649.
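
For context, here is a minimal sketch of the two expert layouts (the class and attribute names are illustrative, not the actual `transformers` modules). Adapters such as LoRA attach to the per-expert submodules of the naive layout, and those submodules no longer exist once the experts are fused:

```python
import torch
import torch.nn as nn


class NaiveMoE(nn.Module):
    """Old-style layout: one module per expert, kept in an nn.ModuleList.
    Adapters can target per-expert submodules such as `experts.0`, `experts.1`, ..."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(hidden_size, hidden_size) for _ in range(num_experts)
        )


class PackedMoE(nn.Module):
    """New-style layout (sketch): all expert weights fused into a single 3D
    parameter. The per-expert modules are gone, so adapters that were attached
    to them have nothing left to hook into, which is the issue described above."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        self.experts_weight = nn.Parameter(
            torch.empty(num_experts, hidden_size, hidden_size)
        )
```
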
_We aim for this to be fixed and released in a follow-up release candidate in the week after RC0._

### Tensor parallel and Expert parallel + MoE
We are streamlining MoE support with vLLM; while this is being implemented, tensor parallelism and expert parallelism aren't working as expected.

This is known and actively being worked on.
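
For reference, the kind of call that exercises the affected path looks roughly like the sketch below; the `tp_plan="auto"` flag and the Mixtral checkpoint are assumptions for illustration, and the exact flags may differ in this release candidate:

```python
# Sketch: requesting tensor parallelism when loading a MoE checkpoint.
# Launch under a distributed runner, e.g.:
#   torchrun --nproc-per-node 4 load_tp_moe.py
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",  # illustrative MoE checkpoint
    tp_plan="auto",                 # shard the weights across local GPUs
)
```
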
_We aim for this to be fixed and released in a follow-up release candidate in the week after RC0._

### Custom pretrained models
For anyone inheriting from a `transformers` `PreTrainedModel`, the weights are automatically initialized with the common scheme:
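
A minimal sketch of that scheme, assuming the usual `config.initializer_range`-based normal init (the helper name below is illustrative, not a `transformers` API, and the exact per-module rules vary by model):

```python
import torch.nn as nn


def common_init_scheme(module: nn.Module, std: float = 0.02) -> None:
    """Illustrative version of the common initialization; `std` usually
    comes from `config.initializer_range`."""
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=std)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.Embedding):
        module.weight.data.normal_(mean=0.0, std=std)
        if module.padding_idx is not None:
            module.weight.data[module.padding_idx].zero_()
    elif isinstance(module, nn.LayerNorm):
        module.bias.data.zero_()
        module.weight.data.fill_(1.0)
```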