Conversation

@Qubitium (Collaborator) commented on Mar 2, 2025

Fix: upstream transformers modeling inference code passes an impossible input shape (`shape[0] == 0`) to the module.

Patch Fixes: #1361

However, I believe this should not happen at all: it has significant performance implications, since such an input results in a no-op with wasted zero-size tensor allocations, and the proper fix belongs upstream in transformers.
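
For illustration, here is a minimal sketch of the kind of early-return guard this patch applies when an input with `shape[0] == 0` reaches a quantized linear layer. The class name, its dense fallback, and the demo shapes are illustrative stand-ins, not the actual marlin implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuardedQuantLinear(nn.Module):
    """Illustrative stand-in for a quantized linear layer (e.g. the marlin
    qlinear). The real kernel rejects inputs whose M dimension is 0."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.out_features = out_features
        # Placeholder for packed quantized weights; a dense weight is used
        # here only so the sketch is runnable.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Guard: transformers' MoE dispatch can hand us a tensor with
        # shape[0] == 0 (no tokens routed to this expert). The kernel would
        # fail with "Invalid MNK = [0, N, K]", so short-circuit with an
        # empty output of the correct shape instead of calling the kernel.
        if x.shape[0] == 0:
            return torch.empty(
                (0, self.out_features), dtype=x.dtype, device=x.device
            )
        return F.linear(x, self.weight.to(x.dtype))

layer = GuardedQuantLinear(2048, 1408)
empty = torch.empty((0, 2048), dtype=torch.float16)
print(layer(empty).shape)  # torch.Size([0, 1408]) -- no kernel call made
```

Returning an empty tensor of the correct output shape keeps the MoE expert loop working while skipping the kernel call entirely, since the kernel would otherwise reject the `M == 0` dimension.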

@SunMarc @MekkCyber

Please check tests/models/test_qwen_15_moe.py to reproduce the failure without this PR's fix. The offending module is gate_proj in the Qwen MoE model.

input = tensor([], device='cuda:0', size=(0, 2048), dtype=torch.float16)
test_qwen_15_moe.py:30: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../gptqmodel/models/base.py:1142: in generate
    return self.model.generate(inputs=inputs, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/utils/_contextlib.py:116: in decorate_context
    return func(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/generation/utils.py:2223: in generate
    result = self._sample(
/root/miniconda3/lib/python3.12/site-packages/transformers/generation/utils.py:3211: in _sample
    outputs = self(**model_inputs, return_dict=True)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1739: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1750: in _call_impl
    return forward_call(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/utils/deprecation.py:172: in wrapped_func
    return func(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/models/qwen2_moe/modeling_qwen2_moe.py:1317: in forward
    outputs = self.model(
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1739: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1750: in _call_impl
    return forward_call(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/models/qwen2_moe/modeling_qwen2_moe.py:1017: in forward
    layer_outputs = decoder_layer(
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1739: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1750: in _call_impl
    return forward_call(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/models/qwen2_moe/modeling_qwen2_moe.py:745: in forward
    hidden_states = self.mlp(hidden_states)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1739: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1750: in _call_impl
    return forward_call(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/models/qwen2_moe/modeling_qwen2_moe.py:654: in forward
    current_hidden_states = expert_layer(current_state) * routing_weights[top_x, idx, None]
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1739: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1750: in _call_impl
    return forward_call(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/transformers/models/qwen2_moe/modeling_qwen2_moe.py:280: in forward
    return self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1739: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py:1750: in _call_impl
    return forward_call(*args, **kwargs)
../../gptqmodel/nn_modules/qlinear/marlin.py:412: in forward
    out = apply_gptq_marlin_linear(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input = tensor([], device='cuda:0', size=(0, 2048), dtype=torch.float16)
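
For context on why a zero-row tensor reaches the layer at all: in the Qwen2-MoE block, transformers loops over every expert and gathers the tokens routed to it; when the router assigns no tokens to an expert, the gather still produces a `(0, hidden_size)` tensor that is pushed through the expert's `gate_proj`/`up_proj`/`down_proj`. A simplified reconstruction of that dispatch (toy shapes and routing, not the real config):

```python
import torch
import torch.nn.functional as F

# Simplified reconstruction of the Qwen2-MoE expert dispatch around
# modeling_qwen2_moe.py:654 in the traceback above; shapes and routing
# here are toy values, not the real model config.
num_experts, hidden_size, tokens = 4, 2048, 3
hidden_states = torch.randn(tokens, hidden_size, dtype=torch.float16)

# Router picks top-2 experts per token; expert 3 happens to get none.
selected_experts = torch.tensor([[0, 1], [0, 2], [1, 2]])
expert_mask = F.one_hot(
    selected_experts, num_classes=num_experts
).permute(2, 1, 0)

for expert_idx in range(num_experts):
    idx, top_x = torch.where(expert_mask[expert_idx])
    # For an expert that received no tokens, top_x is empty, so this
    # gather yields a (0, hidden_size) tensor -- exactly the
    # tensor([], size=(0, 2048)) input seen in the traceback -- and it
    # is still fed to the expert MLP.
    current_state = hidden_states[None, top_x].reshape(-1, hidden_size)
    print(expert_idx, current_state.shape)
```

Here expert 3 receives no tokens, so `current_state` comes out as `(0, 2048)`, matching the empty input captured in the traceback.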

Fix upstream transformers modeling inference code is passing impossible input shape where `shape[0]==0` to module

Signed-off-by: Qubitium <[email protected]>
@Qubitium changed the title from "Fix transformers inference code is passing impossible shape to kernel" to "Fix transformers modeling inference is passing impossible shape to kernel" on Mar 2, 2025
@Qubitium changed the title from "Fix transformers modeling inference is passing impossible shape to kernel" to "Fix transformers modeling inference is passing impossible shape to nn.module" on Mar 2, 2025
@Qubitium changed the title from "Fix transformers modeling inference is passing impossible shape to nn.module" to "Fix transformers modeling code passing input.shape[0] == 0 to nn.module" on Mar 2, 2025
@Qubitium merged commit 0a0cfb0 into main on Mar 2, 2025 (4 checks passed)
@Qubitium deleted the fix-impossible-input-shape branch on March 2, 2025 at 09:43

Development

Linked issue closed by this pull request: [BUG] RuntimeError: Invalid MNK = [0, 1408, 2048] (#1361)