Commit 3aefd9f

[Bugfix] Compilation Error in q4f32_1 (mlc-ai#1078)
The pass `fuse-split-rotary` assumes the compute dtype is fp16. That is usually the case, but in certain quantization modes, e.g. `q0f32` and `q4f32_1`, compute is done in fp32 instead. This PR strengthens the guard so that the pass is applied only when the config dtype is `float16`.
1 parent 9872c48 commit 3aefd9f

1 file changed: +1 -0 lines changed

mlc_llm/core.py

Lines changed: 1 addition & 0 deletions
@@ -405,6 +405,7 @@ def mod_transform_before_build(
         hasattr(config, "num_attention_heads")
         and hasattr(config, "hidden_size")
         and hasattr(config, "position_embedding_base")
+        and getattr(config, "dtype", "float16") == "float16"
     ):
         max_seq_len = None
         if args.max_seq_len > 0:
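
The effect of the added condition can be illustrated with a small standalone sketch. The helper name `can_fuse_split_rotary` and the `SimpleNamespace` configs are hypothetical stand-ins for illustration, not part of `mlc_llm`; only the four-part condition is taken from the diff. Note that `getattr(config, "dtype", "float16")` treats a config with no `dtype` attribute as fp16, so such configs still pass the guard.

```python
from types import SimpleNamespace


def can_fuse_split_rotary(config) -> bool:
    """Hypothetical helper mirroring the guard condition in the diff."""
    return (
        hasattr(config, "num_attention_heads")
        and hasattr(config, "hidden_size")
        and hasattr(config, "position_embedding_base")
        # A config without a `dtype` attribute falls back to "float16".
        and getattr(config, "dtype", "float16") == "float16"
    )


# Illustrative attribute values; real model configs will differ.
common = dict(num_attention_heads=32, hidden_size=4096, position_embedding_base=10000)

fp16_config = SimpleNamespace(**common, dtype="float16")
fp32_config = SimpleNamespace(**common, dtype="float32")
no_dtype_config = SimpleNamespace(**common)

print(can_fuse_split_rotary(fp16_config))      # True
print(can_fuse_split_rotary(fp32_config))      # False
print(can_fuse_split_rotary(no_dtype_config))  # True
```

With this guard, an fp32 config such as one built for `q4f32_1` skips the block entirely instead of reaching a pass that assumes fp16.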
