Commit ad6b898
Fix broken Llama4 accuracy in MoE part (#40609)
* Fix broken Llama4 accuracy in MoE part
Llama4 accuracy is broken by a bug in
#39501 . It forgot to
transpose the router_scores before applying it to routed_in, causing
Llama4 to generate garbage output.
This PR fixes that issue by adding back the transpose() and adding some
comments explaining why the transpose() is needed.
Signed-off-by: Po-Han Huang <[email protected]>
* remove comment
---------
Signed-off-by: Po-Han Huang <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>1 parent e7d351c commit ad6b898
1 file changed
+1
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
158 | 158 | | |
159 | 159 | | |
160 | 160 | | |
161 | | - | |
| 161 | + | |
162 | 162 | | |
163 | 163 | | |
164 | 164 | | |
| |||
0 commit comments