Commit 6ba88ff

kunal-vaishnavi authored and ankitm3k committed
Update attention fusion in speech component of Phi-4 mm (microsoft#24513)
### Description

This PR updates how the K path is identified in Phi-4 multimodal.

### Motivation and Context

This is needed as part of the updates made to the rewritten modeling code for the speech component of Phi-4 multimodal.

1 parent f167714 commit 6ba88ff

1 file changed: +8 -3 lines changed

onnxruntime/python/tools/transformers/fusion_conformer_attention.py

Lines changed: 8 additions & 3 deletions
@@ -126,8 +126,14 @@ def fuse(self, normalize_node, input_name_to_nodes, output_name_to_node):
             [1, 0, 0, 0, 0],
         )
         if k_nodes is None:
-            logger.debug("fuse_conformer_attention: failed to match k path")
-            return
+            k_nodes = self.model.match_parent_path(
+                matmul_qk,
+                ["Transpose", "Reshape", "Add", "MatMul"],
+                [1, 0, 0, 0],
+            )
+            if k_nodes is None:
+                logger.debug("fuse_conformer_attention: failed to match k path")
+                return
         else:
             concat_k = k_nodes[1]
             concat_parent = self.model.get_parent(concat_k, 0, None)
@@ -188,7 +194,6 @@ def fuse(self, normalize_node, input_name_to_nodes, output_name_to_node):
             logger.debug("fuse_conformer_attention: MultiHeadAttention node creation failed")
             return
 
-        self.increase_counter(new_node.op_type)
         self.nodes_to_add.append(new_node)
         self.node_name_to_graph_name[new_node.name] = self.this_graph_name
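
The fallback added in the first hunk can be illustrated in isolation. The following is a minimal, self-contained sketch and not onnxruntime's implementation: `ToyNode`, the local `match_parent_path`, and `match_k_path` are hypothetical stand-ins written for this example. The op types of the first K pattern fall outside the hunk (only its index list `[1, 0, 0, 0, 0]` is visible), so the five-op list below is an assumed placeholder. The four-op fallback pattern and its indices are taken from the diff.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a graph node: an op type plus ordered parents."""
    op_type: str
    parents: list = field(default_factory=list)


def match_parent_path(node, op_types, input_indices):
    """Walk upward from `node`. At step i, follow input_indices[i] and require
    the parent found there to have op_types[i]. Return the matched parents in
    order, or None on the first mismatch."""
    matched = []
    current = node
    for op_type, index in zip(op_types, input_indices):
        if index >= len(current.parents):
            return None
        parent = current.parents[index]
        if parent.op_type != op_type:
            return None
        matched.append(parent)
        current = parent
    return matched


# Assumed placeholder for the first pattern; only its indices appear in the diff.
PRIMARY_TYPES = ["Transpose", "Concat", "Reshape", "Add", "MatMul"]
PRIMARY_INDICES = [1, 0, 0, 0, 0]

# Fallback pattern and indices as added by the commit.
FALLBACK_TYPES = ["Transpose", "Reshape", "Add", "MatMul"]
FALLBACK_INDICES = [1, 0, 0, 0]


def match_k_path(matmul_qk):
    """Try the primary K pattern first, then the shorter fallback pattern."""
    k_nodes = match_parent_path(matmul_qk, PRIMARY_TYPES, PRIMARY_INDICES)
    if k_nodes is None:
        k_nodes = match_parent_path(matmul_qk, FALLBACK_TYPES, FALLBACK_INDICES)
    return k_nodes


# Toy K branch with no Concat: MatMul -> Add -> Reshape -> Transpose -> MatMul(QK)
matmul_k = ToyNode("MatMul")
add_k = ToyNode("Add", [matmul_k])
reshape_k = ToyNode("Reshape", [add_k])
transpose_k = ToyNode("Transpose", [reshape_k])
q_branch = ToyNode("Transpose")
matmul_qk = ToyNode("MatMul", [q_branch, transpose_k])

k_nodes = match_k_path(matmul_qk)
print([n.op_type for n in k_nodes])  # ['Transpose', 'Reshape', 'Add', 'MatMul']
```

In this toy graph the primary pattern fails at its second step because the parent of the `Transpose` is a `Reshape` rather than a `Concat`, so the fallback pattern is the one that matches. Before the commit, the same situation would have ended in the "failed to match k path" debug message and an early return.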
