

@mateusztabaka mateusztabaka commented Feb 10, 2021

When the second input (the pads) is not a constant, shuffle it with a Split followed by a Concat.
There are models where this non-constant input gets constant-folded by the framework anyway.
Even when it is not, a Split + Concat on an 8-element (or 10-element) tensor is a good trade for removing a Transpose pair.
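
As a rough illustration of the shuffle described above, here is a plain-Python sketch. It is not the tf2onnx code. It assumes the ONNX Pad layout `[begin_0, ..., begin_{r-1}, end_0, ..., end_{r-1}]` and, in the usage line, a preceding Transpose with `perm = [0, 2, 3, 1]`; the helper name `shuffle_pads` is hypothetical.

```python
def shuffle_pads(pads, perm):
    """Reorder `pads` so Pad can run before the Transpose instead of after it.

    Assumption: `pads` was written for the output of Transpose(perm), so the
    entry for transposed axis i belongs to input axis perm[i].
    """
    rank = len(perm)
    if len(pads) != 2 * rank:
        raise ValueError("pads must hold a begin and an end value per axis")

    # "Split": cut the 1-D pads tensor into 2 * rank single-element pieces.
    pieces = [pads[i:i + 1] for i in range(2 * rank)]

    # Inverse permutation: inv[axis] is the transposed position of input axis.
    inv = [0] * rank
    for i, axis in enumerate(perm):
        inv[axis] = i

    # The begin half and the end half are permuted the same way.
    order = inv + [rank + i for i in inv]

    # "Concat": glue the pieces back together in the new order.
    shuffled = []
    for index in order:
        shuffled.extend(pieces[index])
    return shuffled


# 4-D case: 8 pads values, pieces are picked in the order 0, 3, 1, 2, 4, 7, 5, 6.
print(shuffle_pads([0, 1, 2, 3, 0, 10, 20, 30], [0, 2, 3, 1]))
# prints [0, 3, 1, 2, 0, 30, 10, 20]
```

For a rank-4 tensor this touches 8 values and for rank 5 it touches 10, which is where the sizes quoted above come from.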

Signed-off-by: Mateusz Tabaka [email protected]

@mateusztabaka
Contributor Author

@TomWildenhain-Microsoft this one is ready for review. Thanks.

# when the second input is not a constant, let's shuffle it with Split followed by Concat
# there are examples of models, where this non-constant input
# gets constant folded anyway by a framework.
split = self._g.make_node("Split", inputs=[node.input[1]], attr={}, output_count=trans_rank * 2)
Collaborator

Nice. When the pad is non-const, it is often constructed by concatenating some tensors together, so this optimization might introduce a concat -> split -> concat sequence that can be further optimized. Still, it is worth it for removing a transpose, and the concat -> split -> concat will be operating on very small tensors. We might want to add a new optimizer for it in the future though.
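
To make the follow-up idea concrete, here is a minimal plain-Python sketch of how a concat -> split -> concat chain could collapse into a single Concat. It is only an illustration: no such optimizer is part of this change, the function name and arguments are hypothetical, and it assumes the length of every tensor feeding the first Concat is known statically.

```python
def fuse_concat_split_concat(names, lengths, order):
    """Return the inputs of one equivalent Concat, or None if fusion fails.

    names   -- tensors feeding the first Concat (hypothetical names)
    lengths -- static length of each of those tensors
    order   -- which single-element Split output feeds each position of the
               second Concat
    """
    # Map every flat element index to (source tensor, offset, source length).
    owner = []
    for name, length in zip(names, lengths):
        owner.extend((name, offset, length) for offset in range(length))
    if len(owner) != len(order):
        raise ValueError("order must cover every element exactly once")

    fused = []
    i = 0
    while i < len(order):
        name, _, length = owner[order[i]]
        # A source tensor can be reused as-is only when all of its elements
        # show up consecutively and in their original order.
        run = [owner[j] for j in order[i:i + length]]
        if run != [(name, offset, length) for offset in range(length)]:
            return None
        fused.append(name)
        i += length
    return fused


order = [0, 3, 1, 2, 4, 7, 5, 6]
scalars = ["n_b", "h_b", "w_b", "c_b", "n_e", "h_e", "w_e", "c_e"]
print(fuse_concat_split_concat(scalars, [1] * 8, order))
print(fuse_concat_split_concat(["begin", "end"], [4, 4], order))
```

The first call fuses because every source tensor holds one element, so the chain reduces to a Concat of the same tensors in shuffled order. The second returns `None` because the shuffle reorders elements inside a 4-element source, so the Split is still needed there.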

@TomWildenhain-Microsoft TomWildenhain-Microsoft merged commit e3bb930 into onnx:master Feb 10, 2021