Extend optimize_for_ort to cover passes #2274
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
##             main    #2274      +/-   ##
==========================================
- Coverage   73.78%   73.77%   -0.01%
==========================================
  Files         235      235
  Lines       30936    30939       +3
  Branches     3494     3494
==========================================
  Hits        22825    22825
- Misses       6911     6914       +3
  Partials     1200     1200

☔ View full report in Codecov by Sentry.
Please also consider whether this method should optimize in-place or not. I think we can make it in-place now that shape inference itself is in-place.
# Apply the ORT pattern rewrite rules.
rewrite(model, ORT_PATTERN_REWRITE_RULES)

# TODO(exporter team): Fold transpose into initializers
# Apply the ORT optimization passes.
# https://github.com/microsoft/onnxruntime/blob/74dcf7e296639095dfa55d31336998b6f719ed76/onnxruntime/python/tools/transformers/dynamo_onnx_helper.py#L172
common_passes.ClearMetadataAndDocStringPass()(model)
You may put all the passes into a pass manager like we do in optimize()
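The pass-manager suggestion above can be sketched generically. This is a minimal illustration of the pattern only; the `PassManager` class, `Model` stand-in, and `clear_metadata` pass here are hypothetical names for this sketch, not the actual onnx_ir or onnxscript API:

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Toy stand-in for an IR model; real passes operate on the ONNX IR."""

    metadata: dict = field(default_factory=dict)


class PassManager:
    """Runs a fixed sequence of passes over a model, in order."""

    def __init__(self, passes):
        self.passes = list(passes)

    def __call__(self, model):
        for p in self.passes:
            # In-place passes may return None; keep the same model then.
            model = p(model) or model
        return model


def clear_metadata(model):
    # Hypothetical analogue of ClearMetadataAndDocStringPass:
    # strips metadata in place.
    model.metadata.clear()


manager = PassManager([clear_metadata])
model = manager(Model(metadata={"doc_string": "hello"}))
print(model.metadata)  # {}
```

Grouping the passes in one manager keeps the pipeline declarative: the ordered list documents exactly which transformations run, the same shape `optimize()` already uses.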
Done
I think making it out-of-place is safer, in case we have passes in the future that need to be functional?
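The in-place vs. out-of-place (functional) trade-off being discussed can be illustrated with a toy model. All names below are hypothetical stand-ins for this sketch; real passes operate on the ONNX IR:

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Model:
    """Toy stand-in for an IR model."""

    nodes: list = field(default_factory=list)


def remove_identity_in_place(model):
    """In-place pass: mutates the caller's model and returns it."""
    model.nodes = [n for n in model.nodes if n != "Identity"]
    return model


def remove_identity_functional(model):
    """Functional pass: returns a new model; the input is left untouched."""
    new_model = copy.deepcopy(model)
    new_model.nodes = [n for n in new_model.nodes if n != "Identity"]
    return new_model


original = Model(nodes=["MatMul", "Identity", "Add"])
result = remove_identity_functional(original)
print(original.nodes)  # ['MatMul', 'Identity', 'Add']
print(result.nodes)    # ['MatMul', 'Add']
```

A functional signature keeps the caller's model intact, which matters if a future pass cannot safely mutate in place; the cost is the copy.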
# https://github.com/microsoft/onnxruntime/blob/74dcf7e296639095dfa55d31336998b6f719ed76/onnxruntime/python/tools/transformers/dynamo_onnx_helper.py#L172
common_passes.ClearMetadataAndDocStringPass(),
# https://github.com/microsoft/onnxruntime/blob/74dcf7e296639095dfa55d31336998b6f719ed76/onnxruntime/python/tools/transformers/dynamo_onnx_helper.py#L139
common_passes.LiftConstantsToInitializersPass(lift_all_constants=False, size_limit=1),
We have another pass called LiftSubgraphInitializersToMainGraphPass. Do we know if it's needed in genAI? @kunal-vaishnavi
If the pass logic is in DynamoOnnxHelper, then it is used for ONNX Runtime GenAI.
We don't really produce graphs with subgraph initializers. I think we are OK either way.
Fix #2261

A draft for discussion. We should cover all the post-processing that shipping the model needs.