forked from microsoft/onnxruntime
Sync With latest msft commits #600
Merged
Conversation
### Description
A new interface for interaction between ONNX Runtime and Vitis AI has been added, which uses `std::filesystem::path` to pass paths.

### Motivation and Context
Vitis AI uses `std::string` to pass paths, which causes errors on Windows when the model name contains Chinese characters. This PR therefore adds an interface that uses `std::filesystem::path` to pass paths, ensuring that file paths are transmitted correctly.

Co-authored-by: genmingz <[email protected]>
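For context, a minimal C++ sketch of why the narrow-string interface fails (function names are hypothetical, not the actual Vitis AI interface): on Windows, a `std::string` path is interpreted in the current ANSI code page, so non-ASCII model names can be garbled before they reach the OS, while `std::filesystem::path` preserves the native (wide) encoding end to end.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: a narrow string may not round-trip Chinese
// characters through the ANSI code page on Windows.
void LoadModelNarrow(const std::string& model_path) {
  std::ifstream f(model_path);  // may fail to open non-ASCII names on Windows
}

// Hypothetical helper: fs::path keeps the platform's native encoding,
// and std::ifstream accepts it directly since C++17.
void LoadModelPath(const std::filesystem::path& model_path) {
  std::ifstream f(model_path);  // opens correctly regardless of characters
}
```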
### Description
1. Re-enable UTs that pass with QNN SDK 2.30.
2. Update the resize UT, because "round_prefer_floor" is no longer supported in the QNN SDK since 2.21.

### Motivation and Context
Make as many QNN EP UTs pass as possible to improve test coverage.

Co-authored-by: Kuan-Yu Lin <[email protected]>
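For context on the resize change: per the ONNX Resize spec, `round_prefer_floor` rounds halfway cases down while `round_prefer_ceil` rounds them up, so tests pinned to the former need different expected values once that mode is unavailable. A minimal sketch of the two modes (helper names are illustrative, not ORT or QNN code):

```cpp
#include <cmath>
#include <cstdio>

// Illustrative helpers: the two ONNX Resize "nearest_mode" values
// agree everywhere except exact .5 ties.
double RoundPreferFloor(double x) { return std::ceil(x - 0.5); }   // 2.5 -> 2
double RoundPreferCeil(double x)  { return std::floor(x + 0.5); }  // 2.5 -> 3

int main() {
  std::printf("%g %g\n", RoundPreferFloor(2.5), RoundPreferCeil(2.5));  // 2 3
  return 0;
}
```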
### Description
Make the [Nuget Publishing](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1313&_a=summary) pipeline 1ES compliant.
### Description
Upgrade React Native to 0.73.11, including Android and iOS changes. This PR also includes the E2E test changes. Used the React Native [upgrade helper](https://react-native-community.github.io/upgrade-helper/?from=0.72.11&to=0.73.11&package=onnxruntime-android&name=onnxruntime) as the reference.

### Motivation and Context
A newer RN version is needed to fix S360 work items.
### Description
Make the [Nuget CUDA 12 Publish Pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1312&_a=summary) 1ES compliant.
### Description
The original UT used a random seed; change it to a fixed seed.

### Motivation and Context
Fix a flaky UT.
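A minimal sketch of the fix pattern (names are illustrative, not the actual ORT test code): seeding the generator from `std::random_device` makes test inputs differ run to run, so tolerance checks can fail intermittently; a fixed seed makes the generated data, and therefore the test, reproducible.

```cpp
#include <random>
#include <vector>

// Illustrative test-data helper: a fixed seed instead of
// std::random_device{}() yields the same inputs on every run.
std::vector<float> MakeTestData(std::size_t n) {
  std::mt19937 gen(42);  // fixed seed for deterministic, repeatable tests
  std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
  std::vector<float> data(n);
  for (auto& v : data) v = dist(gen);
  return data;
}
```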
### Description
* Update the OSS parser version to the latest commit of the 10.8-GA branch.

### Motivation and Context
* Action needed to adapt the latest onnx-tensorrt 10.8-GA branch, fixing the ScatterND attribute issue and the `plugin.h` not found issue.
### Description
This commit fixes alignment issues in shader code.

### Motivation and Context
See above.
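As a generic illustration of this class of bug (not the actual change in this commit): WGSL uniform-buffer layout rules align a `vec3<f32>` to 16 bytes, so a host-side struct mirroring shader uniforms must pad to match, or the shader reads misaligned data.

```cpp
#include <cstddef>

// Illustrative host-side mirror of a shader uniform block: the explicit
// padding keeps member offsets in sync with WGSL's 16-byte vec3 alignment.
struct alignas(16) Uniforms {
  float scale[3];   // occupies 12 bytes...
  float _pad;       // ...padded to 16 so the next member starts aligned
  float offset[4];  // 16 bytes
};
static_assert(sizeof(Uniforms) == 32, "must match the shader-side layout");
```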
### Description
Upgrade EMSDK to 4.0.4.

### Motivation and Context
Emscripten v4.0.4 brings two useful changes that are helpful for WebGPU:
- emscripten-core/emscripten#23678
- emscripten-core/emscripten#23631
### Description
This PR enables support for the following contrib ops in OVEP:
- DynamicQuantizeMatMul
- FusedMatMul
- QuickGelu
- SkipSimplifiedLayerNormalization

Co-authored-by: n1harika <[email protected]>
…icrosoft#23829)

### Description
To resolve microsoft#23821.

### Motivation and Context
Similar to microsoft#23641.
… testing (microsoft#23801)

Summary of changes:
- Changed the OpenVINO test case to use --enable_generic_interface.
- Changed the TensorRT test case to use --enable_generic_interface.
- Fixed ORT builds to USE_FULL_PROTOBUF, as OpenVINO/TensorRT require it.
- Fixed a pre-processor macro definition that was accidentally removed when ORT is built without an EP.

Co-authored-by: Karim Vadsariya <[email protected]>
…icrosoft#23825)

### Description
Increase the [npm package pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=1080&_a=summary) ReactNative_CI_iOS timeout to 120 minutes.
### Description
In GemmBatch, the target matrix is cut into blocks that are dispatched to multiple threads for intra-op parallelism. Currently the block count is hard-coded to 16, so on a CPU with more than 16 cores, the cores are not fully utilized within one op. This change removes that cap on the number of blocks in various MatMul paths.

__Benchmark results__

Model: llmlingua-2-bert-base-multilingual-cased-meetingbank--add-force-token-100--max-seq-len-512-CPU-INT8.onnx
Setup: 96-core x86 Linux; dimension batch_size overridden to 3; 50 inference requests per run.

Before (intra_op_num_threads = 64):
- Session creation time cost: 0.485097 s
- First inference time cost: 356 ms
- Total inference time cost: 17.731 s
- __Average inference time cost: 354.619 ms__
- Number of inferences per second: 2.81989
- Avg CPU usage: 65%
- Peak working set size: 542265344 bytes

After (intra_op_num_threads = 32):
- Session creation time cost: 0.523394 s
- First inference time cost: 316 ms
- Total inference time cost: 12.2739 s
- __Average inference time cost: 245.478 ms__
- Number of inferences per second: 4.07362
- Avg CPU usage: 33%
- Peak working set size: 611241984 bytes

After (intra_op_num_threads = 64):
- Session creation time cost: 0.497698 s
- First inference time cost: 289 ms
- Total inference time cost: 9.49205 s
- __Average inference time cost: 189.841 ms__
- Number of inferences per second: 5.26745
- Avg CPU usage: 65%
- Peak working set size: 548470784 bytes

### Motivation and Context
This issue was reported by the M365 research team.
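A sketch of the scheduling idea (illustrative only, not the actual MLAS code): instead of always cutting the work into a fixed 16 blocks, scale the block count with the available threads so a many-core machine can keep more cores busy within a single MatMul.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: choose how many blocks to cut the target matrix
// into when dispatching GemmBatch work to the intra-op thread pool.
std::size_t ChooseBlockCount(std::size_t rows, std::size_t num_threads) {
  // Old behavior: at most 16 blocks regardless of core count.
  // return std::min<std::size_t>(rows, 16);

  // New behavior: let the thread pool size drive the partitioning,
  // so CPUs with more than 16 cores are fully utilized in one op.
  return std::min(rows, num_threads);
}
```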
ankitm3k approved these changes on Feb 28, 2025.