forked from microsoft/onnxruntime
Sync With latest msft commits #600
Merged
Conversation
### Description
A new interface for interaction between ONNX Runtime and Vitis AI has been added, which uses `std::filesystem::path` to pass paths.

### Motivation and Context
Vitis AI uses `std::string` to pass paths, which causes errors on Windows when the model name contains Chinese characters. This PR therefore adds an interface that uses `std::filesystem::path` to pass paths, ensuring that file paths are transmitted correctly.

Co-authored-by: genmingz <[email protected]>
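For context, a minimal C++ sketch of why the narrow-string interface fails (function names are hypothetical, not the actual Vitis AI interface): on Windows, a `std::string` path is interpreted in the current ANSI code page, so non-ASCII model names can be garbled before they reach the OS, while `std::filesystem::path` preserves the native (wide) encoding end to end.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: a narrow string may not round-trip Chinese
// characters through the ANSI code page on Windows.
void LoadModelNarrow(const std::string& model_path) {
  std::ifstream f(model_path);  // may fail to open non-ASCII names on Windows
}

// Hypothetical helper: fs::path keeps the platform's native encoding,
// and std::ifstream accepts it directly since C++17.
void LoadModelPath(const std::filesystem::path& model_path) {
  std::ifstream f(model_path);  // opens correctly regardless of characters
}
```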
### Description
1. Re-enable UTs that pass with QNN SDK 2.30.
2. Update the resize UT, because "round_prefer_floor" is no longer supported in the QNN SDK since 2.21.

### Motivation and Context
Make as many QNN EP UTs pass as possible to improve test coverage.

Co-authored-by: Kuan-Yu Lin <[email protected]>
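For context on the resize change: per the ONNX Resize spec, `round_prefer_floor` rounds halfway cases down while `round_prefer_ceil` rounds them up, so tests pinned to the former need different expected values once that mode is unavailable. A minimal sketch of the two modes (helper names are illustrative, not ORT or QNN code):

```cpp
#include <cmath>
#include <cstdio>

// Illustrative helpers: the two ONNX Resize "nearest_mode" values
// agree everywhere except exact .5 ties.
double RoundPreferFloor(double x) { return std::ceil(x - 0.5); }   // 2.5 -> 2
double RoundPreferCeil(double x)  { return std::floor(x + 0.5); }  // 2.5 -> 3

int main() {
  std::printf("%g %g\n", RoundPreferFloor(2.5), RoundPreferCeil(2.5));  // 2 3
  return 0;
}
```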
### Description
Make the [Nuget Publishing](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1313&_a=summary) pipeline 1ES compliant.
### Description
Upgrade React Native to 0.73.11, including Android and iOS changes. This PR also includes the E2E test changes. Used the React Native [upgrade helper](https://react-native-community.github.io/upgrade-helper/?from=0.72.11&to=0.73.11&package=onnxruntime-android&name=onnxruntime) as the reference.

### Motivation and Context
A newer RN version is needed to fix S360 work items.
### Description
Make the [Nuget CUDA 12 Publish Pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1312&_a=summary) 1ES compliant.
### Description
The original UT used a random seed; change it to a fixed seed.

### Motivation and Context
Fix a flaky UT.
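A minimal sketch of the fix pattern (names are illustrative, not the actual ORT test code): seeding the generator from `std::random_device` makes test inputs differ run to run, so tolerance checks can fail intermittently; a fixed seed makes the generated data, and therefore the test, reproducible.

```cpp
#include <random>
#include <vector>

// Illustrative test-data helper: a fixed seed instead of
// std::random_device{}() yields the same inputs on every run.
std::vector<float> MakeTestData(std::size_t n) {
  std::mt19937 gen(42);  // fixed seed for deterministic, repeatable tests
  std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
  std::vector<float> data(n);
  for (auto& v : data) v = dist(gen);
  return data;
}
```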
### Description
* Update the OSS parser version to the latest commit of the 10.8-GA branch.

### Motivation and Context
* Action needed to adapt the latest onnx-tensorrt 10.8-GA branch, fixing the ScatterND attribute issue and the `plugin.h` not found issue.
### Description
This commit fixes alignment issues in shader code.

### Motivation and Context
See above.
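As a generic illustration of this class of bug (not the actual change in this commit): WGSL uniform-buffer layout rules align a `vec3<f32>` to 16 bytes, so a host-side struct mirroring shader uniforms must pad to match, or the shader reads misaligned data.

```cpp
#include <cstddef>

// Illustrative host-side mirror of a shader uniform block: the explicit
// padding keeps member offsets in sync with WGSL's 16-byte vec3 alignment.
struct alignas(16) Uniforms {
  float scale[3];   // occupies 12 bytes...
  float _pad;       // ...padded to 16 so the next member starts aligned
  float offset[4];  // 16 bytes
};
static_assert(sizeof(Uniforms) == 32, "must match the shader-side layout");
```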
### Description
Upgrade EMSDK to 4.0.4.

### Motivation and Context
Emscripten v4.0.4 brings two useful changes that are helpful for WebGPU:
- emscripten-core/emscripten#23678
- emscripten-core/emscripten#23631
### Description
This PR enables support for the following contrib ops in OVEP:
- DynamicQuantizeMatMul
- FusedMatMul
- QuickGelu
- SkipSimplifiedLayerNormalization

Co-authored-by: n1harika <[email protected]>
…icrosoft#23829)

### Description
To resolve microsoft#23821.

### Motivation and Context
Similar to microsoft#23641.
… testing (microsoft#23801)

Summary of changes:
- Changed the OpenVINO test case to use --enable_generic_interface.
- Changed the TensorRT test case to use --enable_generic_interface.
- Fixed ORT builds to USE_FULL_PROTOBUF, as OpenVINO/TensorRT require it.
- Fixed a pre-processor macro definition that was accidentally removed when ORT is built without an EP.

Co-authored-by: Karim Vadsariya <[email protected]>
…icrosoft#23825)

### Description
Increase the [npm package pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=1080&_a=summary) ReactNative_CI_iOS timeout to 120 minutes.
### Description
In GemmBatch, the target matrix is cut into blocks that are dispatched to multiple threads for intra-op parallelism. Currently the block count is hard-coded to 16, so on a CPU with more than 16 cores, the cores are not fully utilized within one op. This change removes that cap on the number of blocks in various MatMul paths.

__Benchmark results__

Model: llmlingua-2-bert-base-multilingual-cased-meetingbank--add-force-token-100--max-seq-len-512-CPU-INT8.onnx
Setup: 96-core x86 Linux; dimension batch_size overridden to 3; 50 inference requests per run.

Before (intra_op_num_threads = 64):
- Session creation time cost: 0.485097 s
- First inference time cost: 356 ms
- Total inference time cost: 17.731 s
- __Average inference time cost: 354.619 ms__
- Number of inferences per second: 2.81989
- Avg CPU usage: 65%
- Peak working set size: 542265344 bytes

After (intra_op_num_threads = 32):
- Session creation time cost: 0.523394 s
- First inference time cost: 316 ms
- Total inference time cost: 12.2739 s
- __Average inference time cost: 245.478 ms__
- Number of inferences per second: 4.07362
- Avg CPU usage: 33%
- Peak working set size: 611241984 bytes

After (intra_op_num_threads = 64):
- Session creation time cost: 0.497698 s
- First inference time cost: 289 ms
- Total inference time cost: 9.49205 s
- __Average inference time cost: 189.841 ms__
- Number of inferences per second: 5.26745
- Avg CPU usage: 65%
- Peak working set size: 548470784 bytes

### Motivation and Context
This issue was reported by the M365 research team.
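A sketch of the scheduling idea (illustrative only, not the actual MLAS code): instead of always cutting the work into a fixed 16 blocks, scale the block count with the available threads so a many-core machine can keep more cores busy within a single MatMul.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: choose how many blocks to cut the target matrix
// into when dispatching GemmBatch work to the intra-op thread pool.
std::size_t ChooseBlockCount(std::size_t rows, std::size_t num_threads) {
  // Old behavior: at most 16 blocks regardless of core count.
  // return std::min<std::size_t>(rows, 16);

  // New behavior: let the thread pool size drive the partitioning,
  // so CPUs with more than 16 cores are fully utilized in one op.
  return std::min(rows, num_threads);
}
```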
ankitm3k approved these changes on Feb 28, 2025.