[Benchmark][feature request] Prepare for execuTorch failure handling #6391

yangw-dev · 2025-03-12T03:51:10Z

Description

Issue: #6294
Prepare the mobile_job yml to generate a benchmark record when a job fails.

Background:

When a git benchmark job fails (or some of its mobile jobs fail), we need to generate a benchmark record indicating that the model has failures.

For instance, take a benchmark job named: benchmark-on-device (ic3, coreml_fp16, apple_iphone_15, arn:aws:devicefarm:us-west-2:308535385114... / mobile-job (ios)
When the whole job fails, we want to indicate that the model ic3 with backend coreml_fp16 on iOS failed for all metrics.
When one of the devices in the job fails (e.g. iPhone 15 with OS 17.1), we want to indicate that the model ic3 with backend coreml_fp16 failed for iPhone 15 with OS 17.1, while the others succeeded.
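
To make the two cases concrete, here is a minimal sketch of pulling model info out of a job name and building a failure record from it. The name layout is inferred only from the single example above, and `parse_git_job_name`, `failure_record` and every record field name are hypothetical, not the actual script's API.

```python
import re
from typing import Optional


def parse_git_job_name(name: str) -> Optional[dict]:
    """Extract model info from a job name shaped like
    '<prefix> (<model>, <backend>, <device pool>, ...) / mobile-job (<platform>)'.

    Assumption: this layout is guessed from one example job name.
    """
    head, sep, tail = name.rpartition(" / ")
    platform = re.search(r"\(([^)]+)\)", tail)
    start = head.find("(")
    if not sep or platform is None or start == -1:
        return None
    # Tolerate a truncated name (no closing parenthesis) as well as a full one.
    fields = [f.strip() for f in head[start + 1:].rstrip(") ").split(",")]
    if len(fields) < 3:
        return None
    return {
        "model": fields[0],
        "backend": fields[1],
        "device_pool": fields[2],
        "platform": platform.group(1).strip(),
    }


def failure_record(info: dict, device: Optional[str] = None,
                   os_version: Optional[str] = None) -> dict:
    """Build a hypothetical failure record.

    device=None marks a whole-job failure; otherwise only that device failed.
    """
    return {
        "model": info["model"],
        "backend": info["backend"],
        "platform": info["platform"],
        "scope": "job" if device is None else "device",
        "device": device,
        "os": os_version,
        "status": "failed",
    }
```

Calling `failure_record(info)` covers the whole-job case, and `failure_record(info, "iPhone 15", "17.1")` covers the single-device case.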

Key point: always generate the artifact JSON with the git job name.

Change Details

[yaml] Add logic to generate artifact.json if any previous step fails and the expected artifact.json is missing. This ensures we always have an artifact JSON containing the git job name.
[script] Add a flag --new-json-output-format that toggles the mobile job to generate artifact.json in the new format.
- See the example of the new JSON result (s3 link).
[script] Add git_job_name, run_report and job_reports to artifacts.json.
- git_job_name: used to build a benchmark record if a git job failed [a workaround for grabbing model info].
- job_reports & run_report: we currently have no extra info about mobile job conclusions; these can be uploaded to time_series or a notification system for failure details.
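
The fallback described in the [yaml] item can be sketched as follows. This is only an illustration written in Python rather than the workflow's own step. The helper name `ensure_artifact_json` and the choice of empty placeholder values are assumptions; the four keys are the ones listed in this description.

```python
import json
from pathlib import Path


def ensure_artifact_json(path: Path, git_job_name: str) -> bool:
    """Write a placeholder artifact file if none exists.

    Returns True when a placeholder was written, False when a real
    artifact file was already present and left untouched.
    """
    if path.exists():
        return False
    placeholder = {
        "git_job_name": git_job_name,  # lets a consumer recover model info
        "artifacts": [],               # assumption: empty when nothing ran
        "run_report": {},
        "job_reports": [],
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(placeholder, indent=2))
    return True
```

Leaving an existing file untouched matters for the case where the benchmark test failed but still produced an artifact.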

PRs that simulate failure cases for the generation logic

Mimic a step failing before the benchmark test (no JSON generated): #6397
Mimic the benchmark test step failing, but with an artifact: #6398
ExecuTorch Sync Test: pytorch/executorch#9204

Details

When the flag is on, the content of artifact.json changes from

[ 
   ....
]

to

{
    "git_job_name": str,
    "artifacts": [ ],
    "run_report": {},
    "job_reports": [....]
}

This flag is temporary; it is only needed until the logic is in sync between repos.
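
While both formats exist, a consumer may see either shape. A minimal sketch of a reader that accepts both is below. The function name `normalize_artifact_payload` and the default values used for the old format are assumptions, not part of this PR.

```python
from typing import Any


def normalize_artifact_payload(payload: Any) -> dict:
    """Return the new-format dict for either artifact.json shape."""
    if isinstance(payload, list):
        # Old format: a bare list of artifacts with no job metadata.
        return {
            "git_job_name": None,
            "artifacts": payload,
            "run_report": {},
            "job_reports": [],
        }
    if isinstance(payload, dict):
        # New format: fill in any missing key with an empty default.
        return {
            "git_job_name": payload.get("git_job_name"),
            "artifacts": payload.get("artifacts", []),
            "run_report": payload.get("run_report", {}),
            "job_reports": payload.get("job_reports", []),
        }
    raise TypeError(f"unexpected artifact payload type: {type(payload).__name__}")
```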

.github/workflows/mobile_job.yml

.github/workflows/test_mobile_job.yml

test

b57c265

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 12, 2025

yangw-dev added 7 commits March 11, 2025 20:52

test

ad51d28

test

fecdc3e

test

3e66c45

test

9b6a38c

test

00c7dd9

add test

8f97f4a

add test

3b81aba

yangw-dev requested a review from ZainRizvi March 12, 2025 05:08

yangw-dev marked this pull request as ready for review March 12, 2025 05:08

yangw-dev added 5 commits March 11, 2025 22:41

fix bug

c6ab224

fix bug

d2337f9

fix bug

bbaa18c

fix bug

1314a61

fix bug

af37d5a

github-advanced-security bot found potential problems Mar 12, 2025

yangw-dev marked this pull request as draft March 12, 2025 19:12

yangw-dev removed the request for review from ZainRizvi March 12, 2025 19:12

add echo

6fcec07

github-advanced-security bot found potential problems Mar 12, 2025

yangw-dev added 2 commits March 12, 2025 12:29

add echo

555ab35

add echo

a7c5b73

github-advanced-security bot found potential problems Mar 12, 2025

yangw-dev added 3 commits March 12, 2025 12:37

add echo

c66ca53

add echo

a930214

add echo

f289f71

yangw-dev changed the title ~~[Benchmark] Add flag to enbale/disable new json format in mobile artifacts file~~ [Benchmark] Prepare for execuTorch failure handling Mar 12, 2025

yangw-dev requested a review from ZainRizvi March 12, 2025 21:02

yangw-dev requested a review from huydhn March 12, 2025 21:03

yangw-dev marked this pull request as ready for review March 12, 2025 21:03

type

444eee9

ZainRizvi approved these changes Mar 12, 2025

huydhn reviewed Mar 12, 2025

yangw-dev added 3 commits March 12, 2025 16:14

type

8047839

type

690a42c

type

633fbb2

huydhn approved these changes Mar 12, 2025

yangw-dev merged commit 136177d into main Mar 13, 2025
10 checks passed

yangw-dev deleted the addNewJsonOutput branch March 13, 2025 00:37

huydhn mentioned this pull request Apr 2, 2025

Try upgrading to CoreML Tools 8.2 pytorch/executorch#9807

Merged

yangw-dev changed the title ~~[Benchmark] Prepare for execuTorch failure handling~~ [Benchmark][feature request] Prepare for execuTorch failure handling May 21, 2025
