
Update run-readme-pr-linuxaarch64.yml to use correct runner #1469


Merged
12 commits merged on Jan 23, 2025
74 changes: 32 additions & 42 deletions .github/workflows/run-readme-pr-linuxaarch64.yml
@@ -9,22 +9,20 @@ on:

 jobs:
   test-readme-cpu:
-    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
+    permissions:
+      id-token: write
+      contents: read
     with:
-      runner: linux-aarch64
-      gpu-arch-type: cuda
-      gpu-arch-version: "12.1"
+      runner: linux.arm64.2xlarge
+      docker-image: "pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main"
+      gpu-arch-type: cpu-aarch64
       timeout: 60

@atalman commented on Jan 23, 2025


try passing

docker-image: "pytorch/manylinuxaarch64-builder:cuda12.1-main"

Looks like the error:
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

is related to the fact that this is using the docker-image=pytorch/conda-builder:cuda12.1 image by default, which is not correct for the linux.arm64.m7g.4xlarge runner.
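
A quick way to confirm which platforms a given tag actually publishes, before wiring it into the workflow (a minimal sketch, assuming a local Docker CLI with the buildx plugin; the image names are the ones discussed in this thread):

# List the platforms published for the image the job was pulling by default.
# A tag that only publishes linux/amd64 triggers the warning quoted above
# when pulled on an arm64 runner.
docker buildx imagetools inspect pytorch/conda-builder:cuda12.1

# Same check for the aarch64 builder image this workflow ends up using.
docker buildx imagetools inspect pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main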

@Jack-Khuu (Contributor, Author) commented on Jan 23, 2025

Doesn't look like it can find that Docker image verbatim; testing with the 12.6 version found in pt/pt.

A reviewer replied:

If using linux_job_v2.yml, you can try using the latest image, pytorch/manylinux2_28_aarch64-builder:cuda12.6.
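
Whether a proposed tag exists at all can be checked locally before kicking off another CI run (a minimal sketch, assuming a Docker CLI new enough to support manifest inspection without the experimental flag):

# Prints the manifest JSON if the tag exists on Docker Hub;
# fails with "no such manifest" if it does not.
docker manifest inspect pytorch/manylinux2_28_aarch64-builder:cuda12.6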

@Jack-Khuu (Contributor, Author) commented on Jan 23, 2025

Doesn't look like that CUDA version exists (manylinux2_28_aarch64-builder:cuda12.6), but the CPU variant :cpu-aarch64-main with linux_job_v2 seems to be the right track.

Now we're just down to a missing devtoolset-10-binutils, which is curious since pt/pt uses v10 for aarch64.
Edit: Resolved; the pip installs were unnecessary
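
One way to confirm that the image's stock objcopy already supports --set-section-alignment, which would make the install step removed in the diff below redundant (a minimal sketch, assuming the cpu-aarch64-main tag above, that objcopy is on the image's default PATH, and an arm64 host or --platform emulation):

# Print the binutils version and count how many help entries mention the flag.
docker run --rm --platform linux/arm64 \
  pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main \
  sh -c 'objcopy --version | head -n1; objcopy --help | grep -c -- --set-section-alignment'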

       script: |
         echo "::group::Print machine info"
         uname -a
         echo "::endgroup::"

-        echo "::group::Install newer objcopy that supports --set-section-alignment"
-        yum install -y devtoolset-10-binutils
-        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
-        echo "::endgroup::"

         TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs readme

         echo "::group::Completion"

@@ -33,41 +31,37 @@ jobs:
         echo "::endgroup::"

   test-quantization-cpu:
-    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
+    permissions:
+      id-token: write
+      contents: read
     with:
-      runner: linux-aarch64
-      gpu-arch-type: cuda
-      gpu-arch-version: "12.1"
+      runner: linux.arm64.2xlarge
+      docker-image: "pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main"
+      gpu-arch-type: cpu-aarch64
       timeout: 60
       script: |
         echo "::group::Print machine info"
         uname -a
         echo "::endgroup::"

-        echo "::group::Install newer objcopy that supports --set-section-alignment"
-        yum install -y devtoolset-10-binutils
-        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
-        echo "::endgroup::"

         TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs quantization

   test-gguf-cpu:
     uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    permissions:
+      id-token: write
+      contents: read
     with:
-      runner: linux-aarch64
-      gpu-arch-type: cuda
-      gpu-arch-version: "12.1"
+      runner: linux.arm64.2xlarge
+      docker-image: "pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main"
+      gpu-arch-type: cpu-aarch64
       timeout: 60
       script: |
         echo "::group::Print machine info"
         uname -a
         echo "::endgroup::"

-        echo "::group::Install newer objcopy that supports --set-section-alignment"
-        yum install -y devtoolset-10-binutils
-        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
-        echo "::endgroup::"

         TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs gguf

         echo "::group::Completion"
Expand All @@ -77,21 +71,19 @@ jobs:

test-advanced-cpu:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
permissions:
id-token: write
contents: read
with:
runner: linux-aarch64
gpu-arch-type: cuda
gpu-arch-version: "12.1"
runner: linux.arm64.2xlarge
docker-image: "pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main"
gpu-arch-type: cpu-aarch64
timeout: 60
script: |
echo "::group::Print machine info"
uname -a
echo "::endgroup::"

echo "::group::Install newer objcopy that supports --set-section-alignment"
yum install -y devtoolset-10-binutils
export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
echo "::endgroup::"

TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs advanced

echo "::group::Completion"
@@ -101,21 +93,19 @@

   test-evaluation-cpu:
     uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
+    permissions:
+      id-token: write
+      contents: read
     with:
-      runner: linux-aarch64
-      gpu-arch-type: cuda
-      gpu-arch-version: "12.1"
+      runner: linux.arm64.2xlarge
+      docker-image: "pytorch/manylinux2_28_aarch64-builder:cpu-aarch64-main"
+      gpu-arch-type: cpu-aarch64
       timeout: 60
       script: |
         echo "::group::Print machine info"
         uname -a
         echo "::endgroup::"

-        echo "::group::Install newer objcopy that supports --set-section-alignment"
-        yum install -y devtoolset-10-binutils
-        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
-        echo "::endgroup::"

         TORCHCHAT_DEVICE=cpu .ci/scripts/run-docs evaluation

         echo "::group::Completion"