Offload host2 #142696

ZuseZ4 · 2025-06-18T21:54:22Z

A follow-up to my previous GPU host PR. With this, I can (in theory) run a sufficiently simple Rust function on GPUs. I tested it on AMD, where the amdgcn target of rustc causes issues due to address space casts, which might not be valid. If I (manually) fix them, I can run the generated IR on an AMD GPU. This should conceptually also work on NVIDIA or Intel. I updated the dev-guide accordingly: https://rustc-dev-guide.rust-lang.org/offload/usage.html

I am unhappy with the number of standalone functions in my offload code, so in my second commit I bundled some of the code around two structs, which are Rust versions of the LLVM/Offload structs they represent. The structs themselves only have doc comments. Since I directly lower everything to LLVM IR, I didn't see much value in modelling the struct member variables.
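To illustrate the "struct with only doc comments" pattern described above, here is a minimal sketch. It is not the code from this PR: the name `DemoEntry`, its two-field layout, and the text-based IR emission are all assumptions made for illustration (the compiler itself would go through its LLVM builder API rather than format strings). The point is only that the Rust type carries documentation and groups the lowering helpers, while the member variables exist solely in the emitted IR.

```rust
/// Hypothetical Rust-side stand-in for a struct that an offload runtime expects.
///
/// Illustrative layout (an assumption, not taken from the PR):
///   { ptr, i64 }   // address of the data, size in bytes
///
/// The struct has no fields on purpose: no value of it is ever built on the
/// Rust side. It only groups the helpers that emit the matching IR.
pub struct DemoEntry;

impl DemoEntry {
    /// Name of the IR-level struct type (hypothetical).
    pub const IR_NAME: &'static str = "%struct.demo_entry";

    /// Emits the IR type declaration as text.
    pub fn type_decl() -> String {
        format!("{} = type {{ ptr, i64 }}", Self::IR_NAME)
    }

    /// Emits a global constant of this type describing symbol `sym`,
    /// which occupies `size` bytes.
    pub fn global_init(sym: &str, size: u64) -> String {
        format!(
            "@{sym}.entry = constant {ty} {{ ptr @{sym}, i64 {size} }}",
            sym = sym,
            ty = Self::IR_NAME,
            size = size
        )
    }
}
```

Calling `DemoEntry::type_decl()` yields the type declaration line, and `DemoEntry::global_init("x", 8)` yields a matching global initializer. Keeping both next to the doc comment means the layout is documented in exactly one place.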

bors · 2025-06-26T03:40:41Z

☔ The latest upstream changes (presumably #143026) made this pull request unmergeable. Please resolve the merge conflicts.

compiler/rustc_codegen_llvm/src/builder/gpu_wrapper.rs

ZuseZ4 · 2025-07-22T19:49:34Z

We'll still need #143684 to properly recognize our GPU hardware and run the binary on end-user hardware, but here I'll only add codegen tests, so it should work fine for CI.

ZuseZ4 · 2025-07-29T21:17:19Z

Related (also WIP) rustc-dev-guide update: rust-lang/rustc-dev-guide#2524

bors · 2025-10-20T08:26:11Z

⌛ Testing commit 5bb815a with merge d1b279d...

bors · 2025-10-20T08:31:34Z

💔 Test failed - checks-actions

rust-log-analyzer · 2025-10-20T08:36:52Z

A job failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)

#4 [auth] library/ubuntu:pull token for registry-1.docker.io
#4 DONE 0.0s

#3 [internal] load metadata for docker.io/library/ubuntu:25.10
#3 ERROR: failed to authorize: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


------
---
   1 | >>> FROM ubuntu:25.10
   2 |     
   3 |     ARG DEBIAN_FRONTEND=noninteractive
--------------------
ERROR: failed to build: failed to solve: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


Command failed. Attempt 2/5:
---
#2 [auth] library/ubuntu:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/ubuntu:25.10
#3 ERROR: failed to authorize: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


------
---
   1 | >>> FROM ubuntu:25.10
   2 |     
   3 |     ARG DEBIAN_FRONTEND=noninteractive
--------------------
ERROR: failed to build: failed to solve: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


The command has failed after 5 attempts.

ZuseZ4 · 2025-10-20T09:08:14Z

spurious failure.
@bors r=oli-obk

bors · 2025-10-20T09:09:29Z

⌛ Testing commit 5bb815a with merge 2fa5c5a...

bors · 2025-10-20T09:14:46Z

💔 Test failed - checks-actions

rust-log-analyzer · 2025-10-20T09:20:59Z

A job failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)

#3 [auth] library/centos:pull token for registry-1.docker.io
#3 DONE 0.0s

#4 [internal] load metadata for docker.io/library/centos:7
#4 ERROR: failed to authorize: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


------
---
   5 | >>> FROM centos:7
   6 |     
   7 |     WORKDIR /build
--------------------
ERROR: failed to build: failed to solve: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


Command failed. Attempt 2/5:
---
#2 [auth] library/centos:pull token for registry-1.docker.io
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/centos:7
#3 ERROR: failed to authorize: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


------
---
   5 | >>> FROM centos:7
   6 |     
   7 |     WORKDIR /build
--------------------
ERROR: failed to build: failed to solve: failed to fetch oauth token: unexpected status from POST request to https://auth.docker.io/token: 503 Service Unavailable: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>


The command has failed after 5 attempts.

Zalathar · 2025-10-20T10:16:18Z

Docker appears to be working again.

@bors retry

bors · 2025-10-20T10:17:33Z

⌛ Testing commit 5bb815a with merge fd847d4...

bors · 2025-10-20T13:30:40Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing fd847d4 to master...

github-actions · 2025-10-20T13:34:10Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing ebe145e (parent) -> fd847d4 (this PR)

Test differences

No test diffs found

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard fd847d4d5d5d1e96bde2d97635faec8655da6b18 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

dist-aarch64-apple: 5865.2s -> 6773.2s (15.5%)
dist-aarch64-msvc: 5600.0s -> 6329.2s (13.0%)
x86_64-gnu-miri: 3999.1s -> 4501.4s (12.6%)
i686-msvc-2: 7889.1s -> 7060.9s (-10.5%)
dist-apple-various: 3486.4s -> 3843.8s (10.3%)
aarch64-apple: 8204.9s -> 7401.4s (-9.8%)
aarch64-gnu-debug: 3918.1s -> 3620.2s (-7.6%)
pr-check-1: 1574.7s -> 1468.2s (-6.8%)
dist-ohos-aarch64: 4346.8s -> 4622.4s (6.3%)
x86_64-mingw-2: 8093.2s -> 7581.7s (-6.3%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.
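The percentages in the job duration list appear consistent with a plain relative change against the parent run. A minimal sketch of that arithmetic, assuming this is how the report derives them (the function names are hypothetical, not taken from citool):

```rust
/// Relative change of `new` against `old`, in percent.
/// Positive means the job got slower, negative means it got faster.
pub fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

/// Rounds to one decimal place, the precision shown in the report.
pub fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}
```

For example, dist-aarch64-apple went from 5865.2s to 6773.2s, and `round1(percent_change(5865.2, 6773.2))` gives 15.5; i686-msvc-2 went from 7889.1s to 7060.9s, which gives -10.5.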

rust-timer · 2025-10-20T14:40:57Z

Finished benchmarking commit (fd847d4): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary 3.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.5% | [1.8%, 5.5%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 473.415s -> 473.5s (0.02%)
Artifact size: 388.68 MiB -> 388.65 MiB (-0.01%)

rustbot assigned oli-obk Jun 18, 2025

rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. F-autodiff `#![feature(autodiff)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 18, 2025

bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 26, 2025

ZuseZ4 force-pushed the offload-device1 branch from 100f9f3 to 7cf6e76 Compare July 2, 2025 23:56

ZuseZ4 force-pushed the offload-device1 branch from 7cf6e76 to 65bd406 Compare July 10, 2025 00:45

ZuseZ4 commented Jul 10, 2025

compiler/rustc_codegen_llvm/src/builder/gpu_wrapper.rs (outdated)

ZuseZ4 force-pushed the offload-device1 branch from 65bd406 to e30f77b Compare July 10, 2025 22:10

ZuseZ4 force-pushed the offload-device1 branch from e30f77b to 4273fb1 Compare July 22, 2025 19:47

rustbot added the A-run-make Area: port run-make Makefiles to rmake.rs label Jul 24, 2025

ZuseZ4 mentioned this pull request Jul 24, 2025

Tracking Issue for GPU-offload #131513

Open

5 tasks

ZuseZ4 force-pushed the offload-device1 branch from 28b9090 to c7a65a1 Compare July 29, 2025 20:45

ZuseZ4 mentioned this pull request Jul 29, 2025

add gpu device side instructions rust-lang/rustc-dev-guide#2524

Merged

ZuseZ4 mentioned this pull request Jul 30, 2025

Finish the std::offload module rust-lang/rust-project-goals#109

Open

8 tasks

ZuseZ4 force-pushed the offload-device1 branch 2 times, most recently from c6ca7f4 to cc13fc3 Compare July 31, 2025 22:17

ZuseZ4 force-pushed the offload-device1 branch from a4a0cb7 to 7ab1ff2 Compare August 7, 2025 00:21

ZuseZ4 force-pushed the offload-device1 branch from 7ab1ff2 to 0eb20c3 Compare August 7, 2025 23:51

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 20, 2025

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Oct 20, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 20, 2025

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Oct 20, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 20, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 20, 2025

bors merged commit fd847d4 into rust-lang:master Oct 20, 2025
12 checks passed

rustbot added this to the 1.92.0 milestone Oct 20, 2025

bors mentioned this pull request Oct 20, 2025

Offload device2 #145688

Draft

8 participants