Commit a4cbdba

Auto merge of #119290 - Kobzol:ci-docker-registry-cache, r=<try>
Cache CI Docker images in ghcr registry

This PR changes the way `rust-lang` caches Docker images used in CI workflows. Before, the intermediate Docker layers were manually exported from `docker history` and backed up in S3. However, this approach no longer works with the Docker version used by GitHub Actions since August 2023. We had to disable Docker BuildKit to keep the old caching working, but this workaround will stop working once GitHub updates Docker again and the old build backend is removed.

This PR changes the caching to use [Docker caching](https://docs.docker.com/build/cache/) instead. There are several backends for the cache; for our use case, S3 and the Docker registry make sense. This PR uses the Docker registry backend with the ghcr.io registry.

The caching creates a Docker image labeled `rust-ci`, which is currently stored in the `ghcr.io/rust-lang-ci` package registry. This image appears [here](https://ghcr.io/rust-lang-ci/rust-ci). The image is stored in `rust-lang-ci` and not `rust-lang`, because `try` and `auto` builds run in the context of that repository, so the `GITHUB_TOKEN` they use has permissions for it (unlike for `rust-lang`).

For pull request CI runs, the permissions of the provided `GITHUB_TOKEN` are automatically reduced to `packages: read`, which means that we won't be able to write the Docker image. If we're not able to write, we won't have anything to read. So I disabled the caching entirely for PR runs (building the Docker image is slightly faster if we don't have to deal with exporting and using a separate build driver). Note that before this PR, we also weren't able to read or write the cache on PR runs.

Related issue: rust-lang/infra-team#81

r? `@Mark-Simulacrum`
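The PR/non-PR split described in the commit message can be sketched as a small shell helper. This is an illustrative sketch only, not code from this commit: the function name `cache_args` and its two positional parameters are hypothetical, and it prints the cache flags instead of invoking Docker.

```shell
#!/bin/bash
# Hypothetical helper (not part of run.sh): print the buildx cache flags
# for a given image checksum, or nothing at all for PR jobs.
cache_args() {
    pr_ci_job="$1"   # assumption: "1" on pull request runs, like PR_CI_JOB
    cksum="$2"       # checksum that identifies the Docker build inputs
    ref="ghcr.io/rust-lang-ci/rust-ci:${cksum}"

    # PR jobs only get `packages: read`, so caching is skipped entirely.
    if [ "$pr_ci_job" = "1" ]; then
        return 0
    fi

    echo "--cache-from type=registry,ref=${ref}"
    echo "--cache-to type=registry,ref=${ref},compression=zstd,mode=min"
}

# Example: a non-PR job with a made-up checksum prints both flags.
cache_args 0 abc123
```

Calling `cache_args 1 abc123` prints nothing, which corresponds to the plain `docker build` path used on PR runs.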
2 parents 1a7e97f + 6f90144 commit a4cbdba

3 files changed, +37 -47 lines changed

.github/workflows/ci.yml

+7
@@ -42,6 +42,7 @@ jobs:
       CI_JOB_NAME: "${{ matrix.name }}"
       CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
       HEAD_SHA: "${{ github.event.pull_request.head.sha || github.sha }}"
+      DOCKER_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
       SCCACHE_BUCKET: rust-lang-ci-sccache2
       TOOLSTATE_REPO: "https://github.com/rust-lang-nursery/rust-toolstate"
       CACHE_DOMAIN: ci-caches.rust-lang.org
@@ -168,10 +169,13 @@ jobs:
         if: "success() && !env.SKIP_JOB && (github.event_name == 'push' || env.DEPLOY == '1' || env.DEPLOY_ALT == '1')"
   auto:
     name: "auto - ${{ matrix.name }}"
+    permissions:
+      packages: write
     env:
       CI_JOB_NAME: "${{ matrix.name }}"
       CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
       HEAD_SHA: "${{ github.event.pull_request.head.sha || github.sha }}"
+      DOCKER_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
       SCCACHE_BUCKET: rust-lang-ci-sccache2
       DEPLOY_BUCKET: rust-lang-ci2
       TOOLSTATE_REPO: "https://github.com/rust-lang-nursery/rust-toolstate"
@@ -561,11 +565,14 @@ jobs:
         if: "success() && !env.SKIP_JOB && (github.event_name == 'push' || env.DEPLOY == '1' || env.DEPLOY_ALT == '1')"
   try:
     name: "try - ${{ matrix.name }}"
+    permissions:
+      packages: write
     env:
       DIST_TRY_BUILD: 1
       CI_JOB_NAME: "${{ matrix.name }}"
       CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
       HEAD_SHA: "${{ github.event.pull_request.head.sha || github.sha }}"
+      DOCKER_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
       SCCACHE_BUCKET: rust-lang-ci-sccache2
       DEPLOY_BUCKET: rust-lang-ci2
       TOOLSTATE_REPO: "https://github.com/rust-lang-nursery/rust-toolstate"

src/ci/docker/run.sh

+25-47
@@ -74,25 +74,6 @@ if [ -f "$docker_dir/$image/Dockerfile" ]; then
 
       cksum=$(sha512sum $hash_key | \
         awk '{print $1}')
-
-      url="https://$CACHE_DOMAIN/docker/$cksum"
-
-      echo "Attempting to download $url"
-      rm -f /tmp/rustci_docker_cache
-      set +e
-      retry curl --max-time 600 -y 30 -Y 10 --connect-timeout 30 -f -L -C - \
-        -o /tmp/rustci_docker_cache "$url"
-
-      docker_archive_hash=$(sha512sum /tmp/rustci_docker_cache | awk '{print $1}')
-      echo "Downloaded archive hash: ${docker_archive_hash}"
-
-      echo "Loading images into docker"
-      # docker load sometimes hangs in the CI, so time out after 10 minutes with TERM,
-      # KILL after 12 minutes
-      loaded_images=$(/usr/bin/timeout -k 720 600 docker load -i /tmp/rustci_docker_cache \
-        | sed 's/.* sha/sha/')
-      set -e
-      printf "Downloaded containers:\n$loaded_images\n"
     fi
 
     dockerfile="$docker_dir/$image/Dockerfile"
@@ -103,39 +84,36 @@ if [ -f "$docker_dir/$image/Dockerfile" ]; then
       context="$script_dir"
     fi
     echo "::group::Building docker image for $image"
+    echo "Image checksum ${cksum}"
 
-    # As of August 2023, Github Actions have updated Docker to 23.X,
-    # which uses the BuildKit by default. It currently throws aways all
-    # intermediate layers, which breaks our usage of S3 layer caching.
-    # Therefore we opt-in to the old build backend for now.
-    export DOCKER_BUILDKIT=0
-    retry docker \
-      build \
-      --rm \
-      -t rust-ci \
-      -f "$dockerfile" \
-      "$context"
+    # On PR jobs, we don't have permissions to write to the cache, so we should not use
+    # `docker login` nor caching.
+    if [ "$PR_CI_JOB" -eq 1 ]
+    then
+        retry docker build --rm -t rust-ci -f "$dockerfile" "$context"
+    else
+        docker buildx create --use --driver docker-container
+
+        # Login to Docker registry
+        echo ${DOCKER_TOKEN} | docker login ghcr.io --username rust-lang-ci --password-stdin
+
+        dest="type=registry,ref=ghcr.io/rust-lang-ci/rust-ci:${cksum},compression=zstd,mode=min"
+
+        retry docker \
+          buildx \
+          build \
+          --rm \
+          -t rust-ci \
+          -f "$dockerfile" \
+          --cache-from type=registry,ref=ghcr.io/rust-lang-ci/rust-ci:${cksum} \
+          --cache-to ${dest} \
+          --output=type=docker \
+          "$context"
+    fi
     echo "::endgroup::"
 
     if [ "$CI" != "" ]; then
-      s3url="s3://$SCCACHE_BUCKET/docker/$cksum"
-      upload="aws s3 cp - $s3url"
       digest=$(docker inspect rust-ci --format '{{.Id}}')
-      echo "Built container $digest"
-      if ! grep -q "$digest" <(echo "$loaded_images"); then
-        echo "Uploading finished image $digest to $url"
-        set +e
-        # Print image history for easier debugging of layer SHAs
-        docker history rust-ci
-        docker history -q rust-ci | \
-          grep -v missing | \
-          xargs docker save | \
-          gzip | \
-          $upload
-        set -e
-      else
-        echo "Looks like docker image is the same as before, not uploading"
-      fi
       # Record the container image for reuse, e.g. by rustup.rs builds
       info="$dist/image-$image.txt"
       mkdir -p "$dist"
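
The registry cache ref in `run.sh` is keyed by `cksum`, the SHA-512 of the hash key file. The following is a minimal sketch of why that works as a cache key, assuming a throwaway temporary file stands in for the real hash key (its contents here are made up for illustration): any change to the hashed inputs produces a different tag, so the build looks up a different cache ref.

```shell
#!/bin/bash
# Sketch only: `hash_key` is a temporary stand-in for the hash key file
# that run.sh hashes, and its contents are invented for this example.
hash_key="$(mktemp)"
printf 'FROM ubuntu:22.04\n' > "$hash_key"

# Same pipeline as run.sh: keep only the hex digest column.
cksum_before=$(sha512sum "$hash_key" | awk '{print $1}')

# Changing the hashed input changes the digest, and with it the cache tag.
printf 'RUN apt-get update\n' >> "$hash_key"
cksum_after=$(sha512sum "$hash_key" | awk '{print $1}')

echo "before: ghcr.io/rust-lang-ci/rust-ci:${cksum_before}"
echo "after:  ghcr.io/rust-lang-ci/rust-ci:${cksum_after}"

rm -f "$hash_key"
```

A SHA-512 hex digest is 128 characters long, which fits within the 128-character limit on Docker image tags.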

src/ci/github-actions/ci.yml

+5
@@ -34,6 +34,7 @@ x--expand-yaml-anchors--remove:
     CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
     # commit of PR sha or commit sha. `GITHUB_SHA` is not accurate for PRs.
     HEAD_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
+    DOCKER_TOKEN: ${{ secrets.GITHUB_TOKEN }}
 
   - &public-variables
     SCCACHE_BUCKET: rust-lang-ci-sccache2
@@ -345,6 +346,8 @@ jobs:
   auto:
     <<: *base-ci-job
     name: auto - ${{ matrix.name }}
+    permissions:
+      packages: write
     env:
       <<: [*shared-ci-variables, *prod-variables]
     if: github.event_name == 'push' && github.ref == 'refs/heads/auto' && github.repository == 'rust-lang-ci/rust'
@@ -725,6 +728,8 @@ jobs:
   try:
     <<: *base-ci-job
     name: try - ${{ matrix.name }}
+    permissions:
+      packages: write
     env:
       DIST_TRY_BUILD: 1
       <<: [*shared-ci-variables, *prod-variables]
