cmd/dist: check that builds are reproducible #58884

@rsc

For #24904, we have made changes to Go 1.21 to make the Go distribution builds trivially reproducible: given the source archive and a Go bootstrap toolchain that is new enough (Go 1.21 requires Go 1.17 or later), running make.bash or make.bat should produce the same binaries no matter the details of the host system, and adding the -distpack flag should produce the same archives no matter the details of the host system. Setting GOOS and GOARCH during this command cross-compiles a distribution for another system, and those should also be the same archives no matter the details of the host system. For example, building a linux/amd64 distribution should produce the same archive no matter whether the build happens on Linux, macOS, or Windows; no matter whether the host system is an x86 or an arm64; no matter where the archive is extracted; and so on.

The source archive https://swtch.com/tmp/go1.21repro4.src.tar.gz holds the source tree for a Go release claiming to be go1.21repro4. The first 64 bits of its SHA256 hash are 0322e4c62dd8d770 (use openssl sha256 go1.21repro4.src.tar.gz on Unix or certutil -hashfile go1.21repro4.src.tar.gz sha256 on Windows).
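
The hash check can be wrapped in a small shell function. This is only a sketch: the name sha256_prefix is made up for illustration and is not part of the Go tree, and it assumes that at least one of sha256sum, shasum, or openssl is installed.

```shell
# sha256_prefix FILE: print the first 16 hex digits (64 bits) of FILE's SHA-256.
# Hypothetical helper; it uses whichever common hashing tool is available.
sha256_prefix() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -c1-16
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 256 "$1" | cut -c1-16
    else
        # openssl prints "<label>(file)= <hex>"; the digest is the last field.
        openssl sha256 "$1" | awk '{print substr($NF, 1, 16)}'
    fi
}

# Compare against the prefix quoted in the issue text, if the archive is present.
if [ -f go1.21repro4.src.tar.gz ]; then
    if [ "$(sha256_prefix go1.21repro4.src.tar.gz)" = 0322e4c62dd8d770 ]; then
        echo "source archive hash prefix matches"
    else
        echo "source archive hash prefix DOES NOT match" >&2
    fi
fi
```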

If you expand that source archive and cd into go/src and run ./repro.bash (or repro.bat on Windows), it will run for quite a while building distribution archives for various systems. If you save the output and filter it through grep distpack:, the result should match the canonical hashes (distpack-golden.txt in the commands below). For example:

curl -O https://swtch.com/tmp/go1.21repro4.src.tar.gz
tar xzf go1.21repro4.src.tar.gz
cd go/src
./repro.bash 2>&1 | tee ../../repro.txt  # DO NOT WRITE TO LOCAL DIRECTORY
cd ../..
grep distpack: repro.txt >distpack.txt
curl -O https://swtch.com/tmp/distpack-golden.txt
diff distpack-golden.txt distpack.txt

or on Windows:

curl -O https://swtch.com/tmp/go1.21repro4.src.tar.gz
tar xzf go1.21repro4.src.tar.gz
cd go\src
rem DO NOT WRITE repro.txt INTO THE CURRENT DIRECTORY
repro.bat >..\..\repro.txt
cd ..\..
find "distpack:" repro.txt >distpack.txt
curl -O https://swtch.com/tmp/distpack-golden-crlf.txt
fc distpack-golden-crlf.txt distpack.txt

As noted in the comments, do not write repro.txt into the current directory, or else it will be included in the archives and affect their hashes.
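
When the diff is non-empty, it can help to see only the distpack: lines that disagree. The following is a minimal sketch under stated assumptions: distpack_mismatches is a hypothetical helper name, and the only thing it assumes about the line format is the distpack: marker itself. It strips carriage returns so that output captured on Windows can be compared with the Unix golden file.

```shell
# distpack_mismatches GOLDEN LOCAL: print the distpack: lines that appear in
# only one of the two files, and succeed only if there are none.
# Hypothetical helper, not part of the Go tree.
distpack_mismatches() {
    local work status
    work=$(mktemp -d) || return 2
    # Keep only the distpack: lines, normalize line endings, and sort for comm.
    grep 'distpack:' "$1" | tr -d '\r' | sort >"$work/golden"
    grep 'distpack:' "$2" | tr -d '\r' | sort >"$work/local"
    # comm -3 drops lines common to both inputs, leaving only disagreements.
    # Lines found only in LOCAL are printed with a leading tab.
    comm -3 "$work/golden" "$work/local" >"$work/diff"
    cat "$work/diff"
    if [ -s "$work/diff" ]; then status=1; else status=0; fi
    rm -rf "$work"
    return "$status"
}
```

Run from the directory holding both files, a call such as distpack_mismatches distpack-golden.txt repro.txt prints nothing and exits 0 when every hash matches.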

You can test a single build using

cd go/src
GOOS=<goos> GOARCH=<goarch> ./make.bash -distpack

Omit GOOS and GOARCH to test the build for the local system. Note that the canonical linux-arm build also sets GOARM=6. Other variables like CC and CGO_ENABLED should be unset.
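
One way to make sure stray settings do not leak into a single build is to run it in a subshell that clears them first. This is a sketch, not an official script: build_dist and the MAKE_BASH override are hypothetical names introduced here for illustration, and the function is meant to be run from go/src.

```shell
# build_dist [GOOS GOARCH [GOARM]]: run make.bash -distpack with CC and
# CGO_ENABLED unset. With no arguments it builds for the local system.
# Hypothetical helper; MAKE_BASH is an illustrative override that the Go
# tree itself does not read.
build_dist() (
    # The function body is a subshell, so none of this affects the caller.
    unset CC CGO_ENABLED GOOS GOARCH GOARM
    if [ -n "${1:-}" ]; then export GOOS="$1"; fi
    if [ -n "${2:-}" ]; then export GOARCH="$2"; fi
    if [ -n "${3:-}" ]; then export GOARM="$3"; fi
    "${MAKE_BASH:-./make.bash}" -distpack
)
```

For example, build_dist linux arm 6 corresponds to the canonical linux-arm configuration mentioned above, and build_dist with no arguments builds for the host.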

If you find a system configuration where the script runs a build successfully but produces a different archive hash than the canonical ones, please check what is different by comparing against the reference archives. Good ways to identify differences include:

  • using unzip -lv on each archive and diffing those outputs
  • using tar tzvf on each archive and diffing those outputs
  • unpacking each archive into its own directory and using diff -r
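
The first two approaches can be combined into one helper that picks the listing tool by file extension. This is a rough sketch: list_archive and diff_archive_listings are hypothetical names, and it assumes the archives end in .zip, .tar.gz, or .tgz.

```shell
# list_archive FILE: print a verbose listing, choosing the tool by extension.
# Note that unzip echoes the archive path in its first output line, so that
# line differs whenever the two archives have different paths.
list_archive() {
    case "$1" in
        *.zip)          unzip -lv "$1" ;;
        *.tar.gz|*.tgz) tar tzvf "$1" ;;
        *) echo "list_archive: unrecognized archive type: $1" >&2; return 2 ;;
    esac
}

# diff_archive_listings MINE REFERENCE: diff the two verbose listings.
# Exits 0 only if both archives could be listed and the listings are identical.
diff_archive_listings() {
    local work status
    work=$(mktemp -d) || return 2
    if list_archive "$1" >"$work/mine" &&
       list_archive "$2" >"$work/reference" &&
       diff "$work/mine" "$work/reference"; then
        status=0
    else
        status=1
    fi
    rm -rf "$work"
    return "$status"
}
```

Identical listings do not prove identical contents, since file bytes can differ without changing size or timestamp; unpacking both archives and running diff -r remains the more thorough check.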

Sometimes the difference will be in your environment configuration, such as setting CC or CGO_ENABLED causing changes in the defaults baked into the toolchain. Those kinds of differences due to configuration are expected. If you find a difference that's not caused by Go configuration, please file an issue with subject cmd/distpack: reproducibility bug for GOOS/GOARCH (filling in GOOS and GOARCH) along with details of which files are different, and then mention the issue in a comment on this issue as well.

I have already tested repro.bash on darwin/amd64, darwin/arm64, linux/amd64, and windows/arm64 systems, and I've tested every possible Go release from Go 1.17 onward as bootstrap toolchain when building on darwin/amd64. Of course, there may well still be bugs in Go setups I have not thought to test, and if so we want to find them. If there are any remaining, the first person to identify each new reproducibility bug root cause wins a gopher.
