| 1 | +--- |
| 2 | +title: "How Go Mitigates Supply Chain Attacks" |
| 3 | +date: 2022-03-31 |
| 4 | +by: |
| 5 | +- Filippo Valsorda |
| 6 | +summary: Go tooling and design help mitigate supply chain attacks at various stages. |
| 7 | +--- |
| 8 | + |
| 9 | +Modern software engineering is collaborative, and based on reusing Open Source |
| 10 | +software. |
| 11 | +That exposes targets to supply chain attacks, where software projects are |
| 12 | +attacked by compromising their dependencies. |
| 13 | + |
| 14 | +Despite any process or technical measure, every dependency is unavoidably a |
| 15 | +trust relationship. |
| 16 | +However, the Go tooling and design help mitigate risk at various stages. |
| 17 | + |
| 18 | + |
| 19 | +## All builds are “locked” |
| 20 | + |
| 21 | +There is no way for changes in the outside world—such as a new version of a |
| 22 | +dependency being published—to automatically affect a Go build. |
| 23 | + |
| 24 | +Unlike most other package managers, Go modules don’t have a separate list |
| 25 | +of constraints and a lock file pinning specific versions. |
| 26 | +The version of every dependency contributing to any Go build is fully determined |
| 27 | +by the [`go.mod` file](https://go.dev/ref/mod#go-mod-file) of the main module. |
| 28 | + |
| 29 | +Since Go 1.16, this determinism is enforced by default, and build commands (`go |
| 30 | +build`, `go test`, `go install`, `go run`, …) [will fail if the go.mod is |
| 31 | +incomplete](https://go.dev/ref/mod#go-mod-file-updates). |
| 32 | +The only commands that will change the `go.mod` (and therefore the build) are |
| 33 | +`go get` and `go mod tidy`. |
| 34 | +These commands are not expected to be run automatically or in CI, so changes to |
| 35 | +dependency trees must be made deliberately and have the opportunity to go |
| 36 | +through code review. |
| 37 | + |
| 38 | +This is very important for security, because when a CI system or new machine |
| 39 | +runs `go build`, the checked-in source is the ultimate and complete source of |
| 40 | +truth for what will get built. |
| 41 | +There is no way for third parties to affect that. |
| 42 | + |
| 43 | +Moreover, when a dependency is added with `go get`, its transitive dependencies |
| 44 | +are added at the version specified in the dependency’s `go.mod` file, not at |
| 45 | +their latest versions, thanks to |
| 46 | +[Minimal version selection](https://go.dev/ref/mod#minimal-version-selection). |
| 47 | +The same happens for invocations of |
| 48 | +`go install example.com/cmd/devtoolx@latest`, [the equivalents of which in some |
| 49 | +ecosystems bypass pinning](https://research.swtch.com/npm-colors). |
| 50 | +In Go, the latest version of `example.com/cmd/devtoolx` will be fetched, but |
| 51 | +then all the dependencies will be set by its `go.mod` file. |
| 52 | + |
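The selection rule described above can be sketched in a few lines of Go. This is a simplified illustration, not the `go` command's implementation: the `Version`, `Module`, `Requirements`, and `BuildList` names are hypothetical, versions are plain integers rather than semver strings, and features such as `replace`, `exclude`, and graph pruning are ignored.

```go
package main

import (
	"fmt"
	"sort"
)

// Version is a simplified stand-in for a semantic version (an assumption made
// for this sketch): a bigger number means a newer release.
type Version int

// Module identifies one specific version of a module.
type Module struct {
	Path    string
	Version Version
}

// Requirements maps each module version to the minimum versions listed in its
// own go.mod file.
type Requirements map[Module][]Module

// BuildList walks the requirement graph from the main module and, for every
// module path it reaches, keeps the highest of the requested minimum versions.
// Versions that nothing requires are never considered, however new they are.
func BuildList(main Module, reqs Requirements) map[string]Version {
	selected := map[string]Version{}
	visited := map[Module]bool{}
	var visit func(m Module)
	visit = func(m Module) {
		if visited[m] {
			return
		}
		visited[m] = true
		if v, ok := selected[m.Path]; !ok || m.Version > v {
			selected[m.Path] = m.Version
		}
		for _, dep := range reqs[m] {
			visit(dep)
		}
	}
	visit(main)
	return selected
}

func main() {
	app := Module{"example.com/app", 1}
	reqs := Requirements{
		app:                        {{"example.com/modulex", 2}, {"example.com/moduley", 1}},
		{"example.com/modulex", 2}: {{"example.com/modulez", 3}},
		{"example.com/moduley", 1}: {{"example.com/modulez", 4}},
		// Version 9 of modulez has been published, but nothing requires it.
		{"example.com/modulez", 9}: {},
	}

	list := BuildList(app, reqs)
	paths := make([]string, 0, len(list))
	for p := range list {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s v%d\n", p, list[p])
	}
}
```

In this sketch `example.com/modulez` resolves to version 4, the highest version that some `go.mod` in the graph asks for, and the newly published version 9 plays no part in the result.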
| 53 | +If a module gets compromised and a new malicious version is published, no one |
| 54 | +will be affected until they explicitly update that dependency, providing the |
| 55 | +opportunity to review the changes and time for the ecosystem to detect the |
| 56 | +event. |
| 57 | + |
| 58 | + |
| 59 | +## Version contents never change |
| 60 | + |
| 61 | +Another key property necessary to ensure third parties can’t affect builds is |
| 62 | +that the contents of a module version are immutable. |
| 63 | +If an attacker that compromises a dependency could re-upload an existing |
| 64 | +version, they could automatically compromise all projects that depend on it. |
| 65 | + |
| 66 | +That’s what the [`go.sum` file](https://go.dev/ref/mod#go-sum-files) is for. |
| 67 | +It contains a list of cryptographic hashes of each dependency that contributes |
| 68 | +to the build. |
| 69 | +Again, an incomplete `go.sum` causes an error, and only `go get` and |
| 70 | +`go mod tidy` will modify it, so any changes to it |
| 71 | +will accompany a deliberate dependency change. |
| 72 | +Other builds are guaranteed to have a full set of checksums. |
| 73 | + |
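To make the idea of a per-dependency hash concrete, here is a minimal sketch of a content hash over a module's files. It is loosely modeled on the `h1:` entries found in `go.sum` (a SHA-256 over a sorted list of per-file SHA-256 digests), but the `HashFiles` function and the file names are assumptions made for illustration, and it should not be read as the `go` command's actual code.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// HashFiles computes a single digest over a set of files keyed by name.
// Sorting the names first makes the result independent of map iteration order.
func HashFiles(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	// Hash a summary made of one "digest  name" line per file.
	summary := sha256.New()
	for _, name := range names {
		fmt.Fprintf(summary, "%x  %s\n", sha256.Sum256(files[name]), name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	// Hypothetical file tree for one module version.
	original := map[string][]byte{
		"example.com/modulex@v1.9.2/go.mod":     []byte("module example.com/modulex\n"),
		"example.com/modulex@v1.9.2/modulex.go": []byte("package modulex\n"),
	}
	tampered := map[string][]byte{
		"example.com/modulex@v1.9.2/go.mod":     []byte("module example.com/modulex\n"),
		"example.com/modulex@v1.9.2/modulex.go": []byte("package modulex\n\nfunc init() {}\n"),
	}

	fmt.Println("original:", HashFiles(original))
	fmt.Println("tampered:", HashFiles(tampered))
}
```

Any change to any file, however small, yields a different digest, which is what lets a recorded checksum detect a re-uploaded version.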
| 74 | +This is a common feature of most lock files. |
| 75 | +Go goes beyond it with the |
| 76 | +[Checksum Database](https://go.dev/ref/mod#checksum-database) (sumdb for short), |
| 77 | +a global append-only cryptographically-verifiable list of go.sum entries. |
| 78 | +When `go get` needs to add an entry to the `go.sum` file, it fetches it from the |
| 79 | +sumdb along with cryptographic proof of the sumdb integrity. |
| 80 | +This ensures not only that every build of a given module uses the same |
| 81 | +dependency contents, but also that every module out there uses the same |
| 82 | +dependency contents! |
| 83 | + |
| 84 | +The sumdb makes it impossible for compromised dependencies or even |
| 85 | +Google-operated Go infrastructure to target specific dependents with modified |
| 86 | +(e.g. backdoored) source. |
| 87 | +You’re guaranteed to be using the exact same code that everyone else who’s using |
| 88 | +e.g. v1.9.2 of `example.com/modulex` is using and has reviewed. |
| 89 | + |
| 90 | +Finally, my favorite features of the sumdb: it doesn’t require any key |
| 91 | +management on the part of module authors, and it works seamlessly with the |
| 92 | +decentralized nature of Go modules. |
| 93 | + |
| 94 | + |
| 95 | +## The VCS is the source of truth |
| 96 | + |
| 97 | +Most projects are developed through some version control system (VCS) and then, |
| 98 | +in other ecosystems, uploaded to the package repository. |
| 99 | +This means there are two accounts that could be compromised, the VCS host and |
| 100 | +the package repository, the latter of which is used more rarely and is more |
| 101 | +likely to be overlooked. |
| 102 | +It also means it’s easier to hide malicious code in the version uploaded to the |
| 103 | +repository, especially if the source is routinely modified as part of the |
| 104 | +upload, for example to minimize it. |
| 105 | + |
| 106 | +In Go, there is no such thing as a package repository account. |
| 107 | +The import path of a package embeds the information that `go mod download` |
| 108 | +[needs in order to fetch its |
| 109 | +module](https://pkg.go.dev/cmd/go#hdr-Remote_import_paths) directly from the |
| 110 | +VCS, where tags define versions. |
| 111 | + |
| 112 | +We do have the [Go Module Mirror](https://go.dev/blog/module-mirror-launch), but |
| 113 | +that’s only a proxy. |
| 114 | +Module authors don’t register an account and don’t upload versions to the proxy. |
| 115 | +The proxy uses the same logic that the `go` tool uses (in fact, the proxy runs |
| 116 | +`go mod download`) to fetch and cache a version. |
| 117 | +Since the Checksum Database guarantees that there can be only one source tree |
| 118 | +for a given module version, everyone using the proxy will see the same result as |
| 119 | +everyone bypassing it and fetching directly from the VCS. |
| 120 | +(If the version is not available anymore in the VCS or if its contents changed, |
| 121 | +fetching directly will lead to an error, while fetching from the proxy might |
| 122 | +still work, improving availability and protecting the ecosystem from [“left-pad” |
| 123 | +issues](https://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm).) |
| 124 | + |
| 125 | +Running VCS tools on the client exposes a pretty large attack surface. |
| 126 | +That’s another place the Go Module Mirror helps: the `go` tool on the proxy runs |
| 127 | +inside a robust sandbox and is configured to support every VCS tool, while |
| 128 | +[the default is to only support the two major version control |
| 129 | +systems](https://go.dev/ref/mod#vcs-govcs) (Git and Mercurial). |
| 130 | +Anyone using the proxy can still fetch code published using off-by-default VCS |
| 131 | +tools, but attackers can’t reach the code of those tools in most installations. |
| 132 | + |
| 133 | + |
| 134 | +## Building code doesn’t execute it |
| 135 | + |
| 136 | +It is an explicit security design goal of the Go toolchain that neither fetching |
| 137 | +nor building code will let that code execute, even if it is untrusted and |
| 138 | +malicious. |
| 139 | +This is different from most other ecosystems, many of which have first-class |
| 140 | +support for running code at package fetch time. |
| 141 | +These “post-install” hooks have been used in the past as the most convenient way |
| 142 | +to turn a compromised dependency into compromised developer machines, and to |
| 143 | +[worm](https://en.wikipedia.org/wiki/Computer_worm) through module authors. |
| 144 | + |
| 145 | +To be fair, if you’re fetching some code it’s often to execute it shortly |
| 146 | +afterwards, either as part of tests on a developer machine or as part of a |
| 147 | +binary in production, so lacking post-install hooks is only going to slow down |
| 148 | +attackers. |
| 149 | +(There is no security boundary within a build: any package that contributes to a |
| 150 | +build can define an `init` function.) |
| 151 | +However, it can be a meaningful risk mitigation, since you might be executing a |
| 152 | +binary or testing a package that only uses a subset of the module’s |
| 153 | +dependencies. |
| 154 | +For example, if you build and execute `example.com/cmd/devtoolx` on macOS there |
| 155 | +is no way for a Windows-only dependency or a dependency of |
| 156 | +`example.com/cmd/othertool` to compromise your machine. |
| 157 | + |
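The point about `init` functions can be seen in a minimal sketch. The `loaded` variable is a hypothetical stand-in for whatever side effect a package might perform; the code runs when the built binary starts, not when it is fetched or compiled.

```go
package main

import "fmt"

// loaded records that init ran. In a real dependency, init could perform any
// side effect the process is allowed to perform.
var loaded bool

// init runs automatically when the program starts, before main, without
// anything in the program calling it explicitly.
func init() {
	loaded = true
}

func main() {
	fmt.Println("init ran before main:", loaded)
}
```

Every package linked into a binary gets this opportunity, which is why only the packages that actually contribute to a build matter for its security.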
| 158 | +In Go, modules that don’t contribute code to a specific build have no security |
| 159 | +impact on it. |
| 160 | + |
| 161 | + |
| 162 | +## “A little copying is better than a little dependency” |
| 163 | + |
| 164 | +The final and maybe most important software supply chain risk mitigation in the |
| 165 | +Go ecosystem is the least technical one: Go has a culture of rejecting large |
| 166 | +dependency trees, and of preferring a bit of copying to adding a new dependency. |
| 167 | +It goes all the way back to one of the Go proverbs: [“a little copying is better |
| 168 | +than a little dependency”](https://youtube.com/clip/UgkxWCEmMJFW0-TvSMzcMEAHZcpt2FsVXP65). |
| 169 | +The label “zero dependencies” is proudly worn by high-quality reusable Go |
| 170 | +modules. |
| 171 | +If you find yourself in need of a library, you’re likely to find it will not |
| 172 | +cause you to take on a dependency on dozens of other modules by other authors |
| 173 | +and owners. |
| 174 | + |
| 175 | +That’s enabled also by the rich standard library and additional modules (the |
| 176 | +`golang.org/x/...` ones), which provide commonly used high-level building blocks |
| 177 | +such as an HTTP stack, a TLS library, JSON encoding, etc. |
| 178 | + |
| 179 | +All together this means it’s possible to build rich, complex applications with |
| 180 | +just a handful of dependencies. |
| 181 | +No matter how good the tooling is, it can’t eliminate the risk involved in |
| 182 | +reusing code, so the strongest mitigation will always be a small dependency |
| 183 | +tree. |