Skip to content

storage: change Artifact checksum to SHA256 #423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 10, 2021

Conversation

hiddeco
Copy link
Member

@hiddeco hiddeco commented Aug 9, 2021

Addresses part of fluxcd/flux2#1707. Other usages of SHA1 (e.g. within the BucketReconciler) have been left untouched as they will be already addressed in e.g. #412.

⚠️ target branch of the PR is reconcilers-dev.

@johngmyers
Copy link

This is an incompatible change: anything depending on the existing field's containing a SHA-1 would be broken.

Adherence to apimachinery rules would require a new field and maintaining a SHA-1 in Checksum as long as the v1beta1 API is still supported.

@hiddeco
Copy link
Member Author

hiddeco commented Aug 10, 2021

This is an incompatible change: anything depending on the existing field's containing a SHA-1 would be broken.

The change is made to a development branch as clearly documented in the PR description.

Adherence to apimachinery rules would require a new field and maintaining a SHA-1 in Checksum as long as the v1beta1 API is still supported.

This branch will eventually replace the v1beta1 API, as there have been more changes in this branch related to the API.

@stefanprodan
Copy link
Member

stefanprodan commented Aug 10, 2021

The checksum field is not used by any of the GitOps Toolkit controllers nor is it user-facing. I see no issues with changing the SHA algorithm in v1beta1.

This changes the format of the Artifact checksum from SHA1 to SHA256 to
mitigate chosen-prefix and length extension attacks, and ensures it can
be used to secure content against malicious modifications.

Source consumers (including our own {kustomize,helm}-controllers)
should ensure the SHA256 of a downloaded artifact matches the
advertised checksum before making use of it.

Signed-off-by: Hidde Beydals <[email protected]>
@hiddeco hiddeco force-pushed the storage-sha256-checksum branch from 548910c to e79b573 Compare August 10, 2021 08:30
@hiddeco hiddeco requested a review from stefanprodan August 10, 2021 08:42
Copy link
Member

@stefanprodan stefanprodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks @hiddeco

PS. We should add a helper function to fluxcd/pkg/runtime that downloads artifacts, checks the sha then extracts the files in tmp. This function should replace: https://github.com/fluxcd/kustomize-controller/blob/main/controllers/kustomization_controller.go#L429

@hiddeco hiddeco merged commit 5d43bcc into reconcilers-dev Aug 10, 2021
@hiddeco hiddeco deleted the storage-sha256-checksum branch August 10, 2021 08:45
@hiddeco
Copy link
Member Author

hiddeco commented Aug 10, 2021

PS. We should add a helper function to fluxcd/pkg/runtime that downloads artifacts, checks the sha then extracts the files in tmp. This function should replace: https://github.com/fluxcd/kustomize-controller/blob/main/controllers/kustomization_controller.go#L429

I think this would introduce a challenging dependency chain (runtime -> source API -> source-controller -> runtime). Instead, either the Artifact should be made generic and decoupled from the source API, or the helper utility should be introduced somewhere within this code base.

@hiddeco
Copy link
Member Author

hiddeco commented Aug 10, 2021

From short discussion in private: idea is to introduce a <Something>Download(url, checksum string) (string, error) later on. Which would prevent coupling to the API, and not assume having to untar.

@johngmyers
Copy link

I'm having trouble reconciling the statements here about changing/replacing the v1beta1 API against the statement in the August 2021 update:

The GOTK Custom Resource Definitions which are at v1beta1 and v2beta1 and their controllers are considered stable and production ready. Going forward, breaking changes to the beta CRDs will be accompanied by a conversion mechanism.

On a separate topic, I've had problems with source-controller OOMing because it downloads multi-GB files (such as our Helm repository index) into memory, causing it to exceed the pod's memory limit. I was starting to look into fixing it to stream directly to disk, but it appears there's a lot of change going in that area right now.

@hiddeco
Copy link
Member Author

hiddeco commented Aug 11, 2021

I'm having trouble reconciling the statements here about changing/replacing the v1beta1 [...]

It will be changed to v1beta2 (or something else) later on, with the current changes, the ones that break contract are all related to machine parts of the API, and not the declarative configuration.

On a separate topic, I've had problems with source-controller OOMing because it downloads multi-GB files (such as our Helm repository index) into memory, causing it to exceed the pod's memory limit.

Yes, this is a known issue, but challenging at the same time as you will run into this exact same issue when e.g. a chart revision is looked up in the index. So just preventing it from being loaded into memory when it is downloaded does not solve the whole issue.

Contributions to improve the situation are however always welcome, and you have two options:

  1. You contribute to main, and I'll happily backport your fixes to the reconcilers-dev branch.
  2. You contribute on top of the helmrepository-reconciler branch from Rewrite HelmRepositoryReconciler to new standards #413, which contains all refactored logic and is not expected to change much more for now (besides additional tests and a double check on emitted Events).

@johngmyers
Copy link

Okay, but strict adherence to apimachinery rules would require storing the SHA1 in an annotation of the v1beta2 resource so that a v1beta1 resource could round-trip. The SHA256 could either go into an annotation or new field in the v1beta1 resource.

If the project wants to declare parts of the API as outside of the stability guarantee, that's their decision to make. I don't have a use for the SHA1.

For me, verifying a secure hash is much higher priority than saving memory. That would seem to depend on the helmrepository-reconciler branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants