[helm-oci] ECR auth expires #787

@nalbury

While setting up ECR as an OCI chart repo, we followed the recommended pattern here to configure a Kubernetes secret with the registry credentials for the OCI repo. However, the source-controller only seems to fetch this secret once, on boot. As a result, once the ECR token expires, the source-controller has to be restarted before authentication works again and the repo/charts can be reconciled.

Example of the state after the token expires:

  • I can log in to ECR via the helm CLI with the data in the Kube secret
[root@jumpbox ~]# kubectl get secret -n flux-system ecr-auth -o json |jq '.data.".dockerconfigjson"' -r |base64 --decode |jq '.auths."<redacted>.dkr.ecr.us-west-2.amazonaws.com/helm".password' -r |helm registry login -u AWS --password-stdin <redacted>.dkr.ecr.us-west-2.amazonaws.com/helm
Login Succeeded
[root@jumpbox ~]# helm show chart oci://<redacted>.dkr.ecr.us-west-2.amazonaws.com/helm/my-chart |grep apiVersion
apiVersion: v2
  • But if I look at the status of an ECR-hosted Helm chart, there's a chart pull error saying the token has expired
[root@jumpbox ~]# flux get source chart my-chart
NAME            REVISION	SUSPENDED	READY	MESSAGE
my-chart	0.2.0   	False    	False	chart pull error: chart pull error: failed to get chart version for remote reference: GET "https://<redacted>.dkr.ecr.us-west-2.amazonaws.com/v2/helm/my-chart/tags/list": unexpected status code 403: denied: Your authorization token has expired. Reauthenticate and try again.
  • If I restart the source-controller (delete the pod), the secret is seemingly reloaded on boot and the chart can reconcile again, until the newly loaded token expires
[root@jumpbox ~]# kubectl delete pod -n flux-system source-controller-644c69fbf7-vpczd
pod "source-controller-644c69fbf7-vpczd" deleted
[root@jumpbox ~]# flux get source chart my-chart
NAME            REVISION	SUSPENDED	READY	MESSAGE
my-chart	0.2.0   	False    	True 	pulled 'my-chart' chart with version '0.2.0'
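
As a stopgap, the manual check-and-restart above could be scripted roughly as follows. This is only a sketch of the workaround, not a fix for the underlying behavior. The function name `restart_if_token_expired` is hypothetical, the deployment name `source-controller` is inferred from the pod name shown above, and the match string is taken from the error message in the `flux get source chart` output.

```shell
# Sketch: restart source-controller only when a chart reports an expired
# ECR token. Assumes flux and kubectl are on PATH and already point at
# the right cluster.
#   $1: HelmChart source name (e.g. my-chart)
#   $2: namespace (defaults to flux-system, as used in this issue)
restart_if_token_expired() {
  ecr_chart="$1"
  ecr_namespace="${2:-flux-system}"

  # Capture the status line; ignore the exit code so that a failing
  # status query does not abort the calling script.
  ecr_status="$(flux get source chart "$ecr_chart" --namespace "$ecr_namespace" 2>&1 || true)"

  case "$ecr_status" in
    *"authorization token has expired"*)
      # Assumption: the controller runs as deployment/source-controller.
      kubectl rollout restart deployment/source-controller --namespace "$ecr_namespace"
      ;;
    *)
      echo "no expired-token error reported for $ecr_chart"
      ;;
  esac
}
```

Calling `restart_if_token_expired my-chart` from a cron job or similar would approximate the manual steps shown above. It has the same effect as deleting the pod by hand: the secret is re-read on boot, and the chart reconciles until the new token expires.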

I know the recommended pattern linked above comes from the documentation for the image automation controllers, so I'm wondering whether the source-controller is supposed to operate in the same way. It was mentioned here that some caching may be at play.

Labels: area/oci (OCI related issues and pull requests)