🌱 Add e2e clusterctl upgrade tests #1371
✅ Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!
Hi @lentzi90. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
NOTE: The e2e clusterctl test will not pass with the WIP commit (I couldn't get it to import the container image properly, so it gets stuck in ImagePullBackOff). I have, however, verified that it works when using the main tag instead of e2e.
/ok-to-test
Force-pushed from 63b7795 to 38c8038.
/retest
Force-pushed from 38c8038 to e820c94.
/retest
Force-pushed from e820c94 to 619d372.
/test pull-cluster-api-provider-openstack-e2e-test
Force-pushed from 619d372 to f620925.
/test pull-cluster-api-provider-openstack-e2e-test
Force-pushed from f620925 to e9a5e2e.
/test pull-cluster-api-provider-openstack-e2e-test
/test pull-cluster-api-provider-openstack-e2e-test
Force-pushed from 190276a to f49cee4.
These tests make use of the CAPI e2e framework. The test spec creates a secondary management cluster with older versions of the controllers. A workload cluster is created to test the functionality of the old controllers before they are upgraded. Then clusterctl upgrade is used to upgrade them, and the workload cluster is scaled to check that everything still works after the upgrade.
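The flow described in this commit message can be approximated by hand with the clusterctl and kubectl CLIs. The following is only a rough sketch under stated assumptions, not what the CAPI e2e framework actually executes: the `run` and `upgrade_flow` helpers, the `demo` cluster name, the `demo.yaml` manifest and the `demo-md-0` MachineDeployment name are all hypothetical, and by default the commands are only printed.

```shell
# Print the command (DRY_RUN=true, the default) or execute it.
run() {
    if [ "${DRY_RUN:-true}" = "true" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper mirroring the four steps of the upgrade test.
upgrade_flow() {
    flow_old_version="$1"   # provider version to start from, e.g. v0.6.3
    flow_cluster="$2"       # hypothetical workload cluster name

    # 1. Install the old provider version on the secondary management cluster.
    run clusterctl init --infrastructure "openstack:${flow_old_version}"
    # 2. Create a workload cluster with the old controllers
    #    (the manifest file name is an assumption).
    run kubectl apply -f "${flow_cluster}.yaml"
    # 3. Upgrade the providers to the latest version for the contract.
    run clusterctl upgrade apply --contract v1beta1
    # 4. Scale the workload cluster to check that reconciliation still works.
    run kubectl scale machinedeployment "${flow_cluster}-md-0" --replicas=2
}

upgrade_flow v0.6.3 demo
```

Setting `DRY_RUN=false` would run the commands against whatever cluster the current kubeconfig points to, so the dry-run default is the safer way to read through the sequence.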
Force-pushed from f49cee4 to 0a67fbf.
/test pull-cluster-api-provider-openstack-e2e-test
source "${REPO_ROOT}/hack/ci/${RESOURCE_TYPE}.sh"
CONTAINER_ARCHIVE="${ARTIFACTS}/capo-e2e-image.tar"
SSH_KEY="$(get_ssh_private_key_file)"
SSH_ARGS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o PasswordAuthentication=no"
CONTROLLER_IP=${CONTROLLER_IP:-"10.0.3.15"}

make e2e-image
docker save -o "${CONTAINER_ARCHIVE}" gcr.io/k8s-staging-capi-openstack/capi-openstack-controller:e2e
scp -i "${SSH_KEY}" ${SSH_ARGS} "${CONTAINER_ARCHIVE}" "cloud@${CONTROLLER_IP}:capo-e2e-image.tar"
ssh -i "${SSH_KEY}" ${SSH_ARGS} "cloud@${CONTROLLER_IP}" -- sudo chown root:root capo-e2e-image.tar
ssh -i "${SSH_KEY}" ${SSH_ARGS} "cloud@${CONTROLLER_IP}" -- sudo chmod u=rw,g=r,o=r capo-e2e-image.tar
ssh -i "${SSH_KEY}" ${SSH_ARGS} "cloud@${CONTROLLER_IP}" -- sudo mv capo-e2e-image.tar /var/www/html/capo-e2e-image.tar
Any suggestions on how to improve this?
The point of this is to build the e2e image, save it as an archive and upload it to the controller, where it can be fetched for the secondary management cluster. (In the KinD bootstrap cluster, we simply inject the image directly, but we cannot do this for the secondary management cluster that is used for the upgrade test.)
The main issue I have with it is that we build the e2e image twice: once here and once when running make test-e2e below. For the upgrade test it may be okay, but it is completely unnecessary for all the other tests.
/assign
mdbooth left a comment
Some questions quite possibly just for my own understanding.
Can we sort out the commit message of that second commit? I don't want to merge something with WIP in the title!
.PHONY: e2e-templates
e2e-templates: ## Generate cluster templates for e2e tests
e2e-templates: $(addprefix $(E2E_TEMPLATES_DIR)/, \
    cluster-template-v1alpha5.yaml \
I'm going to use this to write v1alpha5 simulator tests!
# This is only for clusterctl upgrade tests
- name: v0.6.3
  value: "https://github.com/kubernetes-sigs/cluster-api-provider-openstack/releases/download/v0.6.3/infrastructure-components.yaml"
  type: url
  contract: v1beta1
  files:
  - sourcePath: "../data/shared/v1beta1_provider/metadata.yaml"
  - sourcePath: "./infrastructure-openstack/cluster-template.yaml"
  replacements:
  - old: "imagePullPolicy: Always"
    new: "imagePullPolicy: IfNotPresent"
  - old: "--v=2"
    new: "--v=4"
  - old: "--leader-elect"
    new: "--leader-elect=false\n - --sync-period=1m"
In all honesty this remains magic to me. I'll work it out one day when it breaks.
@tobiasgiese @jichenjc @seanschneeweiss Are any of you able to give this stanza meaningful review?
Anything in particular that you are wondering about? My understanding is that this is used to create a repository that clusterctl is then configured to use as the source of truth, instead of reaching out to GitHub to check what releases we have. This is mostly useful for not-yet-released versions, but since the same config is also used for upgrade tests, we need to explicitly list the versions that should be "available" to the test.
I didn't think much about the additional config here (replacements and files) but I think they make sense and follow what we already have as well as what CAPI is doing.
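To make the `replacements` entries a little less magic: each `old`/`new` pair appears to rewrite the provider manifest before it is placed in the local repository. The sketch below illustrates that idea only. It assumes `old` is matched as a literal string (a simplification, since the framework may treat it as a pattern), and `replace_literal` and the sample manifest are hypothetical.

```shell
# Hypothetical helper: replace every literal occurrence of $1 with $2 on stdin.
# Note: awk -v interprets backslash escapes in the values it is given.
replace_literal() {
    awk -v old="$1" -v new="$2" '{
        if (old == "") { print; next }   # nothing to search for
        out = ""
        rest = $0
        # Walk the line, copying text before each match and appending "new".
        while ((i = index(rest, old)) > 0) {
            out = out substr(rest, 1, i - 1) new
            rest = substr(rest, i + length(old))
        }
        print out rest
    }'
}

# Made-up manifest fragment, for illustration only.
manifest='imagePullPolicy: Always
args:
- --v=2'

# Apply two of the replacements from the stanza above, one after the other.
printf '%s\n' "${manifest}" \
    | replace_literal "imagePullPolicy: Always" "imagePullPolicy: IfNotPresent" \
    | replace_literal "--v=2" "--v=4"
```

Chaining the calls mirrors the list form of the config: every entry is applied to the output of the previous one.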
OPENSTACK_VOLUME_TYPE_ALT: "test-volume-type"
CONFORMANCE_WORKER_MACHINE_COUNT: "5"
CONFORMANCE_CONTROL_PLANE_MACHINE_COUNT: "1"
INIT_WITH_KUBERNETES_VERSION: "v1.25.0"
Is this used only for the management cluster? We still have the issue that our CI installation image is stuck on 1.18.
This is for the secondary management cluster. It will be used when rendering the cluster-template here.
I also noticed that the installation image is stuck on 1.18, but we use the CI artifacts script to actually upgrade before kubeadm is run.
- op: add
  path: /spec/kubeadmConfigSpec/postKubeadmCommands
  value:
  - /usr/local/bin/ci-artifacts-openstack.sh
Where does this run? Is it on the test runner or on the target? I assume the latter.
It runs on the target, yes. I'm mimicking what we already use to get the "CI artifacts".
For that we call GenerateCIArtifactsInjectedTemplateForDebian, which injects a script that runs as part of the preKubeadmCommands.
At first I actually tried adding the new script to the ci-artifacts-platform-kustomization.yaml, but that turned out to be harder than expected since it is way too easy to overwrite each other's files in those patches. 😬 So I ended up putting it here together with the common patches.
var _ = Describe("When testing clusterctl upgrades (v0.6=>current) [clusterctl-upgrade]", func() {
	ctx := context.TODO()
	shared.SetEnvVar("USE_CI_ARTIFACTS", "true", false)
	shared.SetEnvVar("DOWNLOAD_E2E_IMAGE", "true", false)
Isn't this setting an environment variable on the test runner? If so, how does that affect /usr/local/bin/ci-artifacts-openstack.sh which I'm assuming (possibly incorrectly) is executing on the target?
Indeed! It is used when rendering the cluster-template. Both these variables are used in the cluster-template that we use in the tests. You can see the rendered result in _artifacts/templates. Here is the one from the previous (successful) e2e test run on this PR: https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api-provider-openstack/1371/pull-cluster-api-provider-openstack-e2e-test/1594601712893562880/artifacts/templates/cluster-template-ci-artifacts.yaml (and here is the complete _artifacts folder from that run for reference).
I'm going to have to look at the template renderer. I didn't think it would support bash syntax like ${DOWNLOAD_E2E_IMAGE:=false}.
If that question itself reveals my fundamental misunderstanding, please can you enlighten me?
The substitution language is linked from the clusterctl documentation here: https://cluster-api.sigs.k8s.io/clusterctl/commands/generate-yaml.html
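Since the linked page describes a bash-like substitution syntax, an expression such as `${DOWNLOAD_E2E_IMAGE:=false}` can be read with ordinary shell semantics. The following is a small sketch in plain shell, not the clusterctl renderer itself, and `render_line` is a hypothetical helper. It shows how the default interacts with a value set by the test runner.

```shell
# Hypothetical helper: print one "rendered" line of a template.
# ${VAR:=default} expands to the default when VAR is unset or empty in shell.
render_line() {
    printf 'download_e2e_image=%s\n' "${DOWNLOAD_E2E_IMAGE:=false}"
}

# Subshells keep the := assignment from leaking between the two cases.

# Case 1: the variable is not set on the test runner, so the default applies.
( unset DOWNLOAD_E2E_IMAGE; render_line )

# Case 2: the test sets the variable (as the upgrade test does with "true").
( DOWNLOAD_E2E_IMAGE=true; render_line )
```

The first call prints `download_e2e_image=false` and the second prints `download_e2e_image=true`, which matches the idea that a variable set on the test runner ends up in the rendered template that the target later consumes.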
Force-pushed from 9c8b60c to 40f1a33.
I expect the e2e tests to fail now on this PR until #1390 and kubernetes/test-infra#28082 are merged. This is because of resource constraints: we cannot run the clusterctl-upgrade test in parallel with other tests.
Done ✔️
/retest
/test pull-cluster-api-provider-openstack-e2e-full-test
/hold cancel
/lgtm
I'd like to wait until I get my head round the /hold
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lentzi90, mdbooth The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
This adds e2e clusterctl upgrade tests from the CAPI e2e framework. Having a clusterctl upgrade test is very important for verifying upgrades between API versions.
The CAPI clusterctl upgrade test spec is a bit special since it creates not just a workload cluster but also a secondary management cluster. The flow goes like this:
The reason for the secondary management cluster is to avoid any issues with parallel jobs requiring different versions, so the bootstrap cluster can be shared as normal.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Part of #1363 but I would want to add some more tests before closing it.
Special notes for your reviewer:
Note that this is only needed if we want to test the image built by CI (for example on PRs). If it is ok to just take the main tagged image, then we can drop this.
TODOs:
/hold