@sbueringer
Member

Like the previous PR, this one integrates code from CAPA. It generates certificates and stores them in the Cluster CRD. With this we can set the same certificates on every control plane node (multi-node support will be added in a later PR). We can also generate a kubeconfig whenever we want, without SSHing into a control plane node.

What this PR does / why we need it:
Another step towards multi-node control plane support. Without this we would have to copy certificates between control plane nodes

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Implements part of #382

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 11, 2019
@k8s-ci-robot k8s-ci-robot requested review from flaper87 and krousey July 11, 2019 17:57
@k8s-ci-robot
Contributor

Hi @sbueringer. Thanks for your PR.

I'm waiting for a kubernetes-sigs or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 11, 2019

// ReconcileCertificates generate certificates if none exists.
func (s *Service) ReconcileCertificates(clusterName string, clusterProviderSpec *v1alpha1.OpenstackClusterProviderSpec) error {
if !clusterProviderSpec.CAKeyPair.HasCertAndKey() {
Member Author

@jichenjc Your comment from the PR on the other repository:

From the code, it looks like you are trying to generate those keys? If they are generated, how do you store and reuse them? It looks to me like the Spec is the desired state, but if you didn't provide a key at the beginning, it's not the desired state? Just curious.

Member Author

@sbueringer Jul 11, 2019

@jichenjc The logic is the following:

If no certificates are set in the Cluster CRD, ReconcileCertificates will generate them and add them to the Cluster CRD. But it's also possible (if wanted) to set the certificates when creating the Cluster CRD.

The main reasoning behind this is that we need a place to store certificates so we can give the same certificates to multiple control plane nodes, without copying them from one host to another. Also it's now possible to generate a kubeconfig based on the Cluster CRD alone.

I guess the Cluster CRD might not be the best place for the certs going forward, but it's straightforward. For now it's the same as in CAPA, and we can always move them in the future.
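The generate-if-missing logic described in this comment can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the PR's actual implementation: the `KeyPair` type, the `reconcileCA` function, the RSA key size, and the certificate subject and lifetime are all assumptions made for the example. Only the `HasCertAndKey` method name comes from the code under review.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// KeyPair is a hypothetical stand-in for the CA material kept in the
// cluster provider spec; both fields hold PEM-encoded data.
type KeyPair struct {
	Cert []byte
	Key  []byte
}

// HasCertAndKey reports whether both halves of the pair are set.
func (kp *KeyPair) HasCertAndKey() bool {
	return len(kp.Cert) > 0 && len(kp.Key) > 0
}

// reconcileCA (hypothetical name) generates a self-signed CA only when none
// was provided, so certificates supplied by the user are never overwritten.
func reconcileCA(kp *KeyPair) error {
	if kp.HasCertAndKey() {
		return nil
	}
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return fmt.Errorf("generating CA key: %v", err)
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "kubernetes"}, // assumed subject
		NotBefore:             time.Now().Add(-5 * time.Minute),
		NotAfter:              time.Now().AddDate(10, 0, 0), // assumed lifetime
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	// Template and parent are the same certificate, which makes it self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return fmt.Errorf("self-signing CA: %v", err)
	}
	kp.Cert = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	kp.Key = pem.EncodeToMemory(&pem.Block{Type: "RSA PRIVATE KEY", Bytes: x509.MarshalPKCS1PrivateKey(key)})
	return nil
}

func main() {
	var ca KeyPair
	if err := reconcileCA(&ca); err != nil {
		panic(err)
	}
	first := string(ca.Cert)
	// A second reconcile is a no-op because the pair is already set.
	if err := reconcileCA(&ca); err != nil {
		panic(err)
	}
	fmt.Println("CA set:", ca.HasCertAndKey(), "unchanged:", first == string(ca.Cert))
}
```

Because the function returns early when a pair is already present, it is safe to call on every reconcile, and the generated material only has to be persisted once with the rest of the spec.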

Contributor

I understand the logic here :). My original question was that I didn't see any code that saves or stores the cert and key, so I assumed the kubeconfig would be the place to hold them.

Member Author

Got it :). That's not really intuitive but the whole spec & status are saved here:

But I see the defer statement is way too late. I moved it up in another PR, but I'll do it now in this PR.

Member Author

It should only be a matter of the cluster actuator reconciling the certs. As soon as that's done, the machine actuator can use them. As long as the certs are not in the Cluster CRD yet, the machine actuator has to wait (otherwise it will render empty certs in the userData).

@sbueringer sbueringer force-pushed the pr-implement-reconcile-certs branch from d0bb419 to 4e393d7 Compare July 11, 2019 18:04
@k8s-ci-robot
Contributor

@sbueringer: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer
Member Author

/assign @chrigl
/assign @jichenjc

@jichenjc
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 12, 2019
@jichenjc
Contributor

I'll try this on my env and let you know :)

@sbueringer
Member Author

I'll try this on my env and let you know :)

Thx :). I could only verify it with CoreOS in my environment

@sbueringer
Member Author

I'll rebase on top of master and also test it with clusterctl

This is heavily inspired by CAPA. We now reconcile and store certificates
in the Cluster CRD. Thus it's possible to distribute the same CAs over all
control plane nodes (as soon as multi-node control plane support is
implemented). We also don't have to ssh onto a control plane node to get
a valid kubeconfig. We can now just generate one from the CA.
… controllerClient

This enables us to run the controller outside the Workload Cluster, e.g. in a Management
Cluster.
Now it's possible to use a local userdata folder. This is mostly useful
for development to avoid updating the user data Secrets all the time.
@sbueringer sbueringer force-pushed the pr-implement-reconcile-certs branch from b875d2e to ee1e643 Compare July 13, 2019 08:31
@jichenjc
Contributor

Hi @sbueringer, I tried this PR. My understanding (from the comments above) is that the certificates can be auto-generated, but I got this error. I guess I need more time to figure out how to add the CA material...

E0715 01:57:11.448579 1 actuator.go:323] Machine error openstack-master-8w7cr: error creating Openstack instance: CA cert material in the ClusterProviderSpec is missing cert/key
W0715 01:57:11.448610 1 machine_controller.go:229] Failed to create machine "openstack-master-8w7cr": error creating Openstack instance: CA cert material in the ClusterProviderSpec is missing cert/key

@sbueringer
Member Author

Hi @sbueringer, I tried this PR. My understanding (from the comments above) is that the certificates can be auto-generated, but I got this error. I guess I need more time to figure out how to add the CA material...

E0715 01:57:11.448579 1 actuator.go:323] Machine error openstack-master-8w7cr: error creating Openstack instance: CA cert material in the ClusterProviderSpec is missing cert/key
W0715 01:57:11.448610 1 machine_controller.go:229] Failed to create machine "openstack-master-8w7cr": error creating Openstack instance: CA cert material in the ClusterProviderSpec is missing cert/key

Hi, that error is kind of expected, but only a few times: the machine actuator has to wait until the cluster actuator has reconciled the certificates. Once that is done, the machine should be created. I tried the whole clusterctl flow with CoreOS and it worked.
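The waiting behaviour described here can be pictured with a small sketch: machine creation returns an error while the CA is missing, and a later retry succeeds once the certificates have been stored. The `KeyPair` type, the `createMachine` function, and the retry loop are hypothetical simplifications for illustration; only the error text and the `HasCertAndKey` name are taken from this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyPair is a hypothetical stand-in for the CA material in the cluster spec.
type KeyPair struct {
	Cert []byte
	Key  []byte
}

// HasCertAndKey reports whether both halves of the pair are set.
func (kp KeyPair) HasCertAndKey() bool {
	return len(kp.Cert) > 0 && len(kp.Key) > 0
}

// createMachine (hypothetical) refuses to render user data until the CA is
// present, so a machine is never created with empty certificates.
func createMachine(ca KeyPair) (string, error) {
	if !ca.HasCertAndKey() {
		return "", errors.New("CA cert material in the ClusterProviderSpec is missing cert/key")
	}
	return fmt.Sprintf("user data with a %d-byte CA cert", len(ca.Cert)), nil
}

func main() {
	var ca KeyPair
	for attempt := 1; attempt <= 3; attempt++ {
		userData, err := createMachine(ca)
		if err != nil {
			fmt.Printf("attempt %d: %v (will retry)\n", attempt, err)
			// Simulate the cluster actuator storing the certificates
			// before the next attempt.
			ca = KeyPair{Cert: []byte("cert"), Key: []byte("key")}
			continue
		}
		fmt.Printf("attempt %d: created machine using %s\n", attempt, userData)
		break
	}
}
```

Returning an error rather than blocking leaves the retry to the caller, which fits the log above, where the machine controller reports the failed create and tries again later.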

@jichenjc
Contributor

Perfect, I will wait longer; I might have been too hasty here. Thanks for the detailed info~

@sbueringer
Member Author

Yup. But it shouldn't take too long. Maybe the problem is that something fails in the cluster actuator and that's why the defer is never triggered.
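Why the position of the defer matters can be shown with a small, self-contained sketch. The `cluster` type, `storeCluster`, and both reconcile functions are hypothetical stand-ins rather than the actuator's real code; the point is only that a `defer` registered after a failing step never runs.

```go
package main

import (
	"errors"
	"fmt"
)

// cluster is a hypothetical stand-in for the Cluster object being reconciled.
type cluster struct {
	caGenerated bool // set in memory during reconcile
	persisted   bool // set only when the deferred store runs
}

// storeCluster stands in for writing spec and status back to the API server.
func storeCluster(c *cluster) { c.persisted = true }

var errStep = errors.New("a later reconcile step failed")

// reconcileLateDefer registers the store after a step that can fail, so an
// early return skips it and the generated certificates are never saved.
func reconcileLateDefer(c *cluster, stepFails bool) error {
	c.caGenerated = true
	if stepFails {
		return errStep
	}
	defer storeCluster(c)
	return nil
}

// reconcileEarlyDefer registers the store first, so it runs on every return path.
func reconcileEarlyDefer(c *cluster, stepFails bool) error {
	defer storeCluster(c)
	c.caGenerated = true
	if stepFails {
		return errStep
	}
	return nil
}

func main() {
	late, early := &cluster{}, &cluster{}
	_ = reconcileLateDefer(late, true)
	_ = reconcileEarlyDefer(early, true)
	fmt.Println("late defer persisted:", late.persisted)   // false
	fmt.Println("early defer persisted:", early.persisted) // true
}
```

With the late defer, a failure anywhere after certificate generation leaves the Cluster CRD without certs, and the machine actuator keeps reporting the missing cert/key error.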

@sbueringer
Member Author

@jichenjc Commit regarding defer storeCluster is pushed

kind: "OpenstackProviderSpec"
flavor: m1.medium
image: <Image Name>
sshUserName: <SSH Username>
Contributor

I would keep those for debugging purposes... or at least show the user a way to debug issues.

And sshUsername and keyname below should be treated as a group... remove or keep them both.

Member Author

I'm not sure.

keyname is used to inject the given keypair, so it's absolutely necessary in order to jump on the host later; without it, that won't work. sshUserName would only document to the user, inside the Machine resource, what the name of their OS user is. I'm not sure whether they already know that, or whether it's strictly necessary to put it inside the Machine resource.

Member Author

@sbueringer Jul 16, 2019

But I also don't have very strong opinions on keeping or deleting them both. I think deleting both is not really an option because without keyname we don't inject the key which makes it possible in the first place to ssh onto the host.

I guess I would go with documenting how to ssh onto the host, because leaving the property inside the machine resource implies to me that it's required by the Cluster API controllers in some way, which it isn't.

@jichenjc
Contributor

I confirmed the Ubuntu test passes (with some other minor changes, but they are unrelated) :)

so lgtm if above comments addressed... thanks~

@jichenjc
Contributor

ok, I think a follow up document update will be needed.
I can do that update accordingly

@jichenjc
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 16, 2019
@jichenjc
Contributor

give other folks a chance to take a look

@sbueringer
Member Author

ok, I think a follow up document update will be needed.
I can do that update accordingly

That would be nice. Thx!

give other folks a chance to take a look

Yup of course. I pinged @chrigl already. I cannot approve it anyway :)

@jichenjc
Contributor

/approve

We do have a chance to fix minor issues later on. For now, since both of us tested it, I think we can go with it.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jichenjc, sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 17, 2019
@k8s-ci-robot k8s-ci-robot merged commit 5524321 into kubernetes-sigs:master Jul 17, 2019
parts := strings.Split(result, "STARTFILE")
if len(parts) != 2 {
return "", nil
server := fmt.Sprintf("https://%s:6443", ip)
Contributor

this might be the root cause for #399

@sbueringer sbueringer deleted the pr-implement-reconcile-certs branch July 27, 2019 12:48
pierreprinetti pushed a commit to shiftstack/cluster-api-provider-openstack that referenced this pull request Apr 22, 2024
* Reconcile certificates and store them in the cluster crd

This is heavily inspired by CAPA. We now reconcile and store certificates
in the Cluster CRD. Thus it's possible to distribute the same CAs over all
control plane nodes (as soon as multi-node control plane support is
implemented). We also don't have to ssh onto a control plane node to get
a valid kubeconfig. We can now just generate one from the CA.

* Generate a kubeconfig for the Workload cluster instead of reusing the controllerClient

This enables us to run the controller outside the Workload Cluster, e.g. in a Management
Cluster.

* Add options for local user data

Now it's possible to use a local userdata folder. This is mostly useful
for development to avoid updating the user data Secrets all the time.

* added validate certificates

* Removed sshUserName from templates, moved defer storeCluster up