
Member

@yastij yastij commented Aug 29, 2019

What this PR does / why we need it: This PR introduces a load balancing API for implementing in-tree load balancing providers. It ships AWS as the first provider.

Which issue(s) this PR fixes: Fixes #468

Special notes for your reviewer:

What is missing:

  • Updating the manifest generation to include the AWS credentials secret (Secret object plus volumeMount for the manager), the loadbalancerRef kustomization, and an AWSLoadBalancer resource.
  • Adding test coverage for this (e2e tests plus a mock AWS API to unit test the controller).

/assign @akutz
/cc @andrewsykim

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

Introduce a load balancing API. AWS is the first provider implemented for CAPV.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 29, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yastij
To complete the pull request process, please assign akutz
You can assign the PR to them by writing /assign @akutz in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 9, 2019
@yastij yastij force-pushed the vmc-v1a2 branch 3 times, most recently from 8910a60 to 7dbaea2 Compare September 9, 2019 20:17
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 9, 2019
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 9, 2019
@yastij yastij force-pushed the vmc-v1a2 branch 2 times, most recently from 450b125 to 85f60e9 Compare September 10, 2019 10:55
Member Author

yastij commented Sep 11, 2019

/retest

Contributor

akutz commented Sep 11, 2019

Hi @yastij,

I started my review yesterday, but didn't have time to complete it. One thing I noticed was that the types are in ./api/v1alpha2/cloud. That was just a suggestion about how to contain types in a specific directory -- it wasn't guidance to use that directory. I think the types should fall into a structure similar to something like this:

Example 1

├── api
│   └── v1alpha2
│       ├── cloud
│       └── load-balancer
│           └── vmc
│               └── aws

or

Example 2

├── api
│   ├── load-balancer
│   │   └── v1alpha1
│   │       └── vmc
│   │           └── aws
│   └── vsphere
│       └── v1alpha2
│           └── cloud

The second pattern would obviously require non-trivial refactoring, since it changes the location of the existing vSphere types, but it does adhere to the Kubebuilder recommended guidelines for multi-group projects.

Thoughts?

Contributor

@akutz akutz left a comment

Hi @yastij,

I love this PR! Thank you so, so, SO much for this work. I am truly sorry I reviewed this in the morning, when I'm at my most attentive :)

/hold

// LoadBalancerConfig describes the supported Load balancer providers
type LoadBalancerConfig struct {
	// AwsProvider specifies the information needed to run VMC on AWS
	AwsProvider *AwsProviderSpec `json:"awsProvider,omitempty"`

Contributor

Hi @yastij,

Any field that is omitempty should also have a godoc of // +optional according to the OpenAPI guidelines.
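
As a minimal sketch of the convention requested here, the field below pairs `omitempty` with a `// +optional` marker. The types are cut down from the hunk above. Placing `VpcID` on `AwsProviderSpec` is an assumption, and the `marshal` helper and the `vpc-123` value are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AwsProviderSpec is a cut-down stand-in for the type in this PR.
// Assumption: VpcID lives on this struct.
type AwsProviderSpec struct {
	// VpcID is the ID of the VPC used to create load balancers.
	VpcID string `json:"vpcID"`
}

// LoadBalancerConfig describes the supported load balancer providers.
type LoadBalancerConfig struct {
	// AwsProvider specifies the information needed to run VMC on AWS.
	// +optional
	AwsProvider *AwsProviderSpec `json:"awsProvider,omitempty"`
}

// marshal returns the JSON encoding of cfg as a string (illustrative helper).
func marshal(cfg LoadBalancerConfig) string {
	out, err := json.Marshal(cfg)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// A nil pointer with omitempty is left out of the output entirely.
	fmt.Println(marshal(LoadBalancerConfig{}))
	// A set pointer is serialized as a nested object.
	fmt.Println(marshal(LoadBalancerConfig{AwsProvider: &AwsProviderSpec{VpcID: "vpc-123"}}))
}
```

The marker documents for schema generators what the `omitempty` tag already implies at serialization time: the field may be absent.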

// VpcID is the id of the VPC used to create loadBalancers
VpcID string `json:"vpcID"`

// Subnets is the list of subnets where

Contributor

Hi @yastij,

The godoc for Subnets is incomplete. Also, maybe rename the field to SubnetIDs or SubnetARNs?

// LoadBalancerFinalizer allows to clean up the load balancer
// associated with VSphereCluster before removing it from the
// API server.
LoadBalancerFinalizer = "loadbalancer.infrastructure.cluster.x-k8s.io"

Contributor

Hi @yastij,

There could technically be multiple load balancers, right? Should the LB finalizer be specific to the type of LB used?

Member Author

I'm not sure. The finalizer is added to the cluster object, and the cluster object only supports one provider.
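
To make the two options in this exchange concrete, here is a rough sketch of a single generic finalizer next to a type-specific one. The `finalizerFor` naming scheme and the `addFinalizer`/`removeFinalizer` helpers are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// LoadBalancerFinalizer is the generic finalizer from the hunk above.
const LoadBalancerFinalizer = "loadbalancer.infrastructure.cluster.x-k8s.io"

// finalizerFor derives a finalizer specific to one load balancer kind.
// The "<kind>." prefix is a hypothetical naming scheme.
func finalizerFor(kind string) string {
	return strings.ToLower(kind) + "." + LoadBalancerFinalizer
}

// addFinalizer appends name unless it is already present.
func addFinalizer(finalizers []string, name string) []string {
	for _, f := range finalizers {
		if f == name {
			return finalizers
		}
	}
	return append(finalizers, name)
}

// removeFinalizer returns a copy of finalizers without name.
func removeFinalizer(finalizers []string, name string) []string {
	kept := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	var finalizers []string
	finalizers = addFinalizer(finalizers, LoadBalancerFinalizer)
	finalizers = addFinalizer(finalizers, finalizerFor("AWSLoadBalancer"))
	fmt.Println(finalizers)

	// Each controller removes only the finalizer it owns.
	finalizers = removeFinalizer(finalizers, finalizerFor("AWSLoadBalancer"))
	fmt.Println(finalizers)
}
```

With one provider per cluster, as noted in the reply, the generic constant is sufficient. The per-kind variant only matters if several load balancer controllers ever act on the same object.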


// LoadBalancerConfiguration holds a provider-specific configuration to provision
// a Load balancer as a control plane endpoint
LoadBalancerConfiguration *cloud.LoadBalancerConfig `json:"loadBalancerConfiguration,omitempty"`

Contributor

Hi @yastij,

Any field that is omitempty should also have a godoc of // +optional according to the OpenAPI guidelines. The reason the CloudProviderConfiguration doesn't is because it's not a pointer, and thus is never technically empty.
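
The pointer versus non-pointer distinction can be checked with `encoding/json` alone. In this small sketch the `Config` and `Spec` types are hypothetical and exist only to contrast the two cases.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical nested configuration type.
type Config struct {
	Name string `json:"name,omitempty"`
}

// Spec holds the same type twice: once by value and once by pointer.
type Spec struct {
	// ByValue is always serialized, because encoding/json never treats a
	// struct value as empty, even with omitempty.
	ByValue Config `json:"byValue,omitempty"`

	// ByPointer is dropped when nil, so it is genuinely optional.
	// +optional
	ByPointer *Config `json:"byPointer,omitempty"`
}

// encode returns the JSON encoding of s as a string.
func encode(s Spec) string {
	out, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// The value field survives as an empty object; the nil pointer is omitted.
	fmt.Println(encode(Spec{}))
}
```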

@@ -0,0 +1,142 @@
package controllers

Contributor

Hi @yastij,

I'm wondering if the load balancer controller shouldn't be in a new sub-directory:

├── controllers
    └── load-balancer

I'm just thinking of things to make it easier to move the work in this PR to a separate module or repository at a later date. If we're already importing the LB types/controller from inside this repo, it makes it easier to move it later since we'll just rewrite imports.

}

func (svc *Service) reconcileLoadBalancer(clusterName string, subnets []string) (*string, *string, error) {
	var loadBalancerArn *string

Contributor

Hi @yastij,

Please simplify as:

var (
	loadBalancerArn *string
	loadBalancerDNS *string
	describeLoadBalancersInput = &elbv2.DescribeLoadBalancersInput{
		Names: []*string{aws.String(generateELBName(clusterName))},
	}
)

}

func (svc *Service) reconcileListeners(loadBalancerArn *string, targetGroupArn *string) (*int64, error) {
	var listenerPort *int64

Contributor

Hi @yastij,

Please simplify as:

var (
	listenerPort *int64
	describeListenerInput = &elbv2.DescribeListenersInput{
		LoadBalancerArn: loadBalancerArn,
	}
)

}

func (svc *Service) deleteTargetGroup(clusterName string) error {
	describeTargetGroupInput := &elbv2.DescribeTargetGroupsInput{}

Contributor

Hi @yastij,

Please simplify as:

describeTargetGroupInput := &elbv2.DescribeTargetGroupsInput{
	Names: []*string{aws.String(clusterName+"-controlPlane")},
}

Also, please note that I recommended elsewhere in this review extracting the construction of this name into a distinct function.


func (svc *Service) reconcileTargetGroup(clusterName string, vpcID string, controlPlaneIPs []string) (*string, error) {
	describeTargetGroupInput := &elbv2.DescribeTargetGroupsInput{}
	describeTargetGroupInput.Names = append(describeTargetGroupInput.Names, aws.String(clusterName+"-controlPlane"))

Contributor

Hi @yastij,

I've seen aws.String(clusterName+"-controlPlane") at least twice. Because this name is required to find a resource, let's extract it into a function:

// GetTargetGroupNameForCluster returns the name of a target group for the provided cluster.
func GetTargetGroupNameForCluster(clusterName string) *string {
	return aws.String(clusterName+"-controlPlane")
}
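
A sketch of how both call sites might then share the helper. To stay free of the AWS SDK, it returns a plain string and uses a local `strPtr` in place of `aws.String`. The `describeInput` type and `newDescribeInput` function are hypothetical stand-ins for `elbv2.DescribeTargetGroupsInput` and its construction.

```go
package main

import "fmt"

// targetGroupNameForCluster returns the name of the target group for the
// provided cluster. This is the single place where the suffix is defined.
func targetGroupNameForCluster(clusterName string) string {
	return clusterName + "-controlPlane"
}

// strPtr stands in for aws.String so the sketch has no SDK dependency.
func strPtr(s string) *string { return &s }

// describeInput is a hypothetical stand-in for elbv2.DescribeTargetGroupsInput.
type describeInput struct {
	Names []*string
}

// newDescribeInput builds the lookup request that both the reconcile and
// delete paths could share.
func newDescribeInput(clusterName string) *describeInput {
	return &describeInput{
		Names: []*string{strPtr(targetGroupNameForCluster(clusterName))},
	}
}

func main() {
	in := newDescribeInput("demo")
	fmt.Println(*in.Names[0])
}
```

Returning a string rather than a `*string` is a design choice in this sketch: it keeps the helper usable in log messages and tests, and leaves pointer conversion to the call site.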

return err
}

if err := svc.deleteTargetGroup(clusterName); err != nil {

Contributor

Hi @yastij,

This could be simplified as:

return svc.deleteTargetGroup(clusterName)

Or please feel free to leave as-is if you think we may include additional logic beneath this call one day.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 11, 2019
@yastij yastij force-pushed the vmc-v1a2 branch 4 times, most recently from c0e10a0 to 90a023e Compare September 16, 2019 14:18
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 18, 2019
@yastij yastij force-pushed the vmc-v1a2 branch 2 times, most recently from 9b7c7a3 to cbb90be Compare September 18, 2019 10:47
Member Author

yastij commented Sep 18, 2019

/retest

@yastij yastij force-pushed the vmc-v1a2 branch 6 times, most recently from d2f8fb6 to d8a9817 Compare September 18, 2019 16:47

const (
	// AwsProvider is the name of the aws provider
	AwsProvider = "aws"

Member

I think the provider name should be explicit to the LB implementation, not the cloud provider, so this should be "ELB". Thoughts?

Member Author

Actually, this should be removed, as the current PR uses Kind to store/retrieve the provider.

// +kubebuilder:subresource:status

// AWSLoadBalancer is the schema for the AWS Load balancer API
type AWSLoadBalancer struct {

Member

Shouldn't the API types be generic and the implementations be provider specific?

Contributor

No. Think of this as a KubeadmBootstrap provider. There will be a different controller / API model for each implementation of the LB.

Member Author

The issue is that there's no common API for load balancers, as each provider needs its own information. I don't want us to end up implementing annotations on top to satisfy each provider's specifics.

Another way could be to store the information on disk or in a ConfigMap and implement something like a Service (e.g. MachineService or MachineLoadbalancer), which would select machines based on a selector and add them to the load balancing pool. A drawback is that a management cluster would be restricted to provisioning against one load balancing provider. Thoughts, @akutz @andrewsykim?

Contributor

@akutz akutz Sep 18, 2019

I don't think it's that complicated. Just like each Cluster resource today has a ConfigRef to an Infrastructure provider's Cluster, so too will there be a ConfigRef from a Cluster to a load balancer config, in this case, AWSLoadBalancerConfig.

There are some questions around how to get the IP information from the machines, but that's something we can work out.

The benefit of this decoupled model is that it enables the introduction of a load balancer provider of any kind.
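
A rough sketch of this decoupled model, assuming a reference from the cluster to a provider-specific config that is resolved by Kind. All names here (`ObjectRef`, `Provisioner`, `provisioners`, `reconcile`, and the example endpoint) are hypothetical and are not the PR's actual types.

```go
package main

import "fmt"

// ObjectRef is a hypothetical, trimmed-down object reference.
type ObjectRef struct {
	APIVersion string
	Kind       string
	Name       string
}

// Cluster is a stand-in for the infrastructure cluster resource.
type Cluster struct {
	Name string
	// LoadBalancerRef points at a provider-specific load balancer config.
	LoadBalancerRef *ObjectRef
}

// Provisioner is implemented once per load balancer provider.
type Provisioner interface {
	Provision(clusterName, configName string) (endpoint string, err error)
}

// awsProvisioner is a placeholder for an AWS-backed implementation.
type awsProvisioner struct{}

func (awsProvisioner) Provision(clusterName, configName string) (string, error) {
	// A real implementation would call the AWS APIs here.
	return clusterName + ".elb.example.com", nil
}

// provisioners maps the referenced Kind to its implementation.
var provisioners = map[string]Provisioner{
	"AWSLoadBalancerConfig": awsProvisioner{},
}

// reconcile resolves the reference by Kind and delegates to the provider.
func reconcile(c Cluster) (string, error) {
	if c.LoadBalancerRef == nil {
		return "", nil // no load balancer requested
	}
	p, ok := provisioners[c.LoadBalancerRef.Kind]
	if !ok {
		return "", fmt.Errorf("no load balancer provider for kind %q", c.LoadBalancerRef.Kind)
	}
	return p.Provision(c.Name, c.LoadBalancerRef.Name)
}

func main() {
	c := Cluster{
		Name:            "demo",
		LoadBalancerRef: &ObjectRef{Kind: "AWSLoadBalancerConfig", Name: "demo-lb"},
	}
	endpoint, err := reconcile(c)
	fmt.Println(endpoint, err)
}
```

Adding another provider in this sketch means registering one more Kind; the cluster type itself does not change.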

Member Author

Yes, plus this wouldn't change the implementation itself much. Thoughts, @andrewsykim?

Member

My thinking here was that we introduce a generic LoadBalancer type with a ProviderRef field, similar to what @akutz mentioned re: Cluster with a ConfigRef. That way, each implementation only has to implement a new ProviderRef rather than the entire LoadBalancer API again. Maybe we already agree on this, but it's not clear to me yet.

@yastij yastij force-pushed the vmc-v1a2 branch 2 times, most recently from 6f6718f to 794c59c Compare September 24, 2019 14:51
Member Author

yastij commented Sep 25, 2019

/retest

yastij added 3 commits October 1, 2019 18:25
Signed-off-by: Yassine TIJANI <[email protected]>
Signed-off-by: Yassine TIJANI <[email protected]>
// APIEndpoint represents the endpoint to communicate with the load
// balancer.
// +optional
APIEndpoint APIEndpoint `json:"apiEndpoint,omitempty"`

Contributor

I'm not doing a full review on this right now (unless you'd like one), but one thing I wanted to mention is that the API endpoint needs to be in spec, not status. If a provider generates an endpoint (e.g. a DNS name) that cannot be determined by querying, storing it in status means we would lose the data when moving the resource from one management cluster to another.
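
The concern can be illustrated with a toy model of a move, under the assumption that only spec is carried to the target management cluster and status is rebuilt there. The `LoadBalancer`, `Spec`, and `Status` shapes, the `Host`/`Port` and `DNSName` fields, and the `move` function are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// APIEndpoint is a hypothetical endpoint shape (host and port).
type APIEndpoint struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// Spec carries the desired state, including the generated endpoint.
type Spec struct {
	APIEndpoint APIEndpoint `json:"apiEndpoint"`
}

// Status carries observed state that controllers are expected to rebuild.
type Status struct {
	Ready   bool   `json:"ready"`
	DNSName string `json:"dnsName,omitempty"`
}

// LoadBalancer is a stand-in for the resource under review.
type LoadBalancer struct {
	Spec   Spec   `json:"spec"`
	Status Status `json:"status"`
}

// move simulates relocating a resource to another management cluster.
// Assumption: only spec is copied; status starts out empty on the target.
func move(src LoadBalancer) (LoadBalancer, error) {
	raw, err := json.Marshal(src.Spec)
	if err != nil {
		return LoadBalancer{}, err
	}
	var dst LoadBalancer
	if err := json.Unmarshal(raw, &dst.Spec); err != nil {
		return LoadBalancer{}, err
	}
	return dst, nil
}

func main() {
	src := LoadBalancer{
		Spec:   Spec{APIEndpoint: APIEndpoint{Host: "lb.example.com", Port: 6443}},
		Status: Status{Ready: true, DNSName: "lb.example.com"},
	}
	dst, err := move(src)
	if err != nil {
		panic(err)
	}
	// The endpoint held in spec survives; the copy held in status does not.
	fmt.Printf("spec host=%q status dnsName=%q\n", dst.Spec.APIEndpoint.Host, dst.Status.DNSName)
}
```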

Member Author

@ncdc - I'll rebase and ping you for an API review

@k8s-ci-robot
Contributor

@yastij: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 8, 2020
@k8s-ci-robot
Contributor

@yastij: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-cluster-api-provider-vsphere-verify-fmt | 9977128 | link | /test pull-cluster-api-provider-vsphere-verify-fmt |
| pull-cluster-api-provider-vsphere-verify-lint | 9977128 | link | /test pull-cluster-api-provider-vsphere-verify-lint |
| pull-cluster-api-provider-vsphere-verify-vet | 9977128 | link | /test pull-cluster-api-provider-vsphere-verify-vet |
| pull-cluster-api-provider-vsphere-verify-crds | 9977128 | link | /test pull-cluster-api-provider-vsphere-verify-crds |
| pull-cluster-api-provider-vsphere-test | 9977128 | link | /test pull-cluster-api-provider-vsphere-test |
| pull-cluster-api-provider-vsphere-e2e | 9977128 | link | /test pull-cluster-api-provider-vsphere-e2e |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


Contributor

akutz commented Jan 10, 2020

/close

@k8s-ci-robot
Contributor

@akutz: Closed this PR.

In response to this:

/close


jayunit100 pushed a commit to jayunit100/cluster-api-provider-vsphere that referenced this pull request Feb 26, 2020
* Make sure ubuntu gets proper version of cloud-init

* Add Goss test

* Use packer-goss provisioner to execute Goss tests

* Add packer docs

* Add link to packer documentation in README file.

* Add link to packer-goss

* Add Ansible as prerequisite for packer


Development

Successfully merging this pull request may close these issues.

add support for VMC

5 participants