Description
What steps did you take and what happened:
This issue was originally reported on Slack: https://kubernetes.slack.com/archives/C8TSNPY4T/p1667740494784379
Has anyone hit an error where the ClusterResourceSet controller applies the objects in a ClusterResourceSet too early, before the Service kubernetes has been created? I hit this once this week while using ClusterResourceSet to deploy kapp-controller, which contains the Service kapp-controller/packaging-api. That Service was assigned the IP "10.96.0.1", and creating the Service kubernetes then failed because of the service IP conflict.
# k logs -n kube-system kube-apiserver-mycluster-controlplane-pl4vn
E1106 12:09:27.196308 1 controller.go:240] unable to sync kubernetes service: Service "kubernetes" is invalid: spec.clusterIPs: Invalid value: []string{"10.96.0.1"}: failed to allocate IP 10.96.0.1: provided IP is already allocated
E1106 12:09:37.197558 1 controller.go:240] unable to sync kubernetes service: Service "kubernetes" is invalid: spec.clusterIPs: Invalid value: []string{"10.96.0.1"}: failed to allocate IP 10.96.0.1: provided IP is already allocated
# k get svc -A
NAMESPACE         NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kapp-controller   packaging-api   ClusterIP   10.96.0.1    <none>        443/TCP                  2d2h
kube-system       kube-dns        ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   2d2h
# k get node
NAME                           STATUS     ROLES           AGE    VERSION
mycluster-controlplane-pl4vn   NotReady   control-plane   2d3h   v1.24.4
mycluster-workergroup1-ccfcz   NotReady   <none>          2d3h   v1.24.4
mycluster-workergroup1-lmx7b   NotReady   <none>          2d3h   v1.24.4
The service object creation timestamp:
# k get svc -n kapp-controller packaging-api -oyaml |grep creationTimestamp
creationTimestamp: "2022-11-04T09:37:14Z"
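To illustrate why the conflict lands on 10.96.0.1 specifically: kube-apiserver reserves the first usable address of the service CIDR for the Service kubernetes. The following is a minimal sketch of that derivation using only the Go standard library. The CIDR 10.96.0.0/12 is an assumption (it is the kubeadm default; this report does not state the cluster's actual service CIDR), and `firstServiceIP` is a hypothetical helper name, not a Cluster API or Kubernetes function.

```go
package main

import (
	"fmt"
	"net/netip"
)

// firstServiceIP returns the first usable address of a service CIDR
// (network address + 1). kube-apiserver reserves this address for the
// Service default/kubernetes. Hypothetical helper for illustration.
func firstServiceIP(cidr string) (netip.Addr, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Addr{}, err
	}
	// Masked() zeroes the host bits; Next() steps to the following address.
	first := prefix.Masked().Addr().Next()
	if !first.IsValid() || !prefix.Contains(first) {
		return netip.Addr{}, fmt.Errorf("no usable address in %s", cidr)
	}
	return first, nil
}

func main() {
	// Assumed service CIDR (kubeadm default); not stated in the report.
	ip, err := firstServiceIP("10.96.0.0/12")
	if err != nil {
		panic(err)
	}
	fmt.Println(ip) // prints 10.96.0.1
}
```

If any other Service already holds that address when kube-apiserver tries to create the Service kubernetes, the allocation fails with the "provided IP is already allocated" error shown in the logs above.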
It seems the CRS controller only obtains the remote client for the workload cluster and does not check whether the Service kubernetes has been created in the workload cluster before applying resources:
https://github.com/kubernetes-sigs/cluster-api/blob/v1.2.7/exp/addons/internal/controllers/clusterresourceset_controller.go#L239-L247
What did you expect to happen:
The kapp-controller CRS should be applied successfully.
Anything else you would like to add:
We tried to work around this issue by adding wait logic before applying the CRS objects, like this:
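	// Look up the Service default/kubernetes in the workload cluster.
	err = wlcClient.Get(ctx, apitypes.NamespacedName{
		Namespace: metav1.NamespaceDefault,
		Name:      "kubernetes",
	}, &corev1.Service{})
	// Any error other than NotFound is unexpected; return it.
	if err != nil && !apierrors.IsNotFound(err) {
		return reconcile.Result{}, err
	}
	// The Service does not exist yet; requeue instead of applying resources.
	if apierrors.IsNotFound(err) {
		ctx.Logger.Info("Wait for the Service kubernetes to be created")
		return reconcile.Result{RequeueAfter: NormalRequeueTimeout}, nil
	}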
Environment:
- Cluster-api version: 1.2.7
- minikube/kind version:
- Kubernetes version: (use kubectl version):
- OS (e.g. from /etc/os-release):
/kind bug