
[Bug] New nodegroup creation fails after using the new EKS feature to change control plane subnets on an eksctl-created cluster #7538

@jiqun9393

What were you trying to accomplish?

Create a nodegroup in an EKS cluster that was created by eksctl.

What happened?

My cluster was created by eksctl a while ago, but I did not use eksctl for later upgrades and configuration changes.
One of those changes was a VPC configuration update in which I replaced one of the control plane's subnets.

When I tried to create a new nodegroup after this update, creation failed at a very early phase, before the stack could be created:
Error: operation error EC2: DescribeSubnets, https response error StatusCode: 400, RequestID: xxxxx, api error InvalidSubnetID.NotFound: The subnet ID 'subnet-xxxxx' does not exist

The subnet ID in the error message is not the subnet I specified for the nodegroup, but the previous control plane subnet, which I deleted after the configuration change.

How to reproduce it?

Change the control plane subnets, then delete the old control plane subnet.
Create a new nodegroup with the following YAML config file:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: xxx
  region: cn-north-1
managedNodeGroups:
  - name: xxx
    instanceType: t4g.2xlarge
    desiredCapacity: 2
    minSize: 1
    maxSize: 22
    volumeSize: 100
    subnets: ["subnet-123456"]
    securityGroups:
      attachIDs: ['sg-123456']
    ssh:
      allow: true
      publicKeyName: xxx
    privateNetworking: true
    iam:
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
      withAddonPolicies:
        autoScaler: true

Logs

Error: operation error EC2: DescribeSubnets, https response error StatusCode: 400, RequestID: xxxxx, api error InvalidSubnetID.NotFound: The subnet ID 'subnet-xxxxx' does not exist

Anything else we need to know?

Using CloudTrail and reading the source code, I can see that eksctl first describes the control plane subnets, taking their IDs from the cluster stack outputs.
Because I changed the subnets outside eksctl, the stack outputs were not updated to reflect the change.
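
To illustrate the mismatch, here is a minimal Python sketch that compares the subnet IDs recorded in the stack outputs with the subnets the cluster actually uses. It is only an illustration, not eksctl's own logic. The output keys `SubnetsPrivate` and `SubnetsPublic`, the comma-separated value format, and all subnet IDs are assumptions; in practice the inputs could be copied from `aws cloudformation describe-stacks` (Outputs) and `aws eks describe-cluster` (`resourcesVpcConfig.subnetIds`).

```python
def stale_stack_subnets(stack_outputs, cluster_subnet_ids):
    """Return subnet IDs recorded in the stack outputs that the cluster no longer uses."""
    recorded = set()
    # Assumed output keys and comma-separated format; check your own stack's outputs.
    for key in ("SubnetsPrivate", "SubnetsPublic"):
        value = stack_outputs.get(key, "")
        recorded.update(s.strip() for s in value.split(",") if s.strip())
    # Anything recorded in the stack but absent from the cluster is stale.
    return sorted(recorded - set(cluster_subnet_ids))


# Hypothetical values for illustration only.
stack_outputs = {
    "SubnetsPrivate": "subnet-aaa,subnet-bbb",
    "SubnetsPublic": "subnet-ccc",
}
cluster_subnet_ids = ["subnet-bbb", "subnet-ccc", "subnet-ddd"]

print(stale_stack_subnets(stack_outputs, cluster_subnet_ids))
```

Any ID this returns is a candidate for the `InvalidSubnetID.NotFound` error above if the subnet has since been deleted.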

I see that I could use eksctl utils update-cluster-vpc-config to sync the stack with the actual cluster configuration, but since this is a production environment I do not want to risk any unwanted changes.
In the end I worked around the problem by updating only the stack outputs.

Is there any way to skip describing the stack outputs during nodegroup creation? I hope eksctl will take this new EKS feature of VPC configuration changes into account.

Versions

$ eksctl info
eksctl version: 0.170.0
kubectl version: v1.25.4
OS: linux
