diff --git a/README.md b/README.md
index d78bc23f..bd96d0e2 100644
--- a/README.md
+++ b/README.md
@@ -17,15 +17,104 @@ This project ensures that the Kubernetes control plane responds appropriately to
 - Webhook feature to send shutdown or restart notification messages
 - Unit & Integration Tests
 
+## Differences from v1
+
+The first major version of AWS Node Termination Handler (NTH) originally operated as a daemonset deployed to every desired node in the cluster (aka IMDS Mode); later, we added the option to deploy a single pod which read events for the entire cluster from an SQS queue (aka Queue Processor Mode). Both heavily utilized Helm for configuration, and changing configuration meant updating the deployment.
+
+This second major version of NTH aims to refine the Queue Processor Mode. Only a single pod is deployed and configuration is done using a new custom resource called *Terminators*. A *Terminator* contains much of the configuration about where NTH should fetch events, what actions to take for a given event type, filter nodes to act upon, and webhook notifications. Multiple *Terminators* may be deployed, modified, or removed without needing to redeploy NTH itself.
+
 ## Getting Started
 
-### Infrastructure Setup
+### 1. Setup Infrastructure
 
-TBD
+#### 1.1. Create an IAM OIDC Provider
+
+Your EKS cluster must have an IAM OIDC Provider. Follow the steps in [Create an IAM OIDC provider for your cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html) to determine whether your EKS cluster already has an IAM OIDC Provider and, if necessary, create one.
+
+#### 1.2. Create the NTH Service Account
+
+##### 1.2.1. Create the IAM Policy
+
+Download the service account policy template for AWS CloudFormation at https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/infrastructure.yaml
+
+Then create the IAM Policy by deploying the AWS CloudFormation stack:
+```sh
+aws cloudformation deploy \
+    --template-file infrastructure.yaml \
+    --stack-name nth-service-account \
+    --capabilities CAPABILITY_NAMED_IAM
+```
+
+##### 1.2.2. Create the Service Account
+
+Use either the AWS CLI or AWS Console to lookup the ARN of the IAM Policy for the service account.
+
+Create the cluster service account using the following command:
+```sh
+eksctl create iamserviceaccount \
+    --cluster \
+    --namespace \
+    --name "nth-service-account" \
+    --role-name "nth-service-account" \
+    --attach-policy-arn \
+    --role-only \
+    --approve
+```
+
+### 2. Deploy NTH
+
+Get the ARN of the service account role:
+```sh
+eksctl get iamserviceaccount \
+    --cluster \
+    --namespace \
+    --name "nth-service-account"
+```
+
+Add the AWS `eks-charts` helm repository and deploy the chart:
+```sh
+helm repo add eks https://aws.github.io/eks-charts
+
+helm upgrade \
+    --install \
+    nth \
+    eks/aws-node-termination-handler-2 \
+    --namespace \
+    --create-namespace \
+    --set aws.region= \
+    --set serviceAccount.name="nth-service-account" \
+    --set serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn=
+```
+
+For a full list of inputs see the Helm chart `README.md`.
+
+### 3. Create a Terminator
+
+#### 3.1. Create an SQS Queue
+
+NTH reads events from one or more SQS Queues. If you already have an SQS Queue available then you may skip this step.
+
+*Note:* Multiple Terminators may read from a single SQS Queue. A Terminator will only delete a message if a matching node was found in the cluster. The SQS Queue's visibility window setting can help to ensure that a message is delivered to only one Terminator at a time.
+
+You may create your own SQS Queue but an AWS CloudFormation template is available that will create an SQS Queue and commonly used rules for AWS EventBridge. Download from https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/queue-infrastructure.yaml
+
+```sh
+aws cloudformation deploy \
+    --template-file queue-infrastructure.yaml \
+    --stack-name nth-queue \
+    --parameter-overrides \
+        ClusterName= \
+        QueueName=
+```
+
+#### 3.2. Define and deploy a Terminator
 
-### Installation and Configuration
+You may download a template file from https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/terminator.yaml.tmpl. Edit the file with the required fields and desired configuration.
 
-For a full list of inputs see the Helm chart [README.md](./charts/aws-node-termination-handler-2/README.md).
+Deploy the Terminator:
+```sh
+kubectl apply -f
+```
 
 ## Metrics
 
diff --git a/resources/eks-cluster.yaml.tmpl b/resources/eks-cluster.yaml.tmpl
index 505ae9dd..dc40ce00 100644
--- a/resources/eks-cluster.yaml.tmpl
+++ b/resources/eks-cluster.yaml.tmpl
@@ -4,8 +4,6 @@ metadata:
   name: ${CLUSTER_NAME}
   region: ${AWS_REGION}
   version: "1.22"
-  tags:
-    karpenter.sh/discovery: ${CLUSTER_NAME}
 managedNodeGroups:
   - instanceType: m5.large
     amiFamily: AmazonLinux2
diff --git a/resources/infrastructure.yaml b/resources/infrastructure.yaml
index b6c647eb..cda95a39 100644
--- a/resources/infrastructure.yaml
+++ b/resources/infrastructure.yaml
@@ -6,6 +6,10 @@ Parameters:
   ClusterName:
     Description: EKS Cluster Name
     Type: String
+    Default: ""
+
+Conditions:
+  IncludeClusterName: !Not [!Equals [!Ref ClusterName, ""]]
 
 Resources:
   ServiceAccountPolicy:
@@ -14,7 +18,9 @@ Resources:
     by the AWS Node Termination Handler controller process to interact with AWS resources.
     Type: AWS::IAM::ManagedPolicy
     Properties:
-      ManagedPolicyName: !Sub "${ClusterName}-serviceaccount"
+      ManagedPolicyName: !Sub
+        - "nth${s}-service-account"
+        - s: !If [IncludeClusterName, !Sub "-${ClusterName}", ""]
       PolicyDocument:
         Version: "2012-10-17"
         Statement: