Event kinds not shown in SQS mode #704

@trutx

Describe the bug

When running in IMDS mode, every InterruptionEvent.Kind field shows the interruption event type (SPOT_ITN, SCHEDULED_EVENT, etc.). But when running in SQS mode, the kind of every event is SQS_TERMINATE, which is wrong.

The code seems to conflate three different things: the event itself, the monitor that caught it, and the mode NTH operates in.

Steps to reproduce
Run NTH in SQS mode and wait for any interruption event. The event kind will be SQS_TERMINATE.

Expected outcome
Each event should show its own event type instead of the mode NTH operates in. If in IMDS mode a Spot Interruption event is reported as SPOT_ITN, the same event caught by NTH operating in SQS mode should have the same event kind.
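
To illustrate the expected behaviour, here is a minimal sketch (not the actual NTH code) of deriving the kind from the EventBridge `detail-type` of the SQS message rather than from the operating mode. The helper name `kindForDetailType`, the constant names, and the `REBALANCE_RECOMMENDATION` and `STATE_CHANGE` kind strings are assumptions made up for this example; only SPOT_ITN, SCHEDULED_EVENT and SQS_TERMINATE come from this report. Mapping "AWS Health Event" to SCHEDULED_EVENT is also an assumption.

```go
package main

import "fmt"

// Kind values. SPOT_ITN, SCHEDULED_EVENT and SQS_TERMINATE are the ones
// mentioned in this issue; the other two strings are assumed for illustration.
const (
	spotITNKind        = "SPOT_ITN"
	scheduledEventKind = "SCHEDULED_EVENT"
	rebalanceKind      = "REBALANCE_RECOMMENDATION" // assumed name
	stateChangeKind    = "STATE_CHANGE"             // assumed name
	sqsTerminateKind   = "SQS_TERMINATE"
)

// kindForDetailType is a hypothetical helper that picks the event kind from
// the EventBridge "detail-type" field, so the kind describes the event and
// not the mode NTH runs in.
func kindForDetailType(detailType string) string {
	switch detailType {
	case "EC2 Spot Instance Interruption Warning":
		return spotITNKind
	case "EC2 Instance Rebalance Recommendation":
		return rebalanceKind
	case "AWS Health Event": // assumed to carry scheduled events
		return scheduledEventKind
	case "EC2 Instance State-change Notification":
		return stateChangeKind
	default:
		// Unknown detail types keep the current generic kind.
		return sqsTerminateKind
	}
}

func main() {
	detailTypes := []string{
		"EC2 Spot Instance Interruption Warning",
		"EC2 Instance Rebalance Recommendation",
		"EC2 Instance State-change Notification",
	}
	for _, dt := range detailTypes {
		fmt.Printf("%s -> %s\n", dt, kindForDetailType(dt))
	}
}
```

With something like this, the rebalance recommendation in the log sample below would carry a rebalance-specific kind instead of SQS_TERMINATE, and unknown detail types would still fall back to the current value.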

Application Logs
Log sample (anonymized)

2022/10/14 08:14:27 INF Adding new event to the event store event={"AutoScalingGroupName":"my-asg","Description":"Rebalance recommendation event received. Instance i-aaaaaaaaaaaaaaaaaa will be cordoned at 2022-10-14 08:14:26 +0000 UTC \n","EndTime":"0001-01-01T00:00:00Z","EventID":"rebalance-recommendation-event-65646564353331632d306265352d396161352d623aaaaaaaa366664313162636632613233","InProgress":false,"InstanceID":"i-aaaaaaaaaaaaaaaaa","IsManaged":true,"Kind":"SQS_TERMINATE","NodeLabels":null,"NodeName":"ip-1-2-3-4.us-east-2.compute.internal","NodeProcessed":false,"Pods":null,"ProviderID":"aws:///us-east-2c/i-aaaaaaaaaaaaaaaaa","StartTime":"2022-10-14T08:14:26Z","State":""}

Kubernetes events sample (anonymized):

$ kubectl get events --field-selector "source=aws-node-termination-handler"

LAST SEEN   TYPE     REASON           OBJECT                                              MESSAGE
54m         Normal   SQSTermination   node/ip-1-2-3-4.us-east-2.compute.internal   EC2 State Change event received. Instance i-aaaaaaaaaaaaaaaaa went into shutting-down at 2022-10-14 12:26:24 +0000 UTC
20m         Normal   SQSTermination   node/ip-1-2-3-5.us-east-2.compute.internal   EC2 State Change event received. Instance i-bbbbbbbbbbbbbbbbb went into shutting-down at 2022-10-14 12:59:14 +0000 UTC
26m         Normal   SQSTermination   node/ip-1-2-3-6.us-east-2.compute.internal   EC2 State Change event received. Instance i-ccccccccccccccccc went into shutting-down at 2022-10-14 12:54:07 +0000 UTC
59m         Normal   SQSTermination   node/ip-1-2-3-7.us-east-2.compute.internal   EC2 State Change event received. Instance i-ddddddddddddddddd went into shutting-down at 2022-10-14 12:21:33 +0000 UTC

Environment

  • NTH App Version: v1.16.2 and v1.17.3
  • NTH Mode (IMDS/Queue processor): Queue processor
  • OS/Arch: linux/amd64
  • Kubernetes version: v1.21.14-eks-6d3986b
  • Installation method: Helm chart
