Description
Describe the bug
When working in IMDS mode, every InterruptionEvent.Kind field shows the interruption event type (SPOT_ITN, SCHEDULED_EVENT, etc.). When running in SQS mode, however, every event kind is SQS_TERMINATE, which is wrong.
The code conflates three distinct concepts: the event, the monitor that detects it, and the mode NTH operates in.
Steps to reproduce
Run NTH in SQS mode and wait for any interruption event. The event kind will be SQS_TERMINATE.
Expected outcome
Each event should show its own event type instead of the mode NTH operates in. If a Spot Interruption event is reported as SPOT_ITN in IMDS mode, the same event caught by NTH operating in SQS mode should have the same event kind.
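One way to picture the expected behaviour is a lookup from the EventBridge detail-type of the queued message to an event kind, falling back to SQS_TERMINATE only when the type is unknown. The sketch below is illustrative and is not NTH's actual code: the helper kindForDetailType is hypothetical, the kind names REBALANCE_RECOMMENDATION and STATE_CHANGE are assumptions (only SPOT_ITN, SCHEDULED_EVENT and SQS_TERMINATE appear in this report), and treating an AWS Health event as SCHEDULED_EVENT is likewise an assumption.

```go
package main

import "fmt"

// Event kinds. SPOT_ITN, SCHEDULED_EVENT and SQS_TERMINATE appear in this
// report; the remaining two names are assumptions made for illustration.
const (
	kindSpotITN        = "SPOT_ITN"
	kindScheduledEvent = "SCHEDULED_EVENT"
	kindRebalance      = "REBALANCE_RECOMMENDATION" // assumed name
	kindStateChange    = "STATE_CHANGE"             // assumed name
	kindSQSTerminate   = "SQS_TERMINATE"            // fallback for unknown types
)

// kindForDetailType is a hypothetical helper that derives the event kind from
// the EventBridge "detail-type" of a queued message, so that the kind
// describes the event itself rather than the mode NTH runs in.
func kindForDetailType(detailType string) string {
	switch detailType {
	case "EC2 Spot Instance Interruption Warning":
		return kindSpotITN
	case "EC2 Instance Rebalance Recommendation":
		return kindRebalance
	case "AWS Health Event":
		// Assumption: health events correspond to scheduled maintenance.
		return kindScheduledEvent
	case "EC2 Instance State-change Notification":
		return kindStateChange
	default:
		return kindSQSTerminate
	}
}

func main() {
	detailTypes := []string{
		"EC2 Spot Instance Interruption Warning",
		"EC2 Instance Rebalance Recommendation",
		"EC2 Instance State-change Notification",
		"Some Unknown Event",
	}
	for _, dt := range detailTypes {
		fmt.Printf("%-42s -> %s\n", dt, kindForDetailType(dt))
	}
}
```

With a mapping along these lines, the rebalance recommendation in the log sample below would carry a rebalance kind instead of SQS_TERMINATE.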
Application Logs
Log sample (anonymized)
2022/10/14 08:14:27 INF Adding new event to the event store event={"AutoScalingGroupName":"my-asg","Description":"Rebalance recommendation event received. Instance i-aaaaaaaaaaaaaaaaaa will be cordoned at 2022-10-14 08:14:26 +0000 UTC \n","EndTime":"0001-01-01T00:00:00Z","EventID":"rebalance-recommendation-event-65646564353331632d306265352d396161352d623aaaaaaaa366664313162636632613233","InProgress":false,"InstanceID":"i-aaaaaaaaaaaaaaaaa","IsManaged":true,"Kind":"SQS_TERMINATE","NodeLabels":null,"NodeName":"ip-1-2-3-4.us-east-2.compute.internal","NodeProcessed":false,"Pods":null,"ProviderID":"aws:///us-east-2c/i-aaaaaaaaaaaaaaaaa","StartTime":"2022-10-14T08:14:26Z","State":""}
Kubernetes events sample (anonymized):
$ kubectl get events --field-selector "source=aws-node-termination-handler"
LAST SEEN   TYPE     REASON           OBJECT                                       MESSAGE
54m         Normal   SQSTermination   node/ip-1-2-3-4.us-east-2.compute.internal   EC2 State Change event received. Instance i-aaaaaaaaaaaaaaaaa went into shutting-down at 2022-10-14 12:26:24 +0000 UTC
20m         Normal   SQSTermination   node/ip-1-2-3-5.us-east-2.compute.internal   EC2 State Change event received. Instance i-bbbbbbbbbbbbbbbbb went into shutting-down at 2022-10-14 12:59:14 +0000 UTC
26m         Normal   SQSTermination   node/ip-1-2-3-6.us-east-2.compute.internal   EC2 State Change event received. Instance i-ccccccccccccccccc went into shutting-down at 2022-10-14 12:54:07 +0000 UTC
59m         Normal   SQSTermination   node/ip-1-2-3-7.us-east-2.compute.internal   EC2 State Change event received. Instance i-ddddddddddddddddd went into shutting-down at 2022-10-14 12:21:33 +0000 UTC
Environment
- NTH App Version: v1.16.2 and v1.17.3
- NTH Mode (IMDS/Queue processor): Queue processor
- OS/Arch: linux/amd64
- Kubernetes version: v1.21.14-eks-6d3986b
- Installation method: Helm chart