Crash with exit code 2 in Queue Processing mode and IMDSv2 enabled on EC2 #732

@peteroruba

Describe the bug
The NTH process terminates with exit code 2 after repeated 401 responses from IMDS (see the panic at the end of the logs below).

Steps to reproduce
Run NTH in Queue Processing mode on an EC2 instance with IMDSv2 enabled.
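
For context on the 401 responses in the logs below: IMDSv2 expects a session token, obtained with a PUT to /latest/api/token and sent on later requests in the X-aws-ec2-metadata-token header. The following is a minimal sketch of that flow, not NTH's code. It runs against a hypothetical local stand-in (`newFakeIMDS`) rather than the real 169.254.169.254 endpoint, and the helper names, the fake token, and the instance ID are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

const (
	ttlHeader   = "X-aws-ec2-metadata-token-ttl-seconds"
	tokenHeader = "X-aws-ec2-metadata-token"
)

// get issues a GET for path; an empty token mimics a request without a session token.
func get(baseURL, path, token string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+path, nil)
	if err != nil {
		return 0, "", err
	}
	if token != "" {
		req.Header.Set(tokenHeader, token)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

// fetchToken requests an IMDSv2 session token with a PUT.
func fetchToken(baseURL string) (string, error) {
	req, err := http.NewRequest(http.MethodPut, baseURL+"/latest/api/token", nil)
	if err != nil {
		return "", err
	}
	req.Header.Set(ttlHeader, "21600")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("token request returned status %d", resp.StatusCode)
	}
	token, err := io.ReadAll(resp.Body)
	return string(token), err
}

// newFakeIMDS is a hypothetical local stand-in for an instance that requires
// IMDSv2: it hands out one fixed token and answers 401 without it.
func newFakeIMDS() *httptest.Server {
	const token = "fake-token"
	mux := http.NewServeMux()
	mux.HandleFunc("/latest/api/token", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPut || r.Header.Get(ttlHeader) == "" {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		fmt.Fprint(w, token)
	})
	mux.HandleFunc("/latest/meta-data/", func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get(tokenHeader) != token {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "i-0123456789abcdef0") // made-up instance ID
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeIMDS()
	defer srv.Close()

	status, _, _ := get(srv.URL, "/latest/meta-data/instance-id", "")
	fmt.Println("without token:", status)

	token, err := fetchToken(srv.URL)
	if err != nil {
		panic(err)
	}
	status, body, _ := get(srv.URL, "/latest/meta-data/instance-id", token)
	fmt.Println("with token:", status, body)
}
```

Pointing `get` and `fetchToken` at the real metadata URL from inside the NTH pod would show which of the two requests the pod can complete.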

Expected outcome
NTH ignores the IMDS endpoint and processes events only via the SQS queue.

Application Logs
2022/11/28 14:26:29 WRN Deprecated argument "managed-asg-tag" and the replacement argument "managed-tag" was provided. Using the newer argument "managed-tag"
2022/11/28 14:26:29 WRN Deprecated argument "check-asg-tag-before-draining" and the replacement argument "check-tag-before-draining" was provided. Using the newer argument "check-tag-before-draining"
2022/11/28 14:26:29 INF Starting to serve handler /healthz, port 8080
2022/11/28 14:26:33 INF Metadata response status code: 401. Body:
2022/11/28 14:26:33 INF aws-node-termination-handler arguments:
dry-run: false,
node-name: ip-xx-xx-xx-xx.eu-central-1.compute.internal,
pod-name: aws-node-termination-handler-5c45ff9756-2jlcv,
metadata-url: http://169.254.169.254,
kubernetes-service-host: 172.20.0.1,
kubernetes-service-port: 443,
delete-local-data: true,
ignore-daemon-sets: true,
pod-termination-grace-period: -1,
node-termination-grace-period: 120,
enable-scheduled-event-draining: true,
enable-spot-interruption-draining: true,
enable-sqs-termination-draining: true,
enable-rebalance-monitoring: true,
enable-rebalance-draining: true,
metadata-tries: 3,
cordon-only: false,
taint-node: false,
taint-effect: NoSchedule,
exclude-from-load-balancers: false,
json-logging: false,
log-level: info,
webhook-proxy: ,
webhook-headers: ,
webhook-url: ,
webhook-template: ,
uptime-from-file: ,
enable-prometheus-server: false,
prometheus-server-port: 9092,
emit-kubernetes-events: false,
kubernetes-events-extra-annotations: ,
aws-region: eu-central-1,
queue-url: https://sqs.eu-central-1.amazonaws.com/xxxxxx/node-termination-notification,
check-tag-before-draining: true,
managed-tag: aws-node-termination-handler/managed,
use-provider-id: false,
aws-endpoint: ,

2022/11/28 14:26:33 ERR Unable to fetch metadata from IMDS error="Metadata request received http status code: 401"
2022/11/28 14:26:41 WRN All retries failed, unable to complete the uncordon after reboot workflow error="timed out waiting for the condition"
2022/11/28 14:26:41 INF Started watching for interruption events
2022/11/28 14:26:41 INF Kubernetes AWS Node Termination Handler has started successfully!
2022/11/28 14:26:41 INF Started watching for event cancellations
2022/11/28 14:26:41 INF Started monitoring for events event_type=REBALANCE_RECOMMENDATION
2022/11/28 14:26:41 INF Started monitoring for events event_type=SCHEDULED_EVENT
2022/11/28 14:26:41 INF Started monitoring for events event_type=SQS_TERMINATE
2022/11/28 14:26:41 INF Started monitoring for events event_type=SPOT_ITN
2022/11/28 14:26:55 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/11/28 14:26:55 WRN There was a problem monitoring for events error="There was a problem checking for rebalance recommendations: Metadata request received http status code: 401" event_type=REBALANCE_RECOMMENDATION
2022/11/28 14:26:59 WRN There was a problem monitoring for events error="Unable to parse metadata response: Metadata request received http status code: 401" event_type=SCHEDULED_EVENT
2022/11/28 14:27:05 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/11/28 14:27:09 WRN There was a problem monitoring for events error="There was a problem checking for rebalance recommendations: Metadata request received http status code: 401" event_type=REBALANCE_RECOMMENDATION
2022/11/28 14:27:11 WRN There was a problem monitoring for events error="Unable to parse metadata response: Metadata request received http status code: 401" event_type=SCHEDULED_EVENT
2022/11/28 14:27:17 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/11/28 14:27:21 WRN There was a problem monitoring for events error="There was a problem checking for rebalance recommendations: Metadata request received http status code: 401" event_type=REBALANCE_RECOMMENDATION
2022/11/28 14:27:21 WRN There was a problem monitoring for events error="Unable to parse metadata response: Metadata request received http status code: 401" event_type=SCHEDULED_EVENT
2022/11/28 14:27:31 WRN There was a problem monitoring for events error="There was a problem checking for spot ITNs: Metadata request received http status code: 401" event_type=SPOT_ITN
2022/11/28 14:27:31 WRN Stopping NTH - Duplicate Error Threshold hit.
panic: There was a problem checking for spot ITNs: Metadata request received http status code: 401

goroutine 116 [running]:
main.main.func4(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/node-termination-handler/cmd/node-termination-handler.go:216 +0x649
created by main.main
/node-termination-handler/cmd/node-termination-handler.go:198 +0xd11

Environment

  • NTH App Version: 1.17.3
  • NTH Mode (IMDS/Queue processor): Queue
  • OS/Arch: Linux / AMD64
  • Kubernetes version: 1.22
  • Installation method: Helm

Labels

  • Type: Bug (Something isn't working)
