Description
Describe the bug
We have deployed v1.16.3 of the node termination handler. Memory usage climbs steadily over time until it reaches the pod memory limit and the pod is OOMKilled. Is there a memory leak somewhere?
Application Logs
The following logs are from the aws-node-termination-handler pod; they contain nothing erroneous:
2022/07/28 07:15:01 INF Starting to serve handler /metrics, port 9092
2022/07/28 07:15:01 INF Starting to serve handler /healthz, port 8080
2022/07/28 07:15:01 INF Startup Metadata Retrieved metadata={"accountId":"xxxx","availabilityZone":"us-west-2b","instanceId":"i-xxxx","instanceLifeCycle":"on-demand","instanceType":"c6i.4xlarge","localHostname":"xxxx.us-west-2.compute.internal","privateIp":"x.x.x.x","publicHostname":"","publicIp":"","region":"us-west-2"}
2022/07/28 07:15:01 INF aws-node-termination-handler arguments:
dry-run: false,
node-name: xxxx.us-west-2.compute.internal,
pod-name: aws-node-termination-handler-b56bf578b-79x5m,
metadata-url: http://abcd,
kubernetes-service-host: x.x.x.X,
kubernetes-service-port: 443,
delete-local-data: true,
ignore-daemon-sets: true,
pod-termination-grace-period: -1,
node-termination-grace-period: 120,
enable-scheduled-event-draining: false,
enable-spot-interruption-draining: false,
enable-sqs-termination-draining: true,
enable-rebalance-monitoring: false,
enable-rebalance-draining: false,
metadata-tries: 3,
cordon-only: false,
taint-node: true,
taint-effect: NoSchedule,
exclude-from-load-balancers: false,
json-logging: false,
log-level: info,
webhook-proxy: ,
webhook-headers: ,
webhook-url: ,
webhook-template: ,
uptime-from-file: ,
enable-prometheus-server: true,
prometheus-server-port: 9092,
emit-kubernetes-events: true,
kubernetes-events-extra-annotations: ,
aws-region: us-west-2,
queue-url: https://xxxx,
check-asg-tag-before-draining: false,
managed-asg-tag: aws-node-termination-handler/managed,
use-provider-id: false,
aws-endpoint: ,
2022/07/28 07:15:01 INF Started watching for interruption events
2022/07/28 07:15:01 INF Kubernetes AWS Node Termination Handler has started successfully!
2022/07/28 07:15:01 INF Started watching for event cancellations
2022/07/28 07:15:01 INF Started monitoring for events event_type=SQS_TERMINATE
2022/07/28 07:20:12 INF Adding new event to the event store event={"AutoScalingGroupName":"xxxx","Description":"EC2 State Change event received. Instance i-xxxx went into shutting-down at 2022-07-28 07:20:11 +0000 UTC \n","EndTime":"0001-01-01T00:00:00Z","EventID":"ec2-state-change-event-xxxx","InProgress":false,"InstanceID":"i-xxxx","IsManaged":true,"Kind":"SQS_TERMINATE","NodeLabels":null,"NodeName":"xxxx.us-west-2.compute.internal","NodeProcessed":false,"Pods":null,"ProviderID":"aws:///us-west-2c/xxxx","StartTime":"2022-07-28T07:20:11Z","State":""}
- Kubernetes version: v1.21