High memory consumption #32

Closed
sharok opened this issue Oct 23, 2015 · 90 comments

@sharok

sharok commented Oct 23, 2015

Hello. When I restart codedeploy-agent, it uses about 26 MB. But after one deploy it uses 300-350 MB. Is that expected? The memory is not freed when the deploy is finished, so I get a memory allocation error on the next build.

Image of processes

@amartyag
Contributor

amartyag commented Nov 6, 2015

Hi,
Can you please tell us the size of the deployment bundle? Also, can you try doing the deployment once without the install step and see if the memory still spikes? Thanks

@elijahchancey

+1 I'm seeing high memory consumption too. For further info please see Case ID 1557344221

@elijahchancey

Any updates on this?

@feverLu
Contributor

feverLu commented Apr 14, 2016

Sorry, but what do you mean by Case ID 1557344221?

@thegranddesign

I'm seeing CodeDeploy taking 157 MB of memory, which is crazy high for something that simply waits and polls a server. I could understand higher usage while it's doing other things, like handling a deployment, but once it's done, it should release that memory.


@thegranddesign

Also FWIW, it's the polling process that's taking up all the memory. The "master" process is sitting at 15MB which is very reasonable.

@talentedmrjones

I'm seeing memory continue to grow with each deploy and not be freed. We are currently over 1GB!

total kB 1051392 765580 762328

@jangie

jangie commented Jun 13, 2016

+1. I am seeing the polling process taking 1.138g after two days of it being up. One possible compounding issue is that we have two applications that depend on codedeploy for deployment on this given instance, but it still seems excessive.

@jangie

jangie commented Jun 22, 2016

For people still experiencing this issue: in our case, this seems to have been resolved by updating the Ruby version. The Ruby versions in question were 2.0.0p353 (bad memory consumption) and 2.0.0p598 (much more reasonable memory consumption). While there was also a simultaneous OS update, I don't believe that influenced this issue.

@thegranddesign

I'm on Ruby 2.3 and this is still an issue.

@michaelsobota

+1. We are on Ruby 2.3.1 and experiencing the same behavior.

@michaelsobota

Our bundle size is ~80 MB -- memory usage prior to a deployment is 28588 kB RSS and jumps to 674184 kB post-deployment; after ~2-3 runs through our pipeline we get OOM failures on the deployment.

28588 766 codedeploy-agent: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller of master 762

674184 766 codedeploy-agent: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller of master 762

@mattlo

mattlo commented Dec 22, 2016

I'm also encountering this issue. We've been running CodeDeploy for a few months and we're seeing memory creep up on every deploy. As mentioned in #6, I added a few commands that run when the deployment succeeds:

sudo /etc/init.d/codedeploy-agent stop
sudo /etc/init.d/codedeploy-agent start
echo 3 | sudo tee /proc/sys/vm/drop_caches

Screenshot, 2016-12-22 11:35 AM

There is nothing other than the CodeDeploy agent running on the Ruby process.

  • The constant memory usage shown was from before I cleared the memory used by CodeDeploy.
  • After dumping the memory (terminating CodeDeploy), I modified the appspec.yml config to include the script above on ValidateService, and added a custom script to dump memory after CI has indicated a successful build.
  • The memory spike happens on code deployment, and the memory dips right back down to pre-deploy levels.
  • Thankfully, the CodeDeploy console still seems to consider the deployment on the instance successful, so it continues with the rollout config even though the agent stops for several seconds. The console will sometimes fail, though, so you can't rely on the ValidateService hook, even if you have a long timeout configuration.

Digging deeper, the CodeDeploy agent must be holding the bundle in memory and not releasing it after the hook. There probably needs to be a modification somewhere in the codebase to deallocate it. Aggressively tuning the Ruby GC is another option, but that may not be easy for those also running Ruby applications on the same box.
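For what it's worth, here is a minimal sketch of the GC-tuning idea, assuming the agent is launched through a wrapper or init script you control and that the init system passes the environment through (the variable values are illustrative, not recommendations):

# Hypothetical wrapper: set more aggressive MRI GC limits before starting the agent.
# These environment variables are honored by Ruby 2.1+.
export RUBY_GC_HEAP_GROWTH_FACTOR=1.1          # grow the heap more slowly
export RUBY_GC_MALLOC_LIMIT_MAX=16777216       # trigger GC sooner as malloc usage grows
export RUBY_GC_OLDMALLOC_LIMIT_MAX=16777216
exec /etc/init.d/codedeploy-agent start

Note that if the agent keeps live references to the bundle, no amount of GC tuning will release that memory.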

@jcowley

jcowley commented Mar 15, 2017

Seeing this as well. Memory usage continually creeps up. After a dozen or so deploys, the agent is holding onto ~500MB of memory. Eventually, deployments start failing because the deployment scripts can no longer allocate memory.

This seems like a pretty egregious problem to have open for so long. (See issue #6 as well.)

@jcowley

jcowley commented Mar 20, 2017

So here's my hacky workaround for the aws-codedeploy-agent memory leak issue. I added a line that schedules a restart of the agent at the end of the last script (e.g. a script that runs from the ValidateService hook in the appspec.yml file):

at -M now + 2 minute <<< $'service codedeploy-agent restart'

Without something like this, builds start failing for me with 'Cannot Allocate Memory' errors in the logs after about 2-3 builds.

Just to be clear, this isn't a "fix" that resolves the issue, so please don't close it. It's just a workaround hack. The real fix might be to ensure that the codedeploy agent is not reading files into memory with every build.
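A variation on the same idea (not from this thread, just a sketch) is to decouple the restart from deployments entirely and restart the agent on a schedule, assuming a root-writable /etc/cron.d and a service wrapper at the usual location:

# Hypothetical scheduled restart: reclaim leaked memory nightly at 04:00, outside
# any deployment window. The cron.d format requires a user field; the path to
# "service" may differ by distro (e.g. /sbin/service on Amazon Linux).
echo '0 4 * * * root /usr/sbin/service codedeploy-agent restart' | sudo tee /etc/cron.d/codedeploy-agent-restart

This trades a brief nightly gap in polling for bounded memory growth between deployments.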

@Raniz85

Raniz85 commented Apr 6, 2017

I've taken memory dumps with memdump before and after deployments to help with debugging this.

I monitored the agent with top during deployments and before the first deployment it took about 45 MiB memory, after the first about 80 MiB of memory and after the second about 100 MiB of memory.

See attached dumps below:
pre-deploy.txt.gz
post-deploy.txt.gz
post-2nd-deploy.txt.gz

All of these are very high. I think 45 MiB of memory usage for the agent while idle is too much, and the 100 MiB after two deployments is unacceptable. We're running this on t2.nano instances, and having the CodeDeploy agent use more than 20% of the available memory means we're seriously looking for alternatives right now to avoid having to upgrade all our servers to t2.micro and doubling the cost.

Dumps were taken by installing rbtrace and memdump and then applying the following patch to /opt/codedeploy-agent/lib/instance_agent/agent/base.rb:

--- base.rb.original    2017-04-06 08:53:36.529223983 +0000
+++ base.rb     2017-04-06 08:53:27.793289345 +0000
@@ -1,5 +1,6 @@
 # encoding: UTF-8
 require 'instance_agent/agent/plugin'
+require 'objspace'
 
 module InstanceAgent
   module Agent
@@ -26,6 +27,10 @@
 
         begin
           perform
+          File.open("/tmp/agent.dump.#{start_time}", 'w') do |io|
+            ObjectSpace.dump_all(output: io)
+          end
+          log(:info, "Stored memory dump in /tmp/agent.dump.#{start_time}")
           @error_count = 0
         rescue Aws::Errors::MissingCredentialsError
           log(:error, "Missing credentials - please check if this instance was started with an IAM instance profile")

This will produce a memory dump every minute and after every step during deployment.

@Raniz85

Raniz85 commented Apr 6, 2017

I just tried replacing the memory dump with GC.start and there's zero difference in memory usage after deployment, so it doesn't seem to be an issue with Ruby itself.

Basic stats of the dumps (with memdump) show that there does seem to be significant leakage in the agent:

$ memdump stats pre-deploy.txt | sort -k 2 -n | grep -i aws | tail -n10
Aws::Json::Parser: 12
Aws::Json::Builder: 13
Aws::Json::Handler: 14
Aws::ParamConverter: 14
Aws::ParamValidator: 14
Aws::Signers::S3: 14
Aws::Plugins::RetryErrors::Handler: 15
Aws::Log::Formatter: 22
Aws::Signers::V4: 23
Aws::SharedConfig: 24
$ memdump stats post-deploy.txt | sort -k 2 -n | grep -i aws | tail -n10
Aws::Plugins::RetryErrors::Handler: 32
Aws::S3::Types::UploadPartCopyRequest: 34
Aws::S3::Types::GetObjectRequest: 38
Aws::Signers::V4: 41
Aws::S3::Types::CreateMultipartUploadRequest: 44
Aws::S3::Types::HeadObjectOutput: 52
Aws::S3::Types::PutObjectRequest: 52
Aws::S3::Types::GetObjectOutput: 58
Aws::S3::Types::CopyObjectRequest: 66
Aws::S3::Client: 74

@jcowley

jcowley commented Apr 6, 2017

@Raniz85 In my experience, the memory consumption seems to be proportional to the size of the build asset files. Maybe the files are being read into memory and that memory is never released?

@Jmcfar Jmcfar added the bug label Apr 6, 2017
@Jmcfar Jmcfar self-assigned this Apr 6, 2017
@Jmcfar
Contributor

Jmcfar commented Apr 6, 2017

Thanks for the research everyone! For those still running into this issue, can you include the specific version (ruby -v) of the Ruby run-time you are using with the CodeDeploy agent?

@thegranddesign

thegranddesign commented Apr 6, 2017

@Jmcfar 2.3, 2.3.1, 2.4, 2.4.1

Built from source on Ubuntu 16.04

@jcowley

jcowley commented Apr 6, 2017

@Jmcfar: ruby 2.2.6p396 (2016-11-15 revision 56800) [x86_64-linux-gnu]

Running on ubuntu trusty (14.04.5)

@Raniz85

Raniz85 commented Apr 7, 2017

$ uname -a && ruby -v
Linux ip-10-10-191-49 4.4.0-66-generic #87-Ubuntu SMP Fri Mar 3 15:29:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]

@jcowley

jcowley commented Jun 1, 2017

@Jmcfar : Can you give us an update on progress please?

@jcowley

jcowley commented Jun 12, 2017

@Jmcfar : Can we get an update on this issue?

@dlo

dlo commented Jul 17, 2017

Running into this on Ubuntu 16.04.

$ uname -a && ruby -v
Linux production 4.4.0-1022-aws #31-Ubuntu SMP Tue Jun 27 11:27:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]

@jcowley

jcowley commented Jul 19, 2017

Seems that AWS Code Deploy is not quite ready for prime time. Does the aws-codedeploy-agent project even have an active maintainer at AWS? I see open pull requests that have been hanging around for a month with no comment.

@goodboybeau

@jcowley AWS ain't all it's cracked up to be

@floodedcodeboy

Two years and this is still an issue? We have hit the same problem, and seemingly even Enterprise-level support from AWS will not push them to a resolution on this matter.

@reneruiz

Hi, is there an update on this issue?

@ChrisLahaye

This ticket has been open for five years now. Do yourself a favor and choose another product.

@abbasghulam

Let's give a standing ovation to the AWS technical team on this issue.

I have a strong suspicion that it's an AWS tactic to push customers onto higher-memory instances so they can make extra money.

CodeDeploy is consuming 50% of the memory on a t2.small instance, whereas a t2.micro would be enough for our requirements.

@annamataws

@paulca99

Hi all,

I am sorry you are experiencing a memory leak caused by the CodeDeploy agent. We are aware of the problem and we are testing a fix that will be released along with changes to our agent release process so we can incorporate changes more quickly in the future. We apologize for the delay.

It's been 6 months ...no update... @annamataws ???

@nicklaros

Well, at this point, I lost all interest in this feature. Such a waste.

austinmarchese added a commit to banter/banter-tag-generator that referenced this issue Jun 12, 2020
1. Going through the normal deploy process until the build.
2. During the build stage, downloading and creating Python packages; these packages are kept and used once on the server to decrease downtime.
3a. Attempted to install Stanza models during the build process; however, because of their overall size and the fact that the instance keeps the 5 previous builds plus the last successful one, this led to the instance filling up and causing errors. As a result, Stanza models are downloaded as part of the startup script.
3b. To try to combat this, a remove-archives.sh file is used. I still hit the issue with this file when downloading Stanza, so I kept it but also moved the Stanza download to the Deploy step.
4. Prior to starting the download, a swap file is created on the instance in case of an influx of memory usage, to try to maintain instance health (see create-swap.sh).
5. After this, Stanza is downloaded and the service is brought up.
6. As part of the ValidateService hook, a restart-codedeploy-agent.sh file is used, which backgrounds a process that restarts the CodeDeploy agent after the entire deployment is complete. This is due to a memory leak in the AWS CodeDeploy agent that increases memory usage with each deployment (see: aws/aws-codedeploy-agent#32).

An Auto Scaling Group is used to handle instance restarts/turnover, as the memory issue can cause service health to degrade.

Initial test on lower branch
@erixero

erixero commented Jul 1, 2020

Please fix this issue ASAP. If I had been aware of this problem, I would never have gone with AWS CodePipeline. Unfortunately, I have already set up my entire CI/CD process, and this agent keeps becoming unresponsive and causing a lot of headaches. If there is no fix soon, I will need to migrate to a more robust service like Jenkins.

@luisdalmolin

Just as a suggestion: what I did to solve this is queue a job at the end of my deployment that restarts the CodeDeploy agent, so the memory usage always stays "low". I wish I didn't have to do that, but it solves the problem. At least for me.
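For anyone who prefers to do this from the deployment itself rather than an application job queue, here is a minimal sketch of a detached post-deployment restart; the script name, delay, and nohup approach are illustrative assumptions, not part of CodeDeploy:

#!/bin/bash
# Hypothetical restart-codedeploy-agent.sh, invoked from the last lifecycle hook.
# Detach a delayed restart so the current deployment can finish reporting its
# status before the agent goes down.
nohup bash -c 'sleep 120 && service codedeploy-agent restart' >/dev/null 2>&1 &

The delay only needs to be long enough for the deployment to report success; restarting the agent mid-deployment can cause the deployment to fail.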

@Helen1987
Contributor

Please make sure you set up the SSM Distributor integration with CodeDeploy if you want to get new versions of the CodeDeploy agent. Otherwise you will not get the latest version of the agent with the Windows fix (actually, you might get it in some rare cases, but in general you will need to update it manually without Distributor). We are releasing a new version of the agent, so please onboard with Distributor if you want to get it.
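For reference, a Distributor-based install can be scheduled with an SSM association; here is a sketch using the AWS CLI, where the target tag and schedule are placeholders and the parameters should be checked against the current SSM documentation:

# Hypothetical association that (re)installs the CodeDeploy agent package
# on tagged instances on a recurring schedule via SSM Distributor.
aws ssm create-association \
  --name "AWS-ConfigureAWSPackage" \
  --targets "Key=tag:Environment,Values=staging" \
  --parameters '{"action":["Install"],"name":["AWSCodeDeployAgent"]}' \
  --schedule-expression "rate(14 days)"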

@paulca99

paulca99 commented Jul 2, 2020

@Helen1987 when you say "with Windows fix", do you mean it addresses CodeDeploy memory usage on Windows? If so, I'd be surprised if that addresses even 1% of the problem, as I'm sure most people deploy on Linux.

@Manc

Manc commented Jul 2, 2020

In my case, downloading a ~50 MB ZIP bundle to an EC2 instance now takes several minutes with the CodeDeploy agent and risks a small EC2 instance running out of memory, while the same download on the same machine with the AWS CLI takes well under a second.

My workaround is to change the CodeBuild process of all pipelines so that the main ZIP bundle no longer contains the actual application, but only the appspec.yml, the individual hook scripts, and a text file with the build version that serves as a reference to the actual application. The built application itself is archived as a tar.gz and uploaded to S3 as a secondary artifact. During deployment, CodeDeploy only has to download a very small ZIP bundle (less than 3 KB), and my hook script then downloads the referenced tar.gz archive using an AWS CLI command.

The entire deployment went down from around 4-6 minutes to ~6 seconds, including stopping the app, download, installation, starting, and verification.

I hope this may help some users to find a solution that works for them.
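A minimal sketch of a hook script implementing that pattern; the bucket name, paths, and the build-version.txt file are placeholders for whatever your CodeBuild step actually produces:

#!/bin/bash
# Hypothetical AfterInstall hook: fetch the real application archive with the
# AWS CLI instead of shipping it inside the CodeDeploy bundle.
set -euo pipefail

BUILD_VERSION="$(cat /opt/myapp/build-version.txt)"          # written during the build
ARCHIVE="s3://my-artifact-bucket/myapp/${BUILD_VERSION}.tar.gz"

aws s3 cp "$ARCHIVE" /tmp/app.tar.gz
mkdir -p /opt/myapp/current
tar -xzf /tmp/app.tar.gz -C /opt/myapp/current
rm -f /tmp/app.tar.gz

The instance profile needs s3:GetObject on the artifact bucket for this to work, just as it does for normal CodeDeploy bundles.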

@paulca99

paulca99 commented Jul 2, 2020

I went down the same route as Manc, but used an EFS volume to hold the build.

@Helen1987
Contributor

@paulca99 sorry, did not realize it is “high memory” thread. Correction: new release includes high memory fix as well

@jason-riddle

@paulca99 sorry, did not realize it is “high memory” thread. Correction: new release includes high memory fix as well

@Helen1987 Just to clarify, this github issue is resolved then correct? So this issue can be closed?

@AnandarajuCS

This issue is fixed in our latest release v1.1.2
https://aws.amazon.com/about-aws/whats-new/2020/08/aws-codedeploy-agent-improved-compatibility-amazon-linux-windows-ubuntu/

@fleaz

fleaz commented Aug 5, 2020

I never thought this day would come...

@spaivaras

This doesn't seem to fix anything related to this issue; each deployment still adds additional memory usage to the codedeploy-agent process, and it seems it never releases it. Version used: agent_version: OFFICIAL_1.1.2-1855_deb

@fleaz

fleaz commented Sep 9, 2020

@spaivaras Does your system have unzip installed? AFAIK the high memory consumption is a result of using the 'rubyzip' package to unzip the deployment bundle; it is still used as a fallback if your system has no unzip binary available.
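A quick sanity check, assuming the agent runs as root with a normal PATH and logs to the default location (a sketch, not an official diagnostic):

# Confirm a system unzip exists and is on the PATH the agent would use.
command -v unzip && unzip -v | head -n 1
# The agent log (default path shown) may contain hints about how the last
# bundle was unpacked and whether any fallback occurred.
sudo tail -n 200 /var/log/aws/codedeploy-agent/codedeploy-agent.log | grep -i zip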

@spaivaras

@fleaz yes it does


unzip:
  Installed: 6.0-23+deb10u1
  Candidate: 6.0-23+deb10u1
  Version table:
 *** 6.0-23+deb10u1 500
        500 http://cdn-aws.deb.debian.org/debian buster/main amd64 Packages
        100 /var/lib/dpkg/status

Also tried both gzip and zip as artifacts.

@arces

arces commented Sep 15, 2020

Seconding what @spaivaras mentioned. I just updated to the latest version and it is still using more RAM after every deploy. unzip is also installed.

@arces

arces commented Sep 15, 2020

@AnandarajuCS

@ronnypmuliawan

ronnypmuliawan commented Sep 17, 2020

Hi,

I am experiencing the same issue too, and it seems like the memory leak got even worse after updating to the latest version.

Relevant packages
Using Ubuntu 18.04 with latest CodeDeploy agent version 1.1.2-1855.
Ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux-gnu]
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.

@toolslive

agent version: codedeploy-agent 1.2.1-1868
os version: Ubuntu 20.04.1 LTS
unzip version: 6.0-25ubuntu1

problem still present.

@mike-mccormick

@AnandarajuCS @feverLu @amartyag or any other contributors, is there any chance we could get this one re-opened?

I've been seeing this intermittently for some time. I just upgraded from 1.0.1 to the current build, 1.2.1, today, and the results (before and after, with a manual restart just to be sure) are the same: codedeploy-agent / Ruby eats around 150 MB at a time and doesn't want to release it.

Setup:

  • Amazon Linux 2 using the official Amazon image on a two month old Ec2 Instance
  • CodeDeploy Agent 1.2.1-1868.
  • UnZip 6.00 (Pre-installed on the system)
  • Ruby 2.0.0p648 (also pre-installed, a little out of date, but the CodeDeploy agent didn't seem to have any requirements).

I'm getting nothing in the logs other than the usual polling.

This particular instance is a fairly simple WordPress t2.nano staging server. With only internal traffic, there are no resource issues except when CodeDeploy runs.

Happy to troubleshoot if necessary.

@brndnblck
Contributor

brndnblck commented Oct 26, 2020

This specific defect was resolved, and I'm going to lock this thread to prevent it from being errantly re-opened for new issues that exhibit similar symptoms.

To give some more insight into the resolution: the leaks in this issue were caused by the third-party in-memory library used to unpack deployment bundles. This library had memory optimization deficiencies fundamental to its design, and our solution was to replace it with the system zip utility, which is more efficient.

However, most system zip tools aren't as tolerant of defects in bundles. If the zip file your build system creates isn't to spec or is malformed in any way, the system zip will fail to unpack it. As a result, for maximum compatibility, our solution is designed to fall back to the less memory-efficient library for unpacking if either 1) no system zip is present, or 2) the system zip deems the deployment bundle malformed in some way.

If you've confirmed that the system zip is present, functioning, and visible to the agent, but you're still running into this issue, you should validate the bundle with the system zip and look for warnings or failures when unpacking. If you ensure that the tool building your bundle does so in a way that doesn't produce warnings or failures from your system zip, you should be able to resolve this issue.
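A quick way to run that validation yourself, assuming you have a local copy of the bundle your build produces (the file name is a placeholder):

# Test the archive without extracting it; warnings or errors here suggest the
# agent will fall back to the in-memory unpacking path.
unzip -t my-deployment-bundle.zip
echo "unzip exit status: $?"   # non-zero indicates a malformed bundle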

@mike-mccormick if you identify that something else is happening here and you'd like to help us troubleshoot any additional memory leaks you're running into, let's open a new issue and provide all the relevant details there rather than conflating any new problems with this old and already-resolved thread.

When you open a new issue, please remember to do the following first:

  1. Enable verbose logging on your agent.
    https://docs.aws.amazon.com/codedeploy/latest/userguide/reference-agent-configuration.html

  2. Provide agent logs which capture the unpack step of the deployment process.
    https://docs.aws.amazon.com/codedeploy/latest/userguide/deployments-view-logs.html

@aws aws locked as resolved and limited conversation to collaborators Oct 26, 2020