Commit 940e774

Alena Kastsiukavets authored and Helen1987 committed
#Fix Win 2019 issue
To run the program on Windows, all files are copied into the Windows\Temp folder, which may occasionally be cleaned up. When the certificates are deleted, the poller cannot verify its calls and raises an exception.

# Why is this change needed?
Customers are hitting the "Agent occasionally stops working" issue.

# How does it address the issue?
Propagate the certificate exception so that the Windows service can handle it. The Windows service then tries to copy the certificates from the PROGRAMDATA folder so that the agent can continue to work.

# How was this tested?
bb release
cr https://code.amazon.com/reviews/CR-19039551
1 parent e0b2190 commit 940e774

2 files changed: +19 -3 lines changed

lib/instance_agent/agent/base.rb (+4)

@@ -27,6 +27,10 @@ def run
         begin
           perform
           @error_count = 0
+        rescue Seahorse::Client::NetworkingError => e
+          log(:error, "Failed to execute the command. Your certificates might have been deleted" )
+          # TODO: verify error message is "certificate verify failed"
+          raise e
         rescue Aws::Errors::MissingCredentialsError
           log(:error, "Missing credentials - please check if this instance was started with an IAM instance profile")
           @error_count = @error_count.to_i + 1
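
The TODO in this hunk leaves the message check open. Below is a minimal sketch of one way to narrow the rescue, assuming the certificate failure surfaces with the text "certificate verify failed" in the error message (the TODO suggests this, but the diff does not confirm it). `FakeNetworkingError`, `certificate_error?`, and `run_once` are hypothetical stand-ins so the example runs without the AWS SDK; they are not part of the agent.

```ruby
# Hypothetical stand-in for Seahorse::Client::NetworkingError (aws-sdk-core),
# defined here only so the sketch runs without the SDK installed.
class FakeNetworkingError < StandardError; end

# Hypothetical helper: true when the message points at a certificate problem.
# The matched text is an assumption taken from the TODO in the diff.
def certificate_error?(error)
  error.message.to_s.downcase.include?('certificate verify failed')
end

# Runs the block once. Certificate failures are re-raised so the caller
# (the Windows service in the commit) can react; other networking errors
# are only counted.
def run_once(state)
  yield
  state[:error_count] = 0
rescue FakeNetworkingError => e
  raise if certificate_error?(e)

  state[:error_count] = state[:error_count].to_i + 1
end

state = {}
run_once(state) { raise FakeNetworkingError, 'connection reset' }
puts state[:error_count]
```

With this shape, a transient networking error only bumps the counter, while a certificate failure still propagates as in the commit.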

lib/winagent.rb (+15 -3)

@@ -17,8 +17,10 @@
 
 class InstanceAgentService < Daemon
 
-  def initialize
+  def initialize
+    @app_root_folder = File.join(ENV['PROGRAMDATA'], "Amazon/CodeDeploy")
     InstanceAgent::Platform.util = InstanceAgent::WindowsUtil
+
     cert_dir = File.expand_path(File.join(File.dirname(__FILE__), '..\certs'))
     Aws.config[:ssl_ca_bundle] = File.join(cert_dir, 'ca-bundle.crt')
     ENV['AWS_SSL_CA_DIRECTORY'] = File.join(cert_dir, 'ca-bundle.crt')
@@ -32,6 +34,7 @@ def description
 
   def service_main
     read_config
+    @attempt_count = 0
     log(:info, 'started')
     shutdown_flag = false
     while running? && !shutdown_flag
@@ -73,8 +76,7 @@ def expand_conf_path(key)
   end
 
   def read_config
-    default_root = File.join(ENV['PROGRAMDATA'], "Amazon/CodeDeploy")
-    default_config = File.join(default_root, "conf.yml")
+    default_config = File.join(@app_root_folder, "conf.yml")
     InstanceAgent::Config.config({:config_file => default_config,
       :on_premises_config_file => File.join(default_root, "conf.onpremises.yml")})
     InstanceAgent::Config.load_config
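
The unchanged `:on_premises_config_file` line in this hunk still reads `default_root`, which the hunk removes, so that reference would presumably fail unless `default_root` is defined somewhere outside the diff. Below is a minimal sketch of deriving both paths from one root, assuming both files are meant to live under the same folder. `config_paths` is a hypothetical helper, and the `C:/ProgramData` fallback exists only so the example runs off Windows.

```ruby
# Hypothetical helper: build both config paths from a single root folder,
# so neither path depends on a separately defined local variable.
def config_paths(app_root_folder)
  {
    config_file: File.join(app_root_folder, 'conf.yml'),
    on_premises_config_file: File.join(app_root_folder, 'conf.onpremises.yml')
  }
end

# PROGRAMDATA is normally set only on Windows; the fallback is illustrative.
root = File.join(ENV.fetch('PROGRAMDATA', 'C:/ProgramData'), 'Amazon/CodeDeploy')
paths = config_paths(root)
puts paths[:config_file]
puts paths[:on_premises_config_file]
```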
@@ -87,6 +89,16 @@ def read_config
 
   def with_error_handling
     yield
+  rescue Seahorse::Client::NetworkingError => e
+    @attempt_count = @attempt_count + 1
+    if @attempt_count > 3
+      log(:error, "Failed to recover after certificate issue:" + e.inspect)
+      exit
+    end
+    log(:error, "Custom:" + e.inspect)
+    # try to copy certs from application root folder
+    @certs_backup_folder = File.join(@app_root_folder, "certs/.")
+    FileUtils.cp_r(@certs_backup_folder, @cert_dir)
   rescue SocketError => e
     log(:info, "#{description}: failed to run as the connection failed! #{e.class} - #{e.message} - #{e.backtrace.join("\n")}")
     sleep InstanceAgent::Config.config[:wait_after_connection_problem]
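
The recovery path in this hunk can be sketched in isolation as a bounded retry that restores the certificates from a backup folder. This is a sketch under stated assumptions, not the agent's code: `CertRecovery` and `FakeNetworkingError` are hypothetical names, and the limit of 3 attempts mirrors the diff. The sketch keeps the certificate directory in an instance variable because the diff assigns a local `cert_dir` in `initialize` but reads `@cert_dir` here, and nothing in the visible hunks sets `@cert_dir`.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for Seahorse::Client::NetworkingError (aws-sdk-core).
class FakeNetworkingError < StandardError; end

# Hypothetical wrapper around the retry-and-restore logic from the diff.
class CertRecovery
  MAX_ATTEMPTS = 3 # same limit as the diff

  attr_reader :attempt_count

  def initialize(backup_dir, cert_dir)
    @backup_dir = backup_dir
    @cert_dir = cert_dir # stored as an instance variable so the rescue can see it
    @attempt_count = 0
  end

  # Runs the block. On a networking error, restores the certificates from the
  # backup folder. Returns :ok, :restored, or :gave_up (the diff calls `exit`
  # where this sketch returns :gave_up).
  def with_error_handling
    yield
    :ok
  rescue FakeNetworkingError
    @attempt_count += 1
    return :gave_up if @attempt_count > MAX_ATTEMPTS

    FileUtils.mkdir_p(@cert_dir)
    # A source ending in "/." copies the directory's contents, not the directory itself.
    FileUtils.cp_r(File.join(@backup_dir, '.'), @cert_dir)
    :restored
  end
end

Dir.mktmpdir do |tmp|
  backup = File.join(tmp, 'backup')
  certs = File.join(tmp, 'certs')
  FileUtils.mkdir_p(backup)
  File.write(File.join(backup, 'ca-bundle.crt'), 'dummy certificate data')

  recovery = CertRecovery.new(backup, certs)
  puts recovery.with_error_handling { raise FakeNetworkingError, 'certificate verify failed' }
  puts File.exist?(File.join(certs, 'ca-bundle.crt'))
end
```

Like the diff, the sketch never resets the attempt counter after a successful run. Whether it should be reset is a design choice the commit leaves open.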

0 commit comments
