[deb/rpm] restart endpoint with tamper protection after elastic-agent #8637
💚 Build Succeeded
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
This can be backported to 8.18 and 9.0 as well. That being said, the changes look good to me. Thank you very much for helping out.
Edit: This should be backported to 9.0 because the initial PR was backported
LGTM
Thank you for quickly fixing this issue.
…tion after elastic-agent (#8646)
* [deb/rpm] restart endpoint with tamper protection after elastic-agent (#8637)
* fix: use rpm from local build
(cherry picked from commit 249885f)
# Conflicts:
# dev-tools/packaging/templates/linux/postinstall.sh.tmpl
# testing/integration/endpoint_security_test.go
* Enhancement/6394 allow deb rpm to upgrade with endpoint tamper protection (#6907)
* Update pkg/testing/tools/tools.go
Co-authored-by: Paolo Chilà <[email protected]>
* enhancement(6394): updated preinstall script, updated service to use uninstall token
* enhancmenet(6394): updated the preinstall script
* enchancement(6394): started adding integraiton tests
* enhancement(6394): updated fixture install, updated endpoint security tests
* enhancement(6394): cleaned up fixture_install, added function that exposes fixture's uninstall tokens, updated tests
* enhancement(6394): refactored test code so that I can use it with rpm
* enhancement(6394): added tests to assert that tamper protection works
* enhancement(6394): updated the endpoint testing tools, fixture install functions and the deb rpm upgrade tests
* enhancement(6394): added test logs, updated rpm installation to set agent socket path
* enhancement(6394): remove commented code
* enhancement(6394): remove print statements
* enhancement(6394): remove unnecessary comments, refactor unused function
* enhancement(6394): revert var name change
* enhancement(6394): added changelog
* enchancement(6394): update test logs, add non integrative config to deb installation
* enhancement(6394): updated the endpoint version comparison and assertion
* enhancement(6394): added log in tests
* enhancement(6394): resorted to using previous major instead of minor in upgrade test
* enhancement(6394): updated endpoint version function in the tests, updated function name in testing tools
* enhancement(6394): use previous minor, fix log
* enhancement(6394): added comment explaining motive behind simple install functions
* enhancement(6394): updated return in tools
* Update changelog/fragments/1740166208-allow-deb-rpm-upgrade-with-tamper-protected-endpoint.yaml
Co-authored-by: Craig MacKenzie <[email protected]>
* enhancement(6394): fixed function call in tests
* enhancement(6394): added systemctl start in postinstall, refactored preinstall and added condition to make same version installations work
* enhancement(6394): updated the preinstall and postinstall scripts to troubleshoot
* enhancement(6394): updated preinstall and postinstall script templates
  - Updated preinstall to stop endpoint if it is an available service regardless of the version of endpoint that's install
  - Updated postintall to start endpoint if the old endpoint version and the new version match.
* enhancement(6394): removed error exit from postinstall
* enhancement(6394): updated postinstall and preinstall templates
  - Preinstall now does not use a state file. Recovery from failure start ElasticEndpoint if it is not running
  - Preinstall does not stop endpoint if tamper protection is not enabled
  - Postinstall does not print an error if service is still running
* enhancement(6394): removed debug logs
* enhancement(6394): removed unnecessary comment
* enhancement(6394): store uninstall token as local var, uninstall through the agent
* enhancement(6394): added setclient function
* enhancement(6394): added getInstallCommand and replaced SimpleInstall
* enhancement(6394): added test case for error recovery. removed unused fixture functions
* enhancement(6394): refactored tests, consolidated test scenarios into one function
* enhancement(6394): remove unnecessary test functions
* enhancement(6394): remove unused fixture function
* enhancement(6394): revert unwanted installDeb changes
* enhancement(6394): remove unwanted changes in testing tools
* enhancement(6394): remove unused function call
* enhancement(6394): replacing systemctl instead of adding new one to path
* enhancement(6394): update real systemctl path in mock systemctl script
* enhancement(6394): fix linting errors
* Update changelog/fragments/1740166208-allow-deb-rpm-upgrade-with-tamper-protected-endpoint.yaml
Co-authored-by: Paolo Chilà <[email protected]>
* Update dev-tools/packaging/templates/linux/postinstall.sh.tmpl
Co-authored-by: Paolo Chilà <[email protected]>
* Update pkg/testing/tools/tools.go
Co-authored-by: Paolo Chilà <[email protected]>
* Update dev-tools/packaging/templates/linux/postinstall.sh.tmpl
Co-authored-by: Paolo Chilà <[email protected]>
* Update dev-tools/packaging/templates/linux/postinstall.sh.tmpl
Co-authored-by: Paolo Chilà <[email protected]>
* Update pkg/testing/tools/tools.go
Co-authored-by: Paolo Chilà <[email protected]>
* enhancement(6394): updated print statement
* enhancement(6394): remove unnecessary command
* enhancement(6394): use addressFromPath and SetClient
* enhancement(6394): using service name, fixed indentation
* test(debug): add detailed logging to Fixture.SetClient and installDeb for agent client setup debugging
* Revert "test(debug): add detailed logging to Fixture.SetClient and installDeb for agent client setup debugging"
This reverts commit 390c561.
* enhancement(6394): renamed SetClient to SetDebRpmClient. Using hardcoded working dir as fixture working dir does not work for determining socket path
* enhancement(6394): consolidated same version upgrade and regular upgrdade test functions
* enhancement(6394): simplify preinstall script and enhance upgrade tests for tamper protection
  - Removed unnecessary endpoint handling logic from preinstall script.
  - Improved checks for service installation and status before upgrade.
  - Updated upgrade test functions to handle stopping the endpoint service before upgrades.
* enhancement(6394): remove mock systemctl script for tamper protection tests
* enhancement(6394): remove unused import
* enhancement(6394): fixed order of execution in preinstall
* enhancement(6394): added tests to make sure deb/rpm upgrades work when endpoint is not tamper protected
---------
Co-authored-by: Paolo Chilà <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
(cherry picked from commit 8a6531f)
# Conflicts:
# dev-tools/packaging/templates/linux/preinstall.sh.tmpl
# Conflicts:
# dev-tools/packaging/templates/linux/postinstall.sh.tmpl
# testing/integration/endpoint_security_test.go
* fix: resolve conflicts
* fix: use --force-confold for deb tests in TestUpgradeAgentWithTamperProtectedEndpoint_DEB
---------
Co-authored-by: Panos Koutsovasilis <[email protected]>
Co-authored-by: Kaan Yalti <[email protected]>
…-hosted
* feature/hosted-stack-using-oblt-cli: (26 commits)
  Use the current official docker image for oblt-cli
  Mark the elasticinframetrics processor as deprecated and schedule for removal (#8659)
  [main][Automation] Update versions (#8668)
  chore: Update create_deployment_csp_configuration.yaml (#8669)
  Attempt to make test more reliable by querying ES directly (#8422)
  [test] split up ess and beats serverless integration tests (#8551)
  Remove resource/k8s processor and use k8sattributes processor for service attributes (#8599)
  fix: use --force-confold for deb tests in TestUpgradeAgentWithTamperProtectedEndpoint_DEB (#8649)
  [main][Automation] Bump stack images versions to 9.1.0-ea0b7542 (#8612)
  chore: Update to elastic/beats@f6594fb72670 (#8640)
  [deb/rpm] restart endpoint with tamper protection after elastic-agent (#8637)
  ci: don't preinstall fleet packages on retried CI steps (#8636)
  chore: Update to elastic/beats@6b6941eed496 (#8619)
  [main][Automation] Bump VM Image version to 1750467641 (#8617)
  flaky: skip TestUpgradeAgentWithTamperProtectedEndpoint_RPM (#8626)
  Add skip-changelog PR label for bump VM PRs (#8627)
  build(deps): bump github.com/elastic/go-seccomp-bpf from 1.5.0 to 1.6.0 (#8611)
  [ci] fix k8s integration tests flakiness (#8575)
  bump apmconfig Otel extension to v0.3.0 (#8600)
  Enhancement/6394 allow deb rpm to upgrade with endpoint tamper protection (#6907)
  ...
What does this PR do?
This PR fixes a regression introduced by #6907, which updated the RPM/DEB preinstall script to stop the ElasticEndpoint service during agent upgrades to work around tamper protection restrictions. That change stopped the service effectively, but it restarted the endpoint before restarting the agent. With that ordering, the endpoint usually has to retry its connection to elastic-agent, with no guarantee of when the connection will succeed.
To address this, the PR:
- Restarts the ElasticEndpoint service after the elastic-agent service has been restarted, so that elastic-endpoint can connect to elastic-agent.
Why is it important?
Improper ordering of service restarts during DEB/RPM upgrades with endpoint tamper protection enabled was causing the endpoint to start independently of the agent, resulting in "always-retrying" and sporadic degraded operation. This fix ensures the services are brought up in the correct order to maintain endpoint health.
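The restart ordering described above can be sketched as a small shell function. This is a hypothetical illustration, not the contents of postinstall.sh.tmpl: the function name `restart_in_order`, its argument, and the overridable `SYSTEMCTL` variable are assumptions made for the example. Only the service names (elastic-agent, ElasticEndpoint) and the ordering come from this PR. Skipping the endpoint start when the agent restart fails is also an assumption of the sketch.

```shell
#!/bin/sh
# Hypothetical sketch of the ordering only; NOT the real postinstall.sh.tmpl.
# SYSTEMCTL can be overridden (for example with a stub) to exercise the logic.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# restart_in_order ENDPOINT_WAS_STOPPED
#   ENDPOINT_WAS_STOPPED: "true" if an earlier step (assumed: preinstall)
#   stopped the ElasticEndpoint service and it now has to be started again.
restart_in_order() {
    endpoint_was_stopped="$1"

    # 1. Bring elastic-agent back first so it is ready to accept connections.
    "$SYSTEMCTL" restart elastic-agent || return 1

    # 2. Start the endpoint only after the agent restart has succeeded,
    #    so the endpoint does not come up while the agent is unavailable.
    if [ "$endpoint_was_stopped" = "true" ]; then
        "$SYSTEMCTL" start ElasticEndpoint || return 1
    fi
}
```

A package script built this way would call `restart_in_order true` after an upgrade in which the endpoint had been stopped, and `restart_in_order false` otherwise.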
Checklist
- ./changelog/fragments using the changelog tool
Disruptive User Impact
How to test this PR locally
Related issues