mergify[bot]
Contributor

@mergify mergify bot commented Sep 25, 2025

What does this PR do?

This PR introduces support for emitting system resource metrics for the EDOT (Elastic Distribution of OpenTelemetry) collector when it runs as a subprocess of the Elastic Agent.

Key changes:

  • Added a new features.agent.otel.subprocess_execution feature flag to control whether the OTel collector runs in subprocess execution mode.
    • This flag defaults to false for now (preserving existing behavior), but is expected to default to true in the near future.
  • Extended the monitoring configuration generation logic to create a dedicated HTTP metrics stream for the EDOT subprocess, named elastic-agent/collector, which captures only the subprocess's system resource usage.
  • Updated OTelManager construction to honor the execution mode parsed from the feature flags, rather than always running in embedded mode.
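
As a rough illustration of the last point, the sketch below maps the boolean feature flag to an execution mode. It is not the code from this PR: the names ExecutionMode, FeatureFlags, and executionModeFor are hypothetical and chosen only for this example. The only behavior taken from the description is that false (the default) keeps embedded mode and true selects subprocess mode.

```go
package main

import "fmt"

// ExecutionMode is a hypothetical type used only in this sketch; the real
// type in the Elastic Agent codebase may be named and shaped differently.
type ExecutionMode string

const (
	ExecutionModeEmbedded   ExecutionMode = "embedded"
	ExecutionModeSubprocess ExecutionMode = "subprocess"
)

// FeatureFlags is a hypothetical struct that mirrors only the single flag
// discussed in this PR (features.agent.otel.subprocess_execution).
type FeatureFlags struct {
	SubprocessExecution bool
}

// executionModeFor returns subprocess mode when the flag is set and falls
// back to embedded mode otherwise, so the zero value keeps the old behavior.
func executionModeFor(flags FeatureFlags) ExecutionMode {
	if flags.SubprocessExecution {
		return ExecutionModeSubprocess
	}
	return ExecutionModeEmbedded
}

func main() {
	fmt.Println(executionModeFor(FeatureFlags{}))
	fmt.Println(executionModeFor(FeatureFlags{SubprocessExecution: true}))
}
```

Deriving the mode from the zero value of the flag means an absent or unset flag cannot accidentally switch an existing deployment to subprocess mode.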

Why is it important?

Running the collector as a subprocess improves resilience by isolating the control plane from the data plane. Emitting metrics for the EDOT process ensures operational visibility, allowing users to observe and troubleshoot its resource usage independently from the main Elastic Agent process.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

No disruptive impact expected.

  • When features.agent.otel.subprocess_execution is false (default), behavior is unchanged.
  • When the flag is enabled, users will see an additional monitoring stream for elastic-agent/collector.
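
The two cases above can be sketched as a small function that appends the collector stream only when the flag is on. This is a simplified assumption about the shape of the logic, not the actual monitoring configuration code: monitoringStreamIDs is a hypothetical helper, and the base stream ID "elastic-agent" is a placeholder. Only the elastic-agent/collector name comes from this PR.

```go
package main

import "fmt"

// collectorStreamID is the stream name given in the PR description.
const collectorStreamID = "elastic-agent/collector"

// monitoringStreamIDs is a hypothetical helper for this sketch. It copies the
// existing stream IDs and adds the collector stream only when subprocess
// execution is enabled, leaving the input slice untouched.
func monitoringStreamIDs(base []string, subprocessExecution bool) []string {
	ids := make([]string, 0, len(base)+1)
	ids = append(ids, base...)
	if subprocessExecution {
		ids = append(ids, collectorStreamID)
	}
	return ids
}

func main() {
	// "elastic-agent" is a placeholder base stream, assumed for illustration.
	base := []string{"elastic-agent"}
	fmt.Println(monitoringStreamIDs(base, false))
	fmt.Println(monitoringStreamIDs(base, true))
}
```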

How to test this PR locally

  1. Build Elastic Agent from this branch.
  2. Enable the subprocess execution mode in your elastic-agent.yml:
    agent:
      features:
        otel:
          subprocess_execution: true
  3. Install elastic-agent.
  4. Verify in Kibana’s Agent Metrics dashboard that a separate metrics stream for elastic-agent/collector appears. Note: you may need to install the elastic-agent integration first if it is not already installed.

Related issues

N/A


This is an automatic backport of pull request #10003 done by [Mergify](https://mergify.com).

* feat: emit system resource metrics for EDOT subprocess

* ci: extend unit-tests to cover for the edot subprocess resource metrics stream

* feat: add standalone monitoring server in supervised EDOT

* feat: move otel execution mode feature flag to a separate package

* feat: rework otel config package to avoid globals

(cherry picked from commit 9f15088)

# Conflicts:
#	internal/pkg/agent/application/monitoring/v1_monitor.go
@mergify mergify bot added the backport and conflicts labels Sep 25, 2025
@mergify mergify bot requested a review from a team as a code owner September 25, 2025 07:51
@mergify mergify bot requested review from kaanyalti and swiatekm and removed request for a team September 25, 2025 07:51
Contributor Author

mergify bot commented Sep 25, 2025

Cherry-pick of 9f15088 has failed:

On branch mergify/bp/8.19/pr-10003
Your branch is up to date with 'origin/8.19'.

You are currently cherry-picking commit 9f15088ce.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   internal/pkg/agent/application/application.go
	modified:   internal/pkg/agent/application/monitoring/v1_monitor_test.go
	modified:   internal/pkg/agent/cmd/inspect.go
	modified:   internal/pkg/agent/cmd/otel.go
	modified:   internal/pkg/agent/cmd/otel_flags.go
	modified:   internal/pkg/agent/cmd/otel_test.go
	new file:   internal/pkg/otel/config/config.go
	modified:   internal/pkg/otel/manager/execution_subprocess.go
	modified:   internal/pkg/otel/manager/testing/testing.go
	new file:   internal/pkg/otel/monitoring/monitoring.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   internal/pkg/agent/application/monitoring/v1_monitor.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@github-actions github-actions bot added the Team:Elastic-Agent-Control-Plane and skip-changelog labels Sep 25, 2025
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pkoutsovasilis pkoutsovasilis removed the conflicts label Sep 25, 2025

Quality Gate failed

Failed conditions
36.3% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube

@elasticmachine
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

cc @pkoutsovasilis

@pkoutsovasilis pkoutsovasilis merged commit 77955e7 into 8.19 Sep 25, 2025
19 of 20 checks passed
@pkoutsovasilis pkoutsovasilis deleted the mergify/bp/8.19/pr-10003 branch September 25, 2025 10:02

2 participants