Add OTLP HTTP MetricExporter max_export_batch_size #4576

Open: wants to merge 7 commits into main

Conversation

@tammy-baylis-swi (Contributor) commented May 9, 2025

Description

Adds support for a configurable max_export_batch_size on the HTTP OTLPMetricExporter, like the gRPC OTLPMetricExporter already has (completed through issue #2710 with PR #2809).

This is currently much longer than the gRPC version because:

  1. The HTTP protobuf representations of ResourceMetrics, ScopeMetrics, etc. are not replaceable the way the gRPC data classes are.
    • References are therefore stored, and new protobuf objects are created immediately before yield/export.
  2. protobuf does not define a DataPointT that encompasses all metric types.
    • So if-elif branches are added throughout for accessing data points and creating new metrics objects.

Fixes #4577

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

  • Added unit tests
  • Installed OTLPMetricExporter locally and did the following:
  1. Set up 3 counters, with add(1) calls to each.
  2. Initialized the global MeterProvider using OTLPMetricExporter with max_export_batch_size=2 and the endpoint set to a Collector (debug).
  3. Ran to get the export in 2 batches (2 counters + 1 counter) of 1 ResourceMetrics each.
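The expected split in step 3 can be sketched independently of the SDK: 3 counters with a maximum batch size of 2 yield batches of 2 and 1. This is an illustration only; `split_into_batches` is a hypothetical helper, and the real exporter walks nested ResourceMetrics/ScopeMetrics structures rather than a flat list.

```python
# Hypothetical helper illustrating why 3 counters with
# max_export_batch_size=2 export as two batches (2 + 1).
def split_into_batches(items, max_batch_size):
    for i in range(0, len(items), max_batch_size):
        yield items[i : i + max_batch_size]

counters = ["counter_a", "counter_b", "counter_c"]
batches = list(split_into_batches(counters, 2))
print(batches)  # [['counter_a', 'counter_b'], ['counter_c']]
```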

Does This PR Require a Contrib Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@tammy-baylis-swi tammy-baylis-swi marked this pull request as ready for review May 9, 2025 19:07
@tammy-baylis-swi tammy-baylis-swi requested a review from a team as a code owner May 9, 2025 19:07
@tammy-baylis-swi (Contributor, Author) commented:
I think the aiohttp-client test failure is a hiccup from the recent release, not from changes in this PR.

# used to write batched pb2 objects for export when finalized
split_resource_metrics = []

for resource_metrics in metrics_data.resource_metrics:
tammy-baylis-swi (Contributor, Author):
This borrows from the 4-deep for-each loop approach in the gRPC exporter.
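A minimal sketch of that approach, with plain dicts standing in for the pb2 messages (this is not the PR's actual code): walk resource_metrics, scope_metrics, metrics, and data_points in four nested loops, collecting references and flushing whenever the batch reaches max_export_batch_size.

```python
# Simplified stand-in for the 4-deep iteration: resource_metrics ->
# scope_metrics -> metrics -> data_points, flushing a batch whenever
# the data-point count reaches max_export_batch_size. Plain dicts
# model the pb2 messages; only references are collected here, and the
# real code builds new protobuf objects just before yield/export.
def split_metrics_data(metrics_data, max_export_batch_size):
    batches = []
    current = []  # (resource, scope, metric, data_point) references
    for resource_metrics in metrics_data["resource_metrics"]:
        for scope_metrics in resource_metrics["scope_metrics"]:
            for metric in scope_metrics["metrics"]:
                for data_point in metric["data_points"]:
                    current.append(
                        (resource_metrics, scope_metrics, metric, data_point)
                    )
                    if len(current) >= max_export_batch_size:
                        batches.append(current)
                        current = []
    if current:
        batches.append(current)
    return batches
```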


for resource_metrics in metrics_data.resource_metrics:
split_scope_metrics = []
split_resource_metrics.append(
@tammy-baylis-swi (Contributor, Author) commented May 9, 2025:
The split_* lists store references because the HTTP protobuf representations of ResourceMetrics, ScopeMetrics, etc. are not replaceable like the gRPC data classes. I'm not sure if there's a better way; doing CopyFrom and Clear is still quite involved.

# with different accessors for data points, etc
# We maintain these structures throughout batch calculation
current_data_points = []
if metric.HasField("sum"):
tammy-baylis-swi (Contributor, Author):
Another difference between HTTP protobuf and the gRPC classes is that protobuf does not define a DataPointT to encompass all metric types, so I've added some if-elif branches throughout. Not sure if there is a better way.

)
)

def _get_split_resource_metrics_pb2(
tammy-baylis-swi (Contributor, Author):
This helper writes the actual pb2 objects before each yield back to the export caller.


if split_resp.ok:
export_result = MetricExportResult.SUCCESS
elif self._retryable(split_resp):
A reviewer (Contributor) commented:
Wouldn't we need to remove the success/failure batches from the entire list when we attempt a retry? For example, batch size of 1 and there are 2 metric datas. The first one returns retryable and the second one returns failure. When the loop returns to the outer loop, we don't want to retry the one that failed again right?

tammy-baylis-swi (Contributor, Author) replied:

Good call; no, we don't want to retry a non-retryable failure!

I reorganized the unbatched-vs-batched paths and, for the batched path, made the batch loop the outer loop and the delay loop the inner loop, with breaks for clarity, in f1ef6c4.
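That control flow can be sketched roughly as follows. `post_batch` and `is_retryable` are hypothetical stand-ins for the exporter's HTTP call and retryable-status check, and the real code also sleeps with backoff between attempts; the point is that a non-retryable failure breaks out of the inner retry loop, so that batch is never re-sent.

```python
# Batch loop outermost, retry/delay loop inner; a non-retryable
# failure breaks out of the retry loop without re-sending that batch.
def export_batches(batches, post_batch, is_retryable, max_retries=5):
    results = []
    for batch in batches:  # outer: one pass per batch
        result = "failure"
        for _attempt in range(max_retries):  # inner: retries with delay
            resp = post_batch(batch)
            if resp.ok:
                result = "success"
                break
            if not is_retryable(resp):
                break  # non-retryable: do not retry this batch
            # (real code sleeps with exponential backoff here)
        results.append(result)
    return results
```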

@tammy-baylis-swi tammy-baylis-swi requested a review from lzchen May 12, 2025 22:06
Successfully merging this pull request may close these issues.

Add HTTP OTLPMetricExporter configurable max export batch size, like gRPC