Incorrect StartTimeUnixNano for ExponentialHistogram with Delta Temporality after gRPC connection drop #3974

@alexchowle

Description

Describe your environment

OS: CentOS 7
Python version: 3.11.1
SDK version: 1.25.0
API version: 1.25.0

What happened?

While investigating #3971, I found that when the gRPC connection to a Collector is lost and then re-established, the next metric export carries a very old StartTimeUnixNano value.

Steps to Reproduce

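import time

from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import Histogram, MeterProvider
from opentelemetry.sdk.metrics.export import AggregationTemporality, PeriodicExportingMetricReader
from opentelemetry.sdk.metrics.view import ExponentialBucketHistogramAggregation, View

# grpc_collector_address and _create_resources() are not shown in this snippet.
# Export histograms with delta temporality.
metrics_temporality = {
    Histogram: AggregationTemporality.DELTA
}
otlp_metric_exporter = OTLPMetricExporter(endpoint=grpc_collector_address, insecure=True,
                                          preferred_temporality=metrics_temporality, timeout=30_000)
# Export once every 60 seconds.
metric_reader = PeriodicExportingMetricReader(otlp_metric_exporter,
                                              export_interval_millis=60_000)
# Aggregate every Histogram instrument as an exponential bucket histogram.
meter_provider = MeterProvider(metric_readers=[metric_reader], resource=_create_resources(),
                               views=[View(instrument_type=Histogram,
                                           aggregation=ExponentialBucketHistogramAggregation())])
metrics.set_meter_provider(meter_provider)

meter = metrics.get_meter(__name__)

h = meter.create_histogram(
    name='my_histogram', unit='s', description='just my histogram'
)

h.record(1.0)
time.sleep(360)  # the Collector drops the gRPC connection during this sleep
h.record(2.0)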

Have the OTel Collector drop the client's gRPC connection by setting the max_connection_age to 5m (as documented at https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configgrpc/README.md#server-configuration)

So the first recording happens; then, about 5 minutes into the 6-minute sleep, the Collector drops the client's gRPC connection; then, once the 6 minutes have elapsed, the client records a second observation.

Expected Result

StartTimeUnixNano has a value that is within 60 seconds of the 2nd recording. This is how the Golang SDK works.

Actual Result

StartTimeUnixNano has the TimeUnixNano of Observation #1, i.e. about 6 minutes in the past.
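The difference between the expected and the actual result can be pictured with a toy model. This is only a sketch for illustration: `DeltaWindow`, `simulate`, and the `reset_on_empty_collect` flag are hypothetical names, times are plain seconds, and nothing here is taken from the SDK's implementation or claims to explain what the gRPC drop does.

```python
class DeltaWindow:
    """Toy model of delta start-time bookkeeping (not the SDK's code)."""

    def __init__(self, start_s: int, reset_on_empty_collect: bool):
        self.start_s = start_s
        self.reset_on_empty_collect = reset_on_empty_collect
        self.count = 0

    def record(self) -> None:
        self.count += 1

    def collect(self, now_s: int):
        """Return (start, end, count) for a non-empty interval, else None."""
        point = (self.start_s, now_s, self.count) if self.count else None
        # Advance the window start after a non-empty collection, and after
        # an empty one only if the flag says so.
        if point is not None or self.reset_on_empty_collect:
            self.start_s = now_s
            self.count = 0
        return point


def simulate(reset_on_empty_collect: bool):
    """Record once, stay idle for six 60 s intervals, then record again."""
    window = DeltaWindow(start_s=0, reset_on_empty_collect=reset_on_empty_collect)
    points = []
    window.record()                      # first observation
    for now_s in range(60, 361, 60):     # collections at 60, 120, ..., 360
        if now_s == 360:
            window.record()              # second observation, after the sleep
        point = window.collect(now_s)
        if point is not None:
            points.append(point)
    return points


expected_like = simulate(reset_on_empty_collect=True)
reported_like = simulate(reset_on_empty_collect=False)
```

In this model, `expected_like` ends with a point whose start is 60 seconds before its end, while `reported_like` ends with a point whose start equals the end time of the first point, which is the pattern described in this report.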

Observation #1
-----------------
Metric #0                                                                                                
Descriptor:                                                                                              
     -> Name: http.server.request.duration                                                               
     -> Description: Duration of HTTP server requests                                                    
     -> Unit: s                                                                                          
     -> DataType: ExponentialHistogram                                                                   
     -> AggregationTemporality: Delta                                                                    
ExponentialHistogramDataPoints #0                                                                        
Data point attributes:
     -> http.request.method: Str(POST)
     -> http.route: Str(/sources/raw)
     -> http.response.status_code: Int(202)
     -> url.scheme: Str(http)
StartTimestamp: 2024-06-14 06:25:40.301137322 +0000 UTC
Timestamp: 2024-06-14 06:25:53.980223433 +0000 UTC
Count: 1
Sum: 0.021235
Min: 0.021235
Max: 0.021235
Bucket [-1.000001, -1.000000), Count: 0
Bucket (0.021235, 0.021235], Count: 1


Observation #2
-----------------
Metric #0
Descriptor:
     -> Name: http.server.request.duration
     -> Description: Duration of HTTP server requests
     -> Unit: s
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
Data point attributes:
     -> http.request.method: Str(POST)
     -> http.route: Str(/sources/raw)
     -> http.response.status_code: Int(202)
     -> url.scheme: Str(http)
StartTimestamp: 2024-06-14 06:25:53.980223433 +0000 UTC
Timestamp: 2024-06-14 06:31:53.98886037 +0000 UTC
Count: 1
Sum: 0.018851
Min: 0.018851
Max: 0.018851
Bucket [-1.000001, -1.000000), Count: 0
Bucket (0.018851, 0.018851], Count: 1
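To make the gap concrete, the window covered by each data point can be computed from the StartTimestamp and Timestamp values printed above. The following is a minimal sketch using only the standard library; `to_unix_nanos` and `window_seconds` are hypothetical helper names, and the parser assumes the timestamp layout shown in the Collector output (fractional seconds with trailing zeros trimmed, UTC).

```python
from datetime import datetime, timezone

EXPORT_INTERVAL_S = 60  # export_interval_millis=60_000 in the reproduction


def to_unix_nanos(stamp: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS.fffffffff +0000 UTC' to Unix nanoseconds."""
    date_part, time_part = stamp.split()[:2]
    whole, _, frac = time_part.partition(".")
    seconds = datetime.strptime(
        f"{date_part} {whole}", "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    # Trailing zeros are trimmed in the output, so pad back to 9 digits.
    nanos = int(frac.ljust(9, "0")[:9]) if frac else 0
    return int(seconds.timestamp()) * 1_000_000_000 + nanos


def window_seconds(start: str, end: str) -> float:
    """Length of the interval covered by one data point, in seconds."""
    return (to_unix_nanos(end) - to_unix_nanos(start)) / 1e9


first = window_seconds(
    "2024-06-14 06:25:40.301137322 +0000 UTC",
    "2024-06-14 06:25:53.980223433 +0000 UTC",
)
second = window_seconds(
    "2024-06-14 06:25:53.980223433 +0000 UTC",
    "2024-06-14 06:31:53.98886037 +0000 UTC",
)
print(f"observation 1 window: {first:.3f} s")
print(f"observation 2 window: {second:.3f} s (export interval: {EXPORT_INTERVAL_S} s)")
```

The second window comes out at roughly 360 seconds, six times the configured export interval, and its start equals the Timestamp of Observation #1.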

Additional context

No response

Would you like to implement a fix?

None

Labels: bug