feat(om2): add native histograms to OpenMetrics2.0 #2634

Draft: wants to merge 4 commits into base: main

91 changes: 75 additions & 16 deletions content/docs/specs/om/open_metrics_spec_2_0.md
@@ -64,7 +64,15 @@ This section MUST be read together with the ABNF section. In case of disagreemen

#### Values

Metric values in OpenMetrics MUST be either floating points or integers. Note that ingestors of the format MAY only support float64. The non-real values NaN, +Inf and -Inf MUST be supported. NaN MUST NOT be considered a missing value, but it MAY be used to signal a division by zero.
Metric values in OpenMetrics MUST be either numbers or complex data types.

Numbers MUST be either floating points or integers. Note that ingestors of the format MAY only support float64. The non-real values NaN, +Inf and -Inf MUST be supported. NaN MUST NOT be considered a missing value, but it MAY be used to signal a division by zero.

Complex data types MUST contain all information necessary to recreate a Metric Type, with the exception of Created time and Exemplars.

Review comment (Member, Author): This assumes we'll have the created timestamp separate from the JSON-like data. prometheus/OpenMetrics#285


List of complex data types:
- Integer (counter) native histograms,
- Integer gauge native histograms.

##### Booleans

@@ -218,42 +226,93 @@ MetricFamilies of type Info MUST have an empty Unit string.

Histograms measure distributions of discrete events. Common examples are the latency of HTTP requests, function runtimes, or I/O request sizes.

A Histogram MetricPoint MUST contain at least one bucket, and SHOULD contain Sum, and Created values. Every bucket MUST have a threshold and a value.
A Histogram MetricPoint MUST contain either classic buckets or native (exponential) buckets or both.

Histogram MetricPoints MUST have one bucket with an +Inf threshold. Buckets MUST be cumulative. As an example for a metric representing request latency in seconds its values for buckets with thresholds 1, 2, 3, and +Inf MUST follow value_1 <= value_2 <= value_3 <= value_+Inf. If ten requests took 1 second each, the values of the 1, 2, 3, and +Inf buckets MUST equal 10.
A Histogram MetricPoint SHOULD contain Count, Sum, and Created values. Every bucket MUST have well-defined boundaries and a value. Boundaries of a bucket MUST NOT be NaN. Count and bucket values MUST be integers.

The +Inf bucket counts all requests. If present, the Sum value MUST equal the Sum of all the measured event values. Bucket thresholds within a MetricPoint MUST be unique.
Semantically, Count and bucket values are counters, so they MUST NOT be NaN or negative.

Semantically, Sum, and buckets values are counters so MUST NOT be NaN or negative.
Negative threshold buckets MAY be used, but then the Histogram MetricPoint MUST NOT contain a sum value as it would no longer be a counter semantically. Bucket thresholds MUST NOT equal NaN. Count and bucket values MUST be integers.
The Sum is only a counter semantically as long as there are no negative event values measured by the Histogram MetricPoint. The Sum MUST NOT be NaN. If present, the Sum value MUST equal the Sum of all the measured event values.

A Histogram MetricPoint SHOULD have a Timestamp value called Created. This can help ingestors discern between new metrics and long-running ones it did not see before.

A Histogram's Metric's LabelSet MUST NOT have a "le" label name.

Bucket values MAY have exemplars. Buckets are cumulative to allow monitoring systems to drop any non-+Inf bucket for performance/anti-denial-of-service reasons in a way that loses granularity but is still a valid Histogram.
##### Classic buckets

<!---
# EDITOR’S NOTE: The second sentence is a consideration, it can be moved if needed
-->
Every classic bucket MUST have a threshold. Classic bucket thresholds within a MetricPoint MUST be unique.

A classic bucket MUST cover every measured value less than or equal to its threshold, or to put it another way, classic buckets MUST be cumulative. Classic buckets are cumulative to allow monitoring systems to drop any non-+Inf bucket for performance/anti-denial-of-service reasons in a way that loses granularity but is still a valid Histogram.

As an example, for a metric representing request latency in seconds, the values of classic buckets with thresholds 1, 2, 3, and +Inf MUST follow value_1 <= value_2 <= value_3 <= value_+Inf. If ten requests took 1 second each, the values of the 1, 2, 3, and +Inf buckets MUST equal 10.

Histogram MetricPoints with classic buckets MUST have one classic bucket with a +Inf threshold. The +Inf bucket counts all requests.

The Count value MUST equal the value of the +Inf bucket.
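
The cumulative rule above can be illustrated with a minimal Python sketch. The function name `classic_bucket_values` and its dictionary return shape are hypothetical helpers for illustration only; they are not part of the specification or of any client library.

```python
import math
from bisect import bisect_left


def classic_bucket_values(thresholds, observations):
    """Return {threshold: cumulative count} for the given observations.

    Hypothetical helper: a +Inf bucket is always added, as the text requires.
    """
    bounds = sorted(set(thresholds) | {math.inf})
    counts = [0] * len(bounds)
    for value in observations:
        # Index of the first bucket whose threshold is >= value.
        first = bisect_left(bounds, value)
        # Cumulative: that bucket and every larger one count the value.
        for k in range(first, len(bounds)):
            counts[k] += 1
    return dict(zip(bounds, counts))


# Ten requests that took 1 second each.
values = classic_bucket_values([1, 2, 3], [1.0] * 10)
# Every bucket, including +Inf, holds 10.
count = values[math.inf]
```

In this sketch `count` is read directly from the +Inf bucket, which mirrors the requirement that the Count value equals the value of the +Inf bucket.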

Negative threshold classic buckets MAY be used.

Classic bucket values MAY have exemplars. The value of the exemplar MUST be within the classic bucket. Exemplars SHOULD be put into the classic bucket with the highest threshold. A classic bucket MUST NOT have more than one exemplar.

##### Native buckets

Histogram MetricPoints with native buckets MUST have a Schema value. The Schema is an 8-bit signed integer between -4 and 127. Schema values between -4 and 127 are also called Standard Schemas.

For any Standard Schema n, the Histogram MetricPoint MAY contain positive native buckets, negative native buckets, and a single zero native bucket. It is valid to have no native buckets at all.

The boundaries of a positive or negative native bucket with index i MUST be calculated as follows (using Python syntax):

The upper inclusive limit of a positive native bucket: `(2**2**-n)**i`

The lower exclusive limit of a positive native bucket: `(2**2**-n)**(i-1)`

The lower inclusive limit of a negative native bucket: `-((2**2**-n)**i)`

The upper exclusive limit of a negative native bucket: `-((2**2**-n)**(i-1))`

i is an integer number that MAY be negative.
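
As a rough illustration, the four formulas above can be wrapped in two small Python functions. The names `positive_bucket_bounds` and `negative_bucket_bounds` are hypothetical. The sketch ignores the MaxFloat64 and MinFloat64 exceptions to these formulas, so a very large index raises `OverflowError` here instead of being clamped.

```python
def positive_bucket_bounds(n, i):
    """Return (lower exclusive limit, upper inclusive limit) for schema n, index i."""
    base = 2.0 ** (2.0 ** -n)  # same as 2**2**-n, forced to float
    return base ** (i - 1), base ** i


def negative_bucket_bounds(n, i):
    """Return (lower inclusive limit, upper exclusive limit) for schema n, index i."""
    base = 2.0 ** (2.0 ** -n)
    return -(base ** i), -(base ** (i - 1))


# Schema 0 doubles the bucket width with every index step.
print(positive_bucket_bounds(0, 1))   # (1.0, 2.0)
print(positive_bucket_bounds(0, -1))  # (0.25, 0.5)
print(negative_bucket_bounds(0, 1))   # (-2.0, -1.0)
# Schema -1 uses base 4.
print(positive_bucket_bounds(-1, 2))  # (4.0, 16.0)
```

For positive schemas the base is an irrational number, so computed limits are floating point approximations and comparisons near a boundary may need care.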

There are exceptions to the rules above concerning the largest and smallest finite values representable as a float64 (called MaxFloat64 and MinFloat64 in the following) and the positive and negative infinity values (+Inf and -Inf):

The positive native bucket that contains MaxFloat64 (according to the boundary formulas above) has an upper inclusive limit of MaxFloat64 (rather than the limit calculated by the formulas above, which would overflow float64).

The next positive native bucket (index i+1 relative to the bucket from the previous item) has a lower exclusive limit of MaxFloat64 and an upper inclusive limit of +Inf. (It could be called a positive native overflow bucket.)

The negative native bucket that contains MinFloat64 (according to the boundary formulas above) has a lower inclusive limit of MinFloat64 (rather than the limit calculated by the formulas above, which would underflow float64).

The next negative native bucket (index i+1 relative to the bucket from the previous item) has an upper exclusive limit of MinFloat64 and a lower inclusive limit of -Inf. (It could be called a negative native overflow bucket.)

Native buckets beyond the +Inf and -Inf buckets described above MUST NOT be used.
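
The exceptions above can be sketched for positive native buckets as follows. This is an illustration under stated assumptions, not a reference implementation: the function name `positive_bucket_bounds_clamped` is hypothetical, and the sketch deliberately covers only schemas -4 to 8. For those schemas the bucket containing MaxFloat64 is the one whose formula-based upper limit is 2**1024, because MaxFloat64 lies just below 2**1024.

```python
import math
import sys

MAX_FLOAT64 = sys.float_info.max


def positive_bucket_bounds_clamped(n, i):
    """Return (lower exclusive, upper inclusive) with the MaxFloat64 rules applied.

    Assumption of this sketch: only schemas -4..8 are handled.
    """
    if not -4 <= n <= 8:
        raise ValueError("this sketch only covers schemas -4..8")
    # Index of the bucket whose formula-based upper limit would be 2**1024.
    i_max = 1024 * 2 ** n if n >= 0 else 1024 // 2 ** -n
    if i > i_max + 1:
        raise ValueError("native buckets beyond the +Inf bucket MUST NOT be used")
    if i == i_max + 1:
        # Positive native overflow bucket.
        return MAX_FLOAT64, math.inf
    base = 2.0 ** (2.0 ** -n)
    lower = base ** (i - 1)
    # The bucket containing MaxFloat64 is capped instead of overflowing.
    upper = MAX_FLOAT64 if i == i_max else base ** i
    return lower, upper


print(positive_bucket_bounds_clamped(0, 1024) == (2.0 ** 1023, MAX_FLOAT64))  # True
print(positive_bucket_bounds_clamped(0, 1025) == (MAX_FLOAT64, math.inf))     # True
```

The negative side follows by mirroring the signs and swapping which limit is inclusive, as in the MinFloat64 rules above.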

If the zero native bucket is present, the Histogram MetricPoint MUST have a Zero threshold. The Zero threshold is a non-negative float64 value (threshold >= 0.0). The boundaries of the Zero native bucket are `[-threshold, threshold]` inclusive.

If the zero native bucket is present, any measured value that falls into the zero native bucket MUST be counted towards the Zero bucket and MUST NOT be counted in any other native bucket. The Zero threshold SHOULD be equal to a lower limit of an arbitrary Standard native bucket.

The Count value MUST equal the sum of the values of the positive, negative, and zero native buckets.
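
A minimal Python sketch of how observations could be sorted into zero, positive, and negative native buckets, and how Count follows from them. The function name `observe_all` and the returned dictionary layout are hypothetical. The sketch assumes finite inputs and a zero native bucket that is always present, and it derives the index with `math.log2`, which can be off by one for values that sit on or very near a bucket boundary.

```python
import math
from collections import Counter


def observe_all(observations, schema, zero_threshold):
    """Count finite observations into zero, positive, and negative native buckets."""
    positive, negative = Counter(), Counter()
    zero = 0
    for v in observations:
        if abs(v) <= zero_threshold:
            # Values in [-threshold, threshold] go to the zero bucket only.
            zero += 1
            continue
        # Smallest index i with abs(v) <= (2**2**-schema)**i.
        index = math.ceil(math.log2(abs(v)) * 2.0 ** schema)
        (positive if v > 0 else negative)[index] += 1
    count = zero + sum(positive.values()) + sum(negative.values())
    return {
        "count": count,
        "zero": zero,
        "positive": dict(positive),
        "negative": dict(negative),
    }


result = observe_all([0.0, 0.001, 3.0, -3.0, 0.3], schema=0, zero_threshold=0.01)
# result["zero"] is 2, 3.0 lands in positive index 2, 0.3 in positive index -1,
# -3.0 in negative index 2, and result["count"] is 5.
```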

A Histogram MetricPoint with native buckets MAY contain exemplars.

Exemplars associated with a Histogram MetricPoint with native buckets MUST have a timestamp.

The values of exemplars in a Histogram MetricPoint with native buckets MUST fall into one of the native buckets.

Each bucket covers the values less and or equal to it, and the value of the exemplar MUST be within this range. Exemplars SHOULD be put into the bucket with the highest value. A bucket MUST NOT have more than one exemplar.
The values of exemplars in a Histogram MetricPoint with native buckets SHOULD be evenly distributed to avoid only representing the bucket with the highest value and therefore most common case.

#### GaugeHistogram

GaugeHistograms measure current distributions. Common examples are how long items have been waiting in a queue, or size of the requests in a queue.

A GaugeHistogram MetricPoint MUST have one bucket with an +Inf threshold, and SHOULD contain a Gsum value. Every bucket MUST have a threshold and a value.
A GaugeHistogram MetricPoint MUST contain either classic buckets or native (exponential) buckets or both.

The buckets for a GaugeHistogram follow all the same rules as for a Histogram.
A GaugeHistogram MetricPoint SHOULD contain Gcount and Gsum values. Every bucket MUST have well-defined boundaries and a value. Boundaries of a bucket MUST NOT be NaN. Gcount and bucket values MUST be integers.

The bucket and Gsum of a GaugeHistogram are conceptually gauges, however bucket values MUST NOT be negative or NaN. If negative threshold buckets are present, then sum MAY be negative. Gsum MUST NOT be NaN. Bucket values MUST be integers.
The bucket and Gsum of a GaugeHistogram are conceptually gauges, however bucket values MUST NOT be negative or NaN. If negative threshold buckets are present, then Gsum MAY be negative. Gsum MUST NOT be NaN. Bucket values MUST be integers.

A GaugeHistogram's Metric's LabelSet MUST NOT have a "le" label name.

Bucket values can have exemplars.
The buckets for a GaugeHistogram follow all the same rules as for a Histogram, with Gcount playing the same role as Count.

Each bucket covers the values less and or equal to it, and the value of the exemplar MUST be within this range. Exemplars SHOULD be put into the bucket with the highest value. A bucket MUST NOT have more than one exemplar.
The exemplars for a GaugeHistogram follow all the same rules as for a Histogram.

#### Summary
