Commit f2c7d9c

docs: add Metrics feature documentation
1 parent 2378e46 commit f2c7d9c

1 file changed

+54
-9
lines changed

docs/documentation/features.md

Lines changed: 54 additions & 9 deletions
@@ -611,15 +611,15 @@ Logging is enhanced with additional contextual information using
611611
[MDC](http://www.slf4j.org/manual.html#mdc). The following attributes are available in most
612612
parts of reconciliation logic and during the execution of the controller:
613613

614-
| MDC Key | Value added from primary resource |
615-
| :--- |:----------------------------------|
616-
| `resource.apiVersion` | `.apiVersion` |
617-
| `resource.kind` | `.kind` |
618-
| `resource.name` | `.metadata.name` |
619-
| `resource.namespace` | `.metadata.namespace` |
620-
| `resource.resourceVersion` | `.metadata.resourceVersion` |
621-
| `resource.generation` | `.metadata.generation` |
622-
| `resource.uid` | `.metadata.uid` |
614+
| MDC Key | Value added from primary resource |
615+
|:---------------------------|:----------------------------------|
616+
| `resource.apiVersion` | `.apiVersion` |
617+
| `resource.kind` | `.kind` |
618+
| `resource.name` | `.metadata.name` |
619+
| `resource.namespace` | `.metadata.namespace` |
620+
| `resource.resourceVersion` | `.metadata.resourceVersion` |
621+
| `resource.generation` | `.metadata.generation` |
622+
| `resource.uid` | `.metadata.uid` |
623623

624624
For more information about MDC see this [link](https://www.baeldung.com/mdc-in-log4j-2-logback).
625625

@@ -740,6 +740,51 @@ with a `mycrs` plural form will result in 2 files:
740740
> Quarkus users using the `quarkus-operator-sdk` extension do not need to add any extra dependency
741741
> to get their CRD generated as this is handled by the extension itself.
742742

743+
## Metrics
744+
745+
JOSDK provides built-in support for reporting metrics about what your reconcilers are doing, in the form of
746+
the `Metrics` interface. You can implement it to connect to your metrics provider of choice; JOSDK calls its
747+
methods as it goes about reconciling resources. By default, a no-operation implementation is used, which is a
748+
sane default with no cost. A [micrometer](https://micrometer.io)-based implementation is also provided.
749+
750+
You can use a different implementation by overriding the default one provided by the default `ConfigurationService`, as
751+
follows:
752+
753+
```java
754+
Metrics metrics = …;
755+
ConfigurationServiceProvider.overrideCurrent(overrider -> overrider.withMetrics(metrics));
756+
```
757+
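The pattern described above (a callback interface with a no-operation default and a pluggable implementation) can be sketched as follows. This is only an illustration: `SimpleMetrics` and `CountingMetrics` are hypothetical stand-ins, not JOSDK's `Metrics` interface, whose actual methods should be checked against the JOSDK API for your version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified callback interface; NOT JOSDK's real `Metrics` API.
interface SimpleMetrics {
    // Shared no-operation instance: every callback keeps its empty default body.
    SimpleMetrics NOOP = new SimpleMetrics() {};

    default void reconciliationStarted(String reconcilerName) {}

    default void reconciliationSucceeded(String reconcilerName) {}

    default void reconciliationFailed(String reconcilerName, Exception cause) {}
}

// Example implementation that counts callbacks in memory, keyed by "<event>.<reconciler name>".
class CountingMetrics implements SimpleMetrics {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    private void increment(String key) {
        counters.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    @Override
    public void reconciliationStarted(String reconcilerName) {
        increment("started." + reconcilerName);
    }

    @Override
    public void reconciliationSucceeded(String reconcilerName) {
        increment("success." + reconcilerName);
    }

    @Override
    public void reconciliationFailed(String reconcilerName, Exception cause) {
        increment("failed." + reconcilerName);
    }

    // Returns the current count for a key, or 0 if nothing was recorded for it.
    long count(String key) {
        LongAdder adder = counters.get(key);
        return adder == null ? 0 : adder.sum();
    }
}
```

With such a design, callers never need a null check: code that does not care about metrics receives `SimpleMetrics.NOOP`, while a real implementation is swapped in through configuration.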
758+
### Micrometer implementation
759+
760+
The micrometer implementation records a lot of metrics associated with each resource handled by the operator by default.
761+
In order to be efficient, the implementation removes meters associated with resources when they are deleted. Since it
762+
might be useful to keep these metrics around for a bit before they are deleted, it is possible to configure a delay
763+
before their removal. As this is done asynchronously, it is also possible to configure how many threads you want to
764+
devote to these operations. Both aspects are controlled by the `MicrometerMetrics` constructor so changing the defaults
765+
is a matter of instantiating `MicrometerMetrics` with the desired values and telling `ConfigurationServiceProvider` about
766+
it as shown above.
767+
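To make the idea of delayed, asynchronous meter removal more concrete, here is a minimal sketch using only the JDK. It is not the `MicrometerMetrics` implementation: `DelayedCleanupRegistry` and its two constructor parameters are hypothetical names chosen to mirror the two settings described above (removal delay and number of cleaning threads).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical registry illustrating delayed, asynchronous meter clean-up; not JOSDK code.
class DelayedCleanupRegistry implements AutoCloseable {
    private final Map<String, Long> meters = new ConcurrentHashMap<>();
    private final ScheduledExecutorService cleaner;
    private final long cleanUpDelayMillis;

    // cleaningThreads is assumed to be at least 1 in this sketch.
    DelayedCleanupRegistry(long cleanUpDelayMillis, int cleaningThreads) {
        this.cleanUpDelayMillis = cleanUpDelayMillis;
        this.cleaner = Executors.newScheduledThreadPool(cleaningThreads);
    }

    // Records the latest value of the meter associated with a resource.
    void record(String resourceId, long value) {
        meters.put(resourceId, value);
    }

    // Called when the resource is deleted: the meter stays readable until the delay elapses,
    // then a cleaner thread removes it.
    void resourceDeleted(String resourceId) {
        cleaner.schedule(() -> { meters.remove(resourceId); }, cleanUpDelayMillis, TimeUnit.MILLISECONDS);
    }

    boolean hasMeter(String resourceId) {
        return meters.containsKey(resourceId);
    }

    @Override
    public void close() {
        cleaner.shutdownNow();
    }
}
```

Keeping the removal off the calling thread means that deleting a resource never blocks on clean-up work, at the price of meters lingering for the configured delay.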
768+
The micrometer implementation records the following metrics:
769+
770+
| Meter name | Type | Tags | Description |
771+
|-----------------------------------------------------------|----------------|------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------|
772+
| operator.sdk.reconciliations.executions.<reconciler name> | gauge | group, version, kind | Number of executions of the named reconciler |
773+
| operator.sdk.reconciliations.queue.size.<reconciler name> | gauge | group, version, kind | How many resources are queued to get reconciled by named reconciler |
774+
| operator.sdk.<map name>.size | gauge map size | | Gauge tracking the size of a specified map (currently unused but could be used to monitor caches size) |
775+
| operator.sdk.events.received | counter | group, version, kind, name, namespace, scope, event, action | Number of received Kubernetes events |
776+
| operator.sdk.events.delete | counter | group, version, kind, name, namespace, scope | Number of received Kubernetes delete events |
777+
| operator.sdk.reconciliations.started | counter | group, version, kind, name, namespace, scope, reconciliations.retries.last, reconciliations.retries.number | Number of started reconciliations per resource type |
778+
| operator.sdk.reconciliations.failed | counter | group, version, kind, name, namespace, scope, exception | Number of failed reconciliations per resource type |
779+
| operator.sdk.reconciliations.success | counter | group, version, kind, name, namespace, scope | Number of successful reconciliations per resource type |
780+
| operator.sdk.controllers.execution.reconcile.success | counter | controller, type | Number of successful reconciliations per controller |
781+
| operator.sdk.controllers.execution.reconcile.failure | counter | controller, exception | Number of failed reconciliations per controller |
782+
| operator.sdk.controllers.execution.cleanup.success | counter | controller, type | Number of successful cleanups per controller |
783+
| operator.sdk.controllers.execution.cleanup.failure | counter | controller, exception | Number of failed cleanups per controller |
784+
785+
As you can see, all the recorded metrics start with the `operator.sdk` prefix.
786+
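Because every meter name shares this prefix and the per-reconciler gauges append the reconciler name, names can be built or filtered with simple string handling. The helper below is a hypothetical sketch (the `MeterNames` class is not part of JOSDK); only the name patterns themselves come from the table above.

```java
// Hypothetical helper for working with the meter names listed in the table; not part of JOSDK.
class MeterNames {
    static final String PREFIX = "operator.sdk.";

    // Name of the per-reconciler executions gauge.
    static String executions(String reconcilerName) {
        return PREFIX + "reconciliations.executions." + reconcilerName;
    }

    // Name of the per-reconciler queue size gauge.
    static String queueSize(String reconcilerName) {
        return PREFIX + "reconciliations.queue.size." + reconcilerName;
    }

    // True if the meter name uses the shared prefix, e.g. to filter a dashboard or an export.
    static boolean isOperatorMeter(String meterName) {
        return meterName.startsWith(PREFIX);
    }
}
```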
787+
743788
## Optimizing Caches
744789

745790
One of the ideas around the operator pattern is that all the relevant resources are cached, thus reconciliation is
