Commit d9c27e2

docs: add Metrics feature documentation
1 parent 1fa7805 commit d9c27e2

1 file changed: +45 -0 lines changed

docs/documentation/features.md

@@ -757,6 +757,51 @@ with a `mycrs` plural form will result in 2 files:
> Quarkus users using the `quarkus-operator-sdk` extension do not need to add any extra dependency
> to get their CRD generated as this is handled by the extension itself.

## Metrics

JOSDK provides built-in support for reporting metrics about what your reconcilers are doing, in the form of the
`Metrics` interface. You can implement this interface to connect to your metrics provider of choice; JOSDK calls its
methods as it goes about reconciling resources. By default, a no-operation implementation is used, which provides a
sane default at no cost. A [Micrometer](https://micrometer.io)-based implementation is also provided.

You can use a different implementation by overriding the one provided by the default `ConfigurationService`, as
follows:

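```java
// Any Metrics implementation, for example the Micrometer-based one
Metrics metrics = …;
// Replace the default (no-operation) implementation with yours
ConfigurationServiceProvider.overrideCurrent(overrider -> overrider.withMetrics(metrics));
```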

### Micrometer implementation

By default, the Micrometer implementation records many metrics for each resource handled by the operator.
To stay efficient, it removes the meters associated with a resource when that resource is deleted. Since it
can be useful to keep these metrics around for a while after the resource is gone, you can configure a delay
before their removal. Because removal happens asynchronously, you can also configure how many threads to
devote to it. Both settings are controlled by the `MicrometerMetrics` constructor, so changing the defaults
is a matter of instantiating `MicrometerMetrics` with the desired values and telling `ConfigurationServiceProvider`
about it as shown above.
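
To make the delayed, asynchronous removal described above more concrete, here is a minimal, self-contained sketch of the general idea using only the Java standard library. It is not the JOSDK or Micrometer code: the class `DelayedMeterCleaner`, its methods, and the plain counters standing in for meters are all hypothetical, and the real `MicrometerMetrics` constructor may take different parameters.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only: plain counters stand in for per-resource meters.
class DelayedMeterCleaner implements AutoCloseable {
  private final Map<String, AtomicLong> metersByResource = new ConcurrentHashMap<>();
  private final ScheduledExecutorService cleaner;
  private final long delayMillis;

  // delayMillis: how long to keep a deleted resource's meters around.
  // cleanerThreads: how many threads handle removals (must be at least 1 here).
  DelayedMeterCleaner(long delayMillis, int cleanerThreads) {
    this.delayMillis = delayMillis;
    this.cleaner = Executors.newScheduledThreadPool(cleanerThreads);
  }

  // Counts one reconciliation for the resource, creating its counter on first use.
  void recordReconciliation(String resourceId) {
    metersByResource.computeIfAbsent(resourceId, id -> new AtomicLong()).incrementAndGet();
  }

  // On deletion, the counter is not dropped immediately: removal is scheduled after the delay.
  void onResourceDeleted(String resourceId) {
    cleaner.schedule(() -> metersByResource.remove(resourceId), delayMillis, TimeUnit.MILLISECONDS);
  }

  boolean isTracked(String resourceId) {
    return metersByResource.containsKey(resourceId);
  }

  // Stops accepting new removals and waits for the already scheduled ones to run
  // (delayed tasks still execute after shutdown() under the executor's default policy).
  @Override
  public void close() {
    cleaner.shutdown();
    try {
      cleaner.awaitTermination(delayMillis + 1000, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With a longer delay, the metrics of a deleted resource stay visible to scrapers for that much longer, at the cost of holding on to the memory in the meantime.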

The Micrometer implementation records the following metrics:

| Meter name | Type | Tags | Description |
|------------|------|------|-------------|
| `operator.sdk.reconciliations.executions.<reconciler name>` | gauge | group, version, kind | Number of executions of the named reconciler |
| `operator.sdk.reconciliations.queue.size.<reconciler name>` | gauge | group, version, kind | Number of resources queued for reconciliation by the named reconciler |
| `operator.sdk.<map name>.size` | gauge map size | | Gauge tracking the size of a specified map (currently unused, but could be used to monitor cache sizes) |
| `operator.sdk.events.received` | counter | group, version, kind, name, namespace, scope, event, action | Number of received Kubernetes events |
| `operator.sdk.events.delete` | counter | group, version, kind, name, namespace, scope | Number of received Kubernetes delete events |
| `operator.sdk.reconciliations.started` | counter | group, version, kind, name, namespace, scope, reconciliations.retries.last, reconciliations.retries.number | Number of started reconciliations per resource type |
| `operator.sdk.reconciliations.failed` | counter | group, version, kind, name, namespace, scope, exception | Number of failed reconciliations per resource type |
| `operator.sdk.reconciliations.success` | counter | group, version, kind, name, namespace, scope | Number of successful reconciliations per resource type |
| `operator.sdk.controllers.execution.reconcile.success` | counter | controller, type | Number of successful reconciliations per controller |
| `operator.sdk.controllers.execution.reconcile.failure` | counter | controller, exception | Number of failed reconciliations per controller |
| `operator.sdk.controllers.execution.cleanup.success` | counter | controller, type | Number of successful cleanups per controller |
| `operator.sdk.controllers.execution.cleanup.failure` | counter | controller, exception | Number of failed cleanups per controller |

All recorded metrics start with the `operator.sdk` prefix.
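
Because every meter shares the `operator.sdk` prefix, building or filtering meter names for dashboards and alerts is simple string work. The sketch below is a hypothetical helper, not part of JOSDK: the class `OperatorMeterNames` and its methods are invented for illustration, and only the name patterns come from the table.

```java
// Hypothetical helper, not part of JOSDK: composes meter names following the documented patterns.
final class OperatorMeterNames {
  static final String PREFIX = "operator.sdk.";

  private OperatorMeterNames() {}

  // operator.sdk.reconciliations.executions.<reconciler name>
  static String executions(String reconcilerName) {
    return PREFIX + "reconciliations.executions." + reconcilerName;
  }

  // operator.sdk.reconciliations.queue.size.<reconciler name>
  static String queueSize(String reconcilerName) {
    return PREFIX + "reconciliations.queue.size." + reconcilerName;
  }

  // operator.sdk.<map name>.size
  static String mapSize(String mapName) {
    return PREFIX + mapName + ".size";
  }

  // True for any meter name carrying the shared prefix.
  static boolean isOperatorMeter(String meterName) {
    return meterName.startsWith(PREFIX);
  }
}
```

For a reconciler named `my-reconciler` (a made-up name), `OperatorMeterNames.executions("my-reconciler")` yields `operator.sdk.reconciliations.executions.my-reconciler`.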

## Optimizing Caches

One of the ideas around the operator pattern is that all the relevant resources are cached, thus reconciliation is
