add shadowing metrics

paulohtb6 · paulohtb6 · commit 1905b944ba15 · 2025-12-02T17:53:43.000-03:00
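
For reviewers, here is a minimal Python sketch of the two calculations behind the alert conditions in this change: lag as source LSO minus shadow HWM, and a per-second rate derived from two scrapes of a counter such as `redpanda_shadow_link_total_bytes_fetched`. The names `CounterSample`, `shadow_lag`, `counter_rate`, and `lag_exceeds_threshold` are hypothetical and are not part of Redpanda or rpk. Expressing the RPO limit as an offset count and treating a decrease as a counter reset are assumptions of this sketch, not documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """One scrape of a monotonically increasing counter (hypothetical helper type)."""
    timestamp_s: float
    value: float


def shadow_lag(source_lso: int, shadow_hwm: int) -> int:
    """Lag as defined in the metrics table: source LSO minus shadow HWM."""
    return source_lso - shadow_hwm


def lag_exceeds_threshold(lag: int, max_lag_offsets: int) -> bool:
    """True when lag is above a limit chosen from your RPO (assumed to be in offsets)."""
    return lag > max_lag_offsets


def counter_rate(prev: CounterSample, curr: CounterSample) -> float:
    """Per-second rate between two scrapes of a counter.

    Assumption: a decrease means the counter was reset (for example after a
    restart), so the current value is taken as the increase since the reset.
    """
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    increase = curr.value - prev.value
    if increase < 0:
        increase = curr.value
    return increase / elapsed
```

A throughput-drop alert could compare successive `counter_rate` results per `shadow_link_name` and `shard`, while `shadow_lag` would be evaluated per `topic` and `partition`.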
diff --git a/modules/manage/pages/disaster-recovery/shadowing/monitor.adoc b/modules/manage/pages/disaster-recovery/shadowing/monitor.adoc
@@ -56,36 +56,34 @@ Shadowing provides comprehensive metrics to track replication performance and he
 |===
 |Metric |Type |Description
 
-|`redpanda_shadow_link_shadow_lag`
+|xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_shadow_lag[`redpanda_shadow_link_shadow_lag`]
 |Gauge
 |The lag of the shadow partition against the source partition, calculated as source partition LSO (Last Stable Offset) minus shadow partition HWM (High Watermark). Monitor by `shadow_link_name`, `topic`, and `partition` to understand replication lag for each partition.
 
-|`redpanda_shadow_link_total_bytes_fetched`
-|Count
+|xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_bytes_fetched[`redpanda_shadow_link_total_bytes_fetched`]
+|Counter
 |The total number of bytes fetched by a sharded replicator (bytes received by the client). Labeled by `shadow_link_name` and `shard` to track data transfer volume from the source cluster.
 
-|`redpanda_shadow_link_total_bytes_written`
-|Count
+|xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_bytes_written[`redpanda_shadow_link_total_bytes_written`]
+|Counter
 |The total number of bytes written by a sharded replicator (bytes written to the write_at_offset_stm). Uses `shadow_link_name` and `shard` labels to monitor data written to the shadow cluster.
 
-|`redpanda_shadow_link_client_errors`
-|Count
-|The number of errors seen by the client. Track by `shadow_link_name` and `shard` to identify connection or protocol issues between clusters.
-
-|`redpanda_shadow_link_shadow_topic_state`
+|xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_shadow_topic_state[`redpanda_shadow_link_shadow_topic_state`]
 |Gauge
 |Number of shadow topics in the respective states. Labeled by `shadow_link_name` and `state` to monitor topic state distribution across your shadow links.
 
-|`redpanda_shadow_link_total_records_fetched`
-|Count
+|xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_records_fetched[`redpanda_shadow_link_total_records_fetched`]
+|Counter
 |The total number of records fetched by the sharded replicator (records received by the client). Monitor by `shadow_link_name` and `shard` to track message throughput from the source.
 
-|`redpanda_shadow_link_total_records_written`
-|Count
+|xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_records_written[`redpanda_shadow_link_total_records_written`]
+|Counter
 |The total number of records written by a sharded replicator (records written to the write_at_offset_stm). Uses `shadow_link_name` and `shard` labels to monitor message throughput to the shadow cluster.
 |===
 
-See also: xref:reference:public-metrics-reference.adoc[]
+For a detailed description of each metric, including its type and label definitions, see xref:reference:public-metrics-reference.adoc#shadow-link-metrics[Shadow Link metrics reference].
+
+See also: xref:reference:public-metrics-reference.adoc#shadow-link-metrics[Shadow Link metrics reference]
 
 == Monitoring best practices
 
@@ -106,8 +104,7 @@ rpk shadow status <shadow-link-name> | grep -E "LAG|Lag"
 
 Configure monitoring alerts for following conditions, which indicate problems with Shadowing:
 
-* **High replication lag**: When `redpanda_shadow_link_shadow_lag` exceeds your RPO requirements
-* **Connection errors**: When `redpanda_shadow_link_client_errors` increases rapidly
+* **High replication lag**: When xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_shadow_lag[`redpanda_shadow_link_shadow_lag`] exceeds your RPO requirements
 * **Topic state changes**: When topics move to `FAULTED` state
 * **Task failures**: When replication tasks enter `FAULTED` or `NOT_RUNNING` states
 * **Throughput drops**: When bytes/records fetched drops significantly
diff --git a/modules/reference/pages/public-metrics-reference.adoc b/modules/reference/pages/public-metrics-reference.adoc
@@ -2343,6 +2343,87 @@ Total number of bytes uploaded for the topic to object storage.
 - `redpanda_namespace`
 - `redpanda_topic`
 
+---
+
+== Shadow Link metrics
+
+=== redpanda_shadow_link_shadow_lag
+
+The lag of the shadow partition against the source partition, calculated as source partition LSO (Last Stable Offset) minus shadow partition HWM (High Watermark). Monitor this metric to understand replication lag for each partition and ensure your RPO requirements are being met.
+
+*Type*: gauge
+
+*Labels*:
+
+- `shadow_link_name` - Name of the shadow link
+- `topic` - Topic name
+- `partition` - Partition identifier
+
+---
+
+=== redpanda_shadow_link_shadow_topic_state
+
+Number of shadow topics in the respective states. Monitor this metric to track the health and status distribution of shadow topics across your shadow links.
+
+*Type*: gauge
+
+*Labels*:
+
+- `shadow_link_name` - Name of the shadow link
+- `state` - Topic state (active, failed, paused, failing_over, failed_over, promoting, promoted)
+
+---
+
+=== redpanda_shadow_link_total_bytes_fetched
+
+Total number of bytes fetched by a sharded replicator (bytes received by the client). Use this metric to track data transfer volume from the source cluster.
+
+*Type*: counter
+
+*Labels*:
+
+- `shadow_link_name` - Name of the shadow link
+- `shard` - Shard identifier
+
+---
+
+=== redpanda_shadow_link_total_bytes_written
+
+Total number of bytes written by a sharded replicator (bytes written to the write_at_offset_stm). Use this metric to monitor data written to the shadow cluster.
+
+*Type*: counter
+
+*Labels*:
+
+- `shadow_link_name` - Name of the shadow link
+- `shard` - Shard identifier
+
+---
+
+=== redpanda_shadow_link_total_records_fetched
+
+Total number of records fetched by a sharded replicator (records received by the client). Use this metric to track message throughput from the source cluster.
+
+*Type*: counter
+
+*Labels*:
+
+- `shadow_link_name` - Name of the shadow link
+- `shard` - Shard identifier
+
+---
+
+=== redpanda_shadow_link_total_records_written
+
+Total number of records written by a sharded replicator (records written to the write_at_offset_stm). Use this metric to monitor message throughput to the shadow cluster.
+
+*Type*: counter
+
+*Labels*:
+
+- `shadow_link_name` - Name of the shadow link
+- `shard` - Shard identifier
+
 == Related topics
 
 * xref:manage:monitoring.adoc[Learn how to monitor Redpanda]