redpanda-data · micheleRP · Dec 2, 2025 · Nov 25, 2025 · Nov 25, 2025 · Nov 25, 2025
@@ -200,7 +200,7 @@
 **** xref:manage:disaster-recovery/shadowing/overview.adoc[Overview]
 **** xref:manage:disaster-recovery/shadowing/setup.adoc[Configure Shadowing]
 **** xref:manage:disaster-recovery/shadowing/monitor.adoc[Monitor Shadowing]
-**** xref:manage:disaster-recovery/shadowing/failover.adoc[Configure Failover]
+**** xref:manage:disaster-recovery/shadowing/failover.adoc[Failover]
 **** xref:manage:disaster-recovery/shadowing/failover-runbook.adoc[Failover Runbook]
 *** xref:manage:disaster-recovery/whole-cluster-restore.adoc[Whole Cluster Restore]
 *** xref:manage:disaster-recovery/topic-recovery.adoc[Topic Recovery]

@@ -3,21 +3,28 @@
 :page-aliases: deploy:redpanda/manual/resilience/shadowing-guide.adoc, deploy:redpanda/manual/disaster-recovery/shadowing/failover-runbook.adoc
 :env-linux: true
 :page-categories: Management, High Availability, Disaster Recovery, Emergency Response
+// tag::single-source[]
 
+
+ifndef::env-cloud[]
 [NOTE]
 ====
 include::shared:partial$enterprise-license.adoc[]
 ====
+endif::[]
 
 This guide provides step-by-step procedures for emergency failover when your primary Redpanda cluster becomes unavailable. Follow these procedures only during active disasters when immediate failover is required.
-
 // TODO: All command output examples in this guide need verification by running actual commands in test environment
 
 [IMPORTANT]
 ====
 This is an emergency procedure. For planned failover testing or day-to-day shadow link management, see xref:./failover.adoc[]. Ensure you have completed the xref:manage:disaster-recovery/shadowing/overview.adoc#disaster-readiness-checklist[disaster readiness checklist] before an emergency occurs.
 ====
 
+ifdef::env-cloud[]
+NOTE: Shadowing is supported on BYOC and Dedicated clusters running Redpanda version 25.3 or later.
+endif::[]
+
 == Emergency failover procedure
 
 Follow these steps during an active disaster:
@@ -86,7 +93,7 @@ Verify that the following conditions exist before proceeding with failover:
 * Topics should be in `ACTIVE` state (not `FAULTED`).
 * Replication lag should be reasonable for your RPO requirements.
 
-**Understanding replication lag:**
+==== Understanding replication lag
 
 Use xref:reference:rpk/rpk-shadow/rpk-shadow-status.adoc[`rpk shadow status`] to check lag, which shows the message count difference between source and shadow partitions:
 
@@ -128,11 +135,11 @@ Name: <topic-name>, State: ACTIVE
  1          2345     2579     2568     11
 ----
 
-The partition information shows:
+The partition information shows the following:
 
-* **SRC_LSO**: Source partition Last Stable Offset
-* **SRC_HWM**: Source partition High Watermark  
-* **DST_HWM**: Shadow (destination) partition High Watermark
+* **SRC_LSO**: Source partition last stable offset
+* **SRC_HWM**: Source partition high watermark  
+* **DST_HWM**: Shadow (destination) partition high watermark
 * **Lag**: Message count difference between source and shadow partitions
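
In the sample row above, the lag of 11 is the source high watermark minus the shadow high watermark (2579 - 2568). The following is a minimal sketch of how that check could be scripted. It assumes each partition row has exactly five whitespace-separated integer columns in the order partition, SRC_LSO, SRC_HWM, DST_HWM, Lag, which is inferred from the sample and not confirmed against real `rpk shadow status` output. The function name `lag_over_threshold` is hypothetical.

```shell
# Hypothetical helper: print "<partition> <lag>" for every partition whose
# lag exceeds the given threshold (default 0).
# Assumed row layout: PARTITION SRC_LSO SRC_HWM DST_HWM LAG
lag_over_threshold() {
  threshold="${1:-0}"
  awk -v max="$threshold" '
    # Consider only rows made of exactly five integer fields.
    NF == 5 && $0 !~ /[^0-9 \t]/ {
      lag = $3 - $4              # SRC_HWM minus DST_HWM
      if (lag > max) print $1, lag
    }'
}

# Possible usage (requires a configured rpk profile):
#   rpk shadow status <shadow-link-name> | lag_over_threshold 1000
```

Choose the threshold from your RPO requirements. The value 1000 in the usage comment is only a placeholder.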
 
 [IMPORTANT]
@@ -290,4 +297,6 @@ After successful failover, focus on recovery planning and process improvement. B
 1. **Document the incident**: Record timeline, impact, and lessons learned
 2. **Update runbooks**: Improve procedures based on what you learned
 3. **Test regularly**: Schedule regular disaster recovery drills
-4. **Review monitoring**: Ensure monitoring caught the issue appropriately
+4. **Review monitoring**: Ensure monitoring caught the issue appropriately
+
+// end::single-source[]
@@ -1,17 +1,32 @@
-= Configure Failover
+= Failover
 :description: Learn how failover can transform shadow topics into fully writable resources during disasters.
 :page-categories: Management, High Availability, Disaster Recovery
 :page-aliases: deploy:redpanda/manual/disaster-recovery/shadowing/failover.adoc
+// tag::single-source[]
 
+ifndef::env-cloud[]
 [NOTE]
 ====
 include::shared:partial$enterprise-license.adoc[]
 ====
+endif::[]
 
 Failover is the process of modifying shadow topics or an entire shadow cluster from read-only replicas to fully writable resources, and ceasing replication from the source cluster. You can fail over individual topics for selective workload migration or fail over the entire cluster for comprehensive disaster recovery. This critical operation transforms your shadow resources into operational production assets, allowing you to redirect application traffic when the source cluster becomes unavailable.
 
+ifdef::env-cloud[]
+You can fail over a shadow link using the Redpanda Cloud UI, `rpk`, or the Data Plane API.
+endif::[]
+
+ifndef::env-cloud[]
+You can fail over a shadow link using Redpanda Console, `rpk`, or the Admin API.
+endif::[]
+
 include::shared:partial$emergency-shadowing-callout.adoc[]
 
+ifdef::env-cloud[]
+NOTE: Shadowing is supported on BYOC and Dedicated clusters running Redpanda version 25.3 or later.
+endif::[]
+
 == Failover behavior
 
 When you initiate failover, Redpanda performs the following operations:
@@ -22,13 +37,15 @@ When you initiate failover, Redpanda performs the following operations:
 
 Topic failover is irreversible. Once failed over, topics cannot return to shadow mode, and automatic fallback to the original source cluster is not supported.
 
+NOTE: To avoid a split-brain scenario after failover, ensure that all clients are reconfigured to point to the shadow cluster before resuming write activity. 
+
 == Failover commands
 
 You can perform failover at different levels of granularity to match your disaster recovery needs:
 
 === Individual topic failover
 
-To fail over a specific shadow topic while leaving other topics in the shadow link still replicating:
+To fail over a specific shadow topic while leaving other topics in the shadow link still replicating, run:
 
 [,bash]
 ----
@@ -39,7 +56,7 @@ Use this approach when you need to selectively failover specific workloads or wh
 
 === Complete shadow link failover (cluster failover)
 
-To fail over all shadow topics associated with the shadow link simultaneously:
+To fail over all shadow topics associated with the shadow link simultaneously, run:
 
 [,bash]
 ----
@@ -67,6 +84,7 @@ Force deleting a shadow link is irreversible and immediately fails over all topi
 The shadow link itself has a simple state model:
 
 * **`ACTIVE`**: Shadow link is operating normally, replicating data
+* **`PAUSED`**: Shadow link replication is temporarily halted by user action
 
 Shadow links do not have dedicated failover states. Instead, the link's operational status is determined by the collective state of its shadow topics.
 
@@ -78,10 +96,11 @@ Individual shadow topics progress through specific states during failover:
 * **`FAULTED`**: Shadow topic has encountered an error and is not replicating
 * **`FAILING_OVER`**: Failover initiated, replication stopping
 * **`FAILED_OVER`**: Failover completed successfully, topic fully writable
+* **`PAUSED`**: Replication temporarily halted by user action
 
 == Monitor failover progress
 
-Monitor failover progress using the status command:
+To monitor failover progress, run the status command:
 
 [,bash]
 ----
@@ -90,7 +109,7 @@ rpk shadow status <shadow-link-name>
 
 The output shows individual topic states and any issues encountered during the failover process. For detailed command options, see xref:reference:rpk/rpk-shadow/rpk-shadow-status.adoc[`rpk shadow status`].
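
As a rough sketch, the status output can be filtered to show only topics that have not yet reached `FAILED_OVER`. The example assumes topic lines look like `Name: <topic-name>, State: <STATE>`, as in the runbook sample output. The real format may differ between versions, and the function name `pending_failover_topics` is hypothetical.

```shell
# Hypothetical helper: print "<topic> <state>" for every topic that is not
# yet in the FAILED_OVER state.
# Assumed line layout: "Name: <topic-name>, State: <STATE>"
pending_failover_topics() {
  sed -n 's/^[[:space:]]*Name: \(.*\), State: \([A-Z_]*\).*$/\1 \2/p' |
    awk '$NF != "FAILED_OVER"'
}

# Possible usage (requires a configured rpk profile):
#   rpk shadow status <shadow-link-name> | pending_failover_topics
# Empty output means every listed topic reported FAILED_OVER.
```

Any topic that shows `FAULTED` in this list needs investigation, because it is no longer replicating.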
 
-**Task states during monitoring:**
+Task states during monitoring:
 
 * **`ACTIVE`**: Task is operating normally and replicating data
 * **`FAULTED`**: Task encountered an error and requires attention
@@ -125,6 +144,8 @@ After successful failover, your shadow cluster exhibits the following characteri
 
 == Failover considerations and limitations
 
+Before implementing failover procedures, understand these key considerations that affect your disaster recovery strategy and operational planning.
+
 **Data consistency:**
 
 * Some data loss may occur due to replication lag at the time of failover.
@@ -151,4 +172,6 @@ After completing failover:
 * Verify that applications can produce and consume messages normally
 * Consider deleting the shadow link if failover was successful and permanent
 
-For emergency situations, see xref:./failover-runbook.adoc[Failover Runbook].
+For emergency situations, see xref:./failover-runbook.adoc[Failover Runbook].
+
+// end::single-source[]
@@ -2,63 +2,55 @@
 :description: Monitor Shadowing health with status commands, metrics, and best practices for tracking replication performance.
 :page-categories: Management, Monitoring, Disaster Recovery
 :page-aliases: deploy:redpanda/manual/disaster-recovery/shadowing/monitor.adoc
+// tag::single-source[]
 
+ifndef::env-cloud[]
 [NOTE]
 ====
 include::shared:partial$enterprise-license.adoc[]
 ====
+endif::[]
 
 Monitor your shadow links to ensure proper replication performance and understand your disaster recovery readiness. Use `rpk` commands, metrics, and status information to track shadow link health and troubleshoot issues.
 
 include::shared:partial$emergency-shadowing-callout.adoc[]
 
 == Status commands
 
-List existing shadow links:
+To list existing shadow links, run:
 
 [,bash]
 ----
 rpk shadow list
 ----
 
-View shadow link configuration details:
+To view shadow link configuration details, run:
 
 [,bash]
 ----
 rpk shadow describe <my-disaster-recovery-link>
 ----
 
-For detailed command options, see xref:reference:rpk/rpk-shadow/rpk-shadow-list.adoc[`rpk shadow list`] and xref:reference:rpk/rpk-shadow/rpk-shadow-describe.adoc[`rpk shadow describe`].
+For detailed command options, see xref:reference:rpk/rpk-shadow/rpk-shadow-list.adoc[`rpk shadow list`] and xref:reference:rpk/rpk-shadow/rpk-shadow-describe.adoc[`rpk shadow describe`]. The `rpk shadow describe` command shows the complete configuration of the shadow link, including connection settings, filters, and synchronization options.
 
-This command shows the complete configuration of the shadow link, including connection settings, filters, and synchronization options.
-
-Check your shadow link status to ensure proper operation:
+To check your shadow link status and ensure proper operation, run:
 
 [,bash]
 ----
 rpk shadow status <shadow-link-name>
 ----
 
-**Status command options:**
+For troubleshooting specific issues, you can use command options to show individual status sections. See xref:reference:rpk/rpk-shadow/rpk-shadow-status.adoc[`rpk shadow status`] for available status options. The status output includes the following:
 
-[,bash]
-----
-rpk shadow status <shadow-link-name>
-----
-
-For troubleshooting specific issues, you can use command options to show individual status sections. See xref:reference:rpk/rpk-shadow/rpk-shadow-status.adoc[`rpk shadow status`] for available status options.
-
-The status output includes:
-
-* **Shadow link state**: Overall operational state (`ACTIVE`)
-* **Individual topic states**: Current state of each replicated topic (`ACTIVE`, `FAULTED`, `FAILING_OVER`, `FAILED_OVER`)
+* **Shadow link state**: Overall operational state (`ACTIVE`, `PAUSED`).
+* **Individual topic states**: Current state of each replicated topic (`ACTIVE`, `FAULTED`, `FAILING_OVER`, `FAILED_OVER`, `PAUSED`).
 * **Task status**: Health of replication tasks across brokers (`ACTIVE`, `FAULTED`, `NOT_RUNNING`, `LINK_UNAVAILABLE`). For details about shadow link tasks, see xref:manage:disaster-recovery/shadowing/setup.adoc#shadow-link-tasks[Shadow link tasks].
-* **Lag information**: Replication lag per partition showing source vs shadow high watermarks (HWM)
+* **Lag information**: Replication lag per partition showing source vs shadow high watermarks (HWM).
 
 [[shadow-link-metrics]]
 == Metrics
 
-Shadowing provides comprehensive metrics to track replication performance and health:
+Shadowing exposes metrics for tracking replication performance and health through the xref:reference:public-metrics-reference.adoc[`public_metrics`] endpoint:
 
 [cols="1,1,2"]
 |===
@@ -110,9 +102,9 @@ rpk shadow list | grep -v "ACTIVE" || echo "All shadow links healthy"
 rpk shadow status <shadow-link-name> | grep -E "LAG|Lag"
 ----
 
-=== Alert thresholds
+=== Alert conditions
 
-Configure monitoring alerts for:
+Configure monitoring alerts for the following conditions, which indicate problems with Shadowing:
 
 * **High replication lag**: When `redpanda_shadow_link_shadow_lag` exceeds your RPO requirements
 * **Connection errors**: When `redpanda_shadow_link_client_errors` increases rapidly
@@ -122,3 +114,5 @@ Configure monitoring alerts for:
 * **Link unavailability**: When tasks show `LINK_UNAVAILABLE` indicating source cluster connectivity issues
 +
 For more information about shadow link tasks and their states, see xref:manage:disaster-recovery/shadowing/setup.adoc#shadow-link-tasks[Shadow link tasks].
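
The high replication lag condition can be approximated with a small script. This is a sketch under stated assumptions, not a replacement for a real alerting system. It assumes the metrics are in Prometheus text format with the sample value as the last field on each line (no trailing timestamp) and reads them from standard input. The function name `max_shadow_lag` is hypothetical.

```shell
# Hypothetical helper: print the largest redpanda_shadow_link_shadow_lag
# sample found in Prometheus text-format metrics read from stdin.
# Prints 0 when no matching sample is present.
max_shadow_lag() {
  awk '
    # Match the metric name with or without a label set; skip "# HELP" and
    # "# TYPE" comment lines, whose first field is "#".
    $1 ~ /^redpanda_shadow_link_shadow_lag([{]|$)/ {
      if ($NF + 0 > max) max = $NF + 0
    }
    END { print max + 0 }'
}

# Possible usage. The host and port are assumptions for a self-managed
# broker; adjust them for your deployment:
#   curl -s http://localhost:9644/public_metrics | max_shadow_lag
```

Compare the printed value against the lag your RPO allows, and alert when it stays above that limit.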
+
+// end::single-source[]