diff --git a/src/current/v24.3/create-logical-replication-stream.md b/src/current/v24.3/create-logical-replication-stream.md index 3e7fb81a8e6..8b47ede754e 100644 --- a/src/current/v24.3/create-logical-replication-stream.md +++ b/src/current/v24.3/create-logical-replication-stream.md @@ -50,14 +50,6 @@ Option | Description `cursor` | Emits any changes after the specified timestamp. LDR will not perform an initial backfill with the `cursor` option, it will stream any changes after the specified timestamp. The LDR job will encounter an error if you specify a `cursor` timestamp that is before the configured [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) window for that table. **Warning:** Apply the `cursor` option carefully to LDR streams. Using a timestamp in error could cause data loss. `discard` | ([**Unidirectional LDR only**]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#use-cases)) Ignore [TTL deletes]({% link {{ page.version.version }}/row-level-ttl.md %}) in an LDR stream with `discard = ttl-deletes`. **Note**: To ignore row-level TTL deletes in an LDR stream, it is necessary to set the [`ttl_disable_changefeed_replication`]({% link {{ page.version.version }}/row-level-ttl.md %}#ttl-storage-parameters) storage parameter on the source table. Refer to the [Ignore row-level TTL deletes](#ignore-row-level-ttl-deletes) example. `label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. Refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels). -`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes). - -## LDR modes - -_Modes_ determine how LDR replicates the data to the destination cluster. 
There are two modes: - -- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %} -- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %} ## Bidirectional LDR diff --git a/src/current/v24.3/logical-data-replication-overview.md b/src/current/v24.3/logical-data-replication-overview.md index 7495991c39f..7a5219f6c46 100644 --- a/src/current/v24.3/logical-data-replication-overview.md +++ b/src/current/v24.3/logical-data-replication-overview.md @@ -44,7 +44,6 @@ Isolate critical application workloads from non-critical application workloads. - **Table-level replication**: When you initiate LDR, it will replicate all of the source table's existing data to the destination table. From then on, LDR will replicate the source table's data to the destination table to achieve eventual consistency. - **Last write wins conflict resolution**: LDR uses [_last write wins (LWW)_ conflict resolution]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution), which will use the latest [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp to resolve a conflict in row insertion. - **Dead letter queue (DLQ)**: When LDR starts, the job will create a [DLQ table]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#dead-letter-queue-dlq) with each replicating table in order to track unresolved conflicts. You can interact and manage this table like any other SQL table. -- **Replication modes**: LDR offers different [_modes_]({% link {{ page.version.version }}/create-logical-replication-stream.md %}#ldr-modes) that apply data differently during replication, which allows you to consider optimizing for throughput or constraints during replication. 
- **Monitoring**: To [monitor]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}) LDR's initial progress, current status, and performance, you can view metrics available in the DB Console, Prometheus, and Metrics Export. ## Get started diff --git a/src/current/v24.3/manage-logical-data-replication.md b/src/current/v24.3/manage-logical-data-replication.md index eb43b0a1932..a2359febba5 100644 --- a/src/current/v24.3/manage-logical-data-replication.md +++ b/src/current/v24.3/manage-logical-data-replication.md @@ -22,7 +22,7 @@ In LDR, conflicts are detected at both the [KV]({% link {{ page.version.version }} ### KV level conflicts -LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp. Conflicts at the KV level are detected in both `immediate` and `validated` mode. +LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp. Conflicts at the KV level are detected when there is either: @@ -31,20 +31,14 @@ Conflicts at the KV level are detected when there is either: ### SQL level conflicts -In `validated` mode, when a conflict cannot apply due to violating [constraints]({% link {{ page.version.version }}/constraints.md %}), for example, a foreign key constraint or schema constraint, it will be retried for up to a minute and then put in the [DLQ](#dead-letter-queue-dlq) if it could not be resolved. +When a replicated write cannot be applied because it violates [constraints]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#schema-validation), for example, a schema constraint, LDR will send the row to the [DLQ](#dead-letter-queue-dlq).
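As a minimal sketch of the SQL-level case (the `city` table and its `CHECK` constraint are hypothetical, and this assumes the destination's schema has drifted from the source's):

{% include_cached copy-clipboard.html %}
~~~ sql
-- Hypothetical destination table; the CHECK constraint is one kind of
-- schema constraint that a replicated write could violate.
CREATE TABLE city (
    id INT PRIMARY KEY,
    population INT CHECK (population >= 0)
);
-- If LDR attempts to apply a replicated row such as (1, -10), the CHECK
-- constraint rejects it and LDR sends the row to the DLQ instead.
~~~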
### Dead letter queue (DLQ) -When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period, which could occur if: +When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the one-minute retry period, which could occur if: -- The destination table was dropped. -- The destination cluster is unavailable. -- Tables schemas do not match. - -In `validated` mode, rows are also sent to the DLQ when: - -- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies are not met where there are foreign key constraints in the schema. -- Unique indexes and other constraints are not met. +- [Loss of quorum]({% link {{ page.version.version }}/architecture/replication-layer.md %}#overview) of the underlying [ranges]({% link {{ page.version.version }}/architecture/reads-and-writes-overview.md %}#range) in the destination table. +- There is a unique index on the destination table (for more details, refer to [Unique secondary indexes]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#unique-secondary-indexes)). {{site.data.alerts.callout_info}} LDR will not pause when the writes are sent to the DLQ; you must manage the DLQ manually. diff --git a/src/current/v24.3/set-up-logical-data-replication.md b/src/current/v24.3/set-up-logical-data-replication.md index f8b9baf6ac1..b046249eee9 100644 --- a/src/current/v24.3/set-up-logical-data-replication.md +++ b/src/current/v24.3/set-up-logical-data-replication.md @@ -22,7 +22,7 @@ If you're setting up bidirectional LDR, both clusters will act as a source and a 1. Prepare the tables on each cluster with the prerequisites for starting LDR. 1. 
Set up an [external connection]({% link {{ page.version.version }}/create-external-connection.md %}) on cluster B (which will be the destination cluster initially) to hold the connection URI for cluster A. -1. Start LDR from cluster B with your required modes. +1. Start LDR from cluster B with your required options. 1. (Optional) Run Steps 1 to 3 again with cluster B as the source and A as the destination, which starts LDR streaming from cluster B to A. 1. Check the status of the LDR job in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}). @@ -36,10 +36,6 @@ You'll need: - All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to [Step 2. Connect from the destination to the source](#step-2-connect-from-the-destination-to-the-source). - LDR replicates at the table level, which means clusters can contain other tables that are not part of the LDR job. If both clusters are empty, create the tables that you need to replicate with **identical** schema definitions (excluding indexes) on both clusters. If one cluster already has an existing table that you'll replicate, ensure the other cluster's table definition matches. For more details on the supported schemas, refer to [Schema Validation](#schema-validation). -{% comment %}To add later, after further dev work{{site.data.alerts.callout_info}} -If you need to run LDR through a load balancer, use the load balancer IP address as the SQL advertise address on each cluster. It is important to note that using a load balancer with LDR can impair performance. -{{site.data.alerts.end}}{% endcomment %} - To create bidirectional LDR, you can complete the [optional step](#step-4-optional-set-up-bidirectional-ldr) to start the second LDR job that sends writes from the table on cluster B to the table on cluster A. 
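The external connection step in the setup outline above can be sketched as follows; this is a sketch only, and the connection name, user, host, and certificate path are placeholders:

{% include_cached copy-clipboard.html %}
~~~ sql
-- On cluster B (the initial destination), store cluster A's connection URI:
CREATE EXTERNAL CONNECTION {source_external_connection} AS 'postgresql://{user}@{cluster_a_host}:26257?sslmode=verify-full&sslrootcert={path_to_ca_cert}';
~~~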
### Schema validation @@ -52,10 +48,60 @@ You cannot use LDR on a table with a schema that contains the following: - [Partial indexes]({% link {{ page.version.version }}/partial-indexes.md %}) and [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}) - Indexes with a [virtual computed column]({% link {{ page.version.version }}/computed-columns.md %}) - Composite types in the [primary key]({% link {{ page.version.version }}/primary-key.md %}) +- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies For more details, refer to the LDR [Known limitations]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#known-limitations). -When you run LDR in [`immediate` mode](#modes), you cannot replicate a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). In [`validated` mode](#modes), foreign key constraints **must** match. All constraints are enforced at the time of SQL/application write. +LDR does not support replicating a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). + +#### Unique secondary indexes + +When the destination table includes unique [secondary indexes]({% link {{ page.version.version }}/schema-design-indexes.md %}), it can cause rows to enter the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other. + +If the application modifies the same row in both clusters, LDR resolves the conflict using [_last write wins_ (LWW)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution) conflict resolution. 
[`UNIQUE` constraints]({% link {{ page.version.version }}/unique.md %}) are validated locally in each cluster; therefore, if a replicated write violates a `UNIQUE` constraint on the destination cluster (possibly because a conflicting write was already applied to the row), the replicating row will be sent to the DLQ. + +For example, consider a table with a unique `name` column where the following operations occur in this order in a source and destination cluster running LDR: + +On the **source cluster**: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- writes to the source table +INSERT INTO city VALUES (1, 'nyc'); -- timestamp 1 +UPDATE city SET name = 'philly' WHERE id = 1; -- timestamp 2 +INSERT INTO city VALUES (100, 'nyc'); -- timestamp 3 +~~~ + +LDR replicates the write to the **destination cluster**: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- replicates to the destination table +INSERT INTO city VALUES (100, 'nyc'); -- timestamp 4 +~~~ + +_Timestamp 5:_ The [range]({% link {{ page.version.version }}/architecture/glossary.md %}#range) containing primary key `1` on the destination cluster is unavailable for a few minutes due to a [network partition]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#network-partition). + +_Timestamp 6:_ On the destination cluster, LDR attempts to replicate the row `(1, nyc)`, but it enters the retry queue for 1 minute due to the unavailable range. 
LDR adds `(1, nyc)` to the DLQ table after retrying and observing the `UNIQUE` constraint violation: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- writes to the DLQ +INSERT INTO city VALUES (1, 'nyc'); -- timestamp 6 +~~~ + +_Timestamp 7:_ LDR continues to replicate writes: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- replicates to the destination table +INSERT INTO city VALUES (1, 'philly'); -- timestamp 7 +~~~ + +To prevent expected DLQ entries and allow LDR to be eventually consistent, we recommend: + +- For **unidirectional** LDR, validate unique index constraints on the source cluster only. +- For **bidirectional** LDR, remove unique index constraints on both clusters. ## Step 1. Prepare the cluster @@ -117,11 +163,6 @@ You can use the `cockroach encode-uri` command to generate a connection string c In this step, you'll start the LDR job from the destination cluster. You can replicate one or multiple tables in a single LDR job. You cannot replicate system tables in LDR, which means that you must manually apply configurations and cluster settings, such as [row-level TTL]({% link {{ page.version.version }}/row-level-ttl.md %}) and user permissions on the destination cluster. -_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes: - -- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %} -- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %} - 1. From the **destination** cluster, start LDR. Use the fully qualified table name for the source and destination tables: {% include_cached copy-clipboard.html %} ~~~ sql @@ -129,8 +170,6 @@ In this step, you'll start the LDR job from the destination cluster. 
You can rep CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name}; ~~~ - You can change the default `mode` using the `WITH mode = validated` syntax. - If you would like to add multiple tables to the LDR job, ensure that the table name in the source table list and destination table list are in the same order: {% include_cached copy-clipboard.html %} diff --git a/src/current/v25.1/create-logical-replication-stream.md b/src/current/v25.1/create-logical-replication-stream.md index 12c60fc6b41..600c7ec2702 100644 --- a/src/current/v25.1/create-logical-replication-stream.md +++ b/src/current/v25.1/create-logical-replication-stream.md @@ -54,14 +54,6 @@ Option | Description `cursor` | Emits any changes after the specified timestamp. LDR will not perform an initial backfill with the `cursor` option, it will stream any changes after the specified timestamp. The LDR job will encounter an error if you specify a `cursor` timestamp that is before the configured [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) window for that table. **Warning:** Apply the `cursor` option carefully to LDR streams. Using a timestamp in error could cause data loss. `discard` | ([**Unidirectional LDR only**]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#use-cases)) Ignore [TTL deletes]({% link {{ page.version.version }}/row-level-ttl.md %}) in an LDR stream with `discard = ttl-deletes`. **Note**: To ignore row-level TTL deletes in an LDR stream, it is necessary to set the [`ttl_disable_changefeed_replication`]({% link {{ page.version.version }}/row-level-ttl.md %}#ttl-storage-parameters) storage parameter on the source table. Refer to the [Ignore row-level TTL deletes](#ignore-row-level-ttl-deletes) example. `label` | Tracks LDR metrics at the job level. 
Add a user-specified string with `label`. Refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels). -`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes). - -## LDR modes - -_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes: - -- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %} -- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %} ## Bidirectional LDR diff --git a/src/current/v25.1/create-logically-replicated.md b/src/current/v25.1/create-logically-replicated.md index 3e7f3a616aa..8548e441903 100644 --- a/src/current/v25.1/create-logically-replicated.md +++ b/src/current/v25.1/create-logically-replicated.md @@ -55,14 +55,6 @@ Option | Description -------+------------ `bidirectional on` / `unidirectional` | (**Required**) Specifies whether the LDR stream will be unidirectional or bidirectional. With `bidirectional on` specified, LDR will set up two LDR streams between the clusters. Refer to the examples for [unidirectional](#unidirectional) and [bidirectional](#bidirectional). `label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. For more details, refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels). -`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes). - -## LDR modes - -_Modes_ determine how LDR replicates the data to the destination cluster. 
There are two modes: - -- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %} -- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %} ## Examples @@ -74,7 +66,7 @@ From the destination cluster of the LDR stream, run: {% include_cached copy-clipboard.html %} ~~~ sql -CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional, mode=validated; +CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional; ~~~ Include the following: diff --git a/src/current/v25.1/logical-data-replication-overview.md b/src/current/v25.1/logical-data-replication-overview.md index 0a54444c35b..5ea3520671a 100644 --- a/src/current/v25.1/logical-data-replication-overview.md +++ b/src/current/v25.1/logical-data-replication-overview.md @@ -44,7 +44,6 @@ Isolate critical application workloads from non-critical application workloads. - **Table-level replication**: When you initiate LDR, it will replicate all of the source table's existing data to the destination table. From then on, LDR will replicate the source table's data to the destination table to achieve eventual consistency. - **Last write wins conflict resolution**: LDR uses [_last write wins (LWW)_ conflict resolution]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution), which will use the latest [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp to resolve a conflict in row insertion. - **Dead letter queue (DLQ)**: When LDR starts, the job will create a [DLQ table]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#dead-letter-queue-dlq) with each replicating table in order to track unresolved conflicts. 
You can interact and manage this table like any other SQL table. -- **Replication modes**: LDR offers different [_modes_]({% link {{ page.version.version }}/create-logical-replication-stream.md %}#ldr-modes) that apply data differently during replication, which allows you to consider optimizing for throughput or constraints during replication. - **Monitoring**: To [monitor]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}) LDR's initial progress, current status, and performance, you can view metrics available in the DB Console, Prometheus, and Metrics Export. ## Get started diff --git a/src/current/v25.1/manage-logical-data-replication.md b/src/current/v25.1/manage-logical-data-replication.md index 4256f9479eb..4afbd41c05d 100644 --- a/src/current/v25.1/manage-logical-data-replication.md +++ b/src/current/v25.1/manage-logical-data-replication.md @@ -22,7 +22,7 @@ In LDR, conflicts are detected at both the [KV]({% link {{ page.version.version ### KV level conflicts -LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp. Conflicts at the KV level are detected in both `immediate` and `validated` mode. +LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp. 
Conflicts at the KV level are detected when there is either: @@ -31,20 +31,11 @@ Conflicts at the KV level are detected when there is either: ### SQL level conflicts -In `validated` mode, when a conflict cannot apply due to violating [constraints]({% link {{ page.version.version }}/constraints.md %}), for example, a foreign key constraint or schema constraint, it will be retried for up to a minute and then put in the [DLQ](#dead-letter-queue-dlq) if it could not be resolved. +When a replicated write cannot be applied because it violates [constraints]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#schema-validation), for example, a schema constraint, LDR will send the row to the [DLQ](#dead-letter-queue-dlq). ### Dead letter queue (DLQ) -When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period, which could occur if: - -- The destination table was dropped. -- The destination cluster is unavailable. -- Tables schemas do not match. - -In `validated` mode, rows are also sent to the DLQ when: - -- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies are not met where there are foreign key constraints in the schema. -- Unique indexes and other constraints are not met. +When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the one-minute retry period, which could occur if there is a unique index on the destination table (for more details, refer to [Unique secondary indexes]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#unique-secondary-indexes)). {{site.data.alerts.callout_info}} LDR will not pause when the writes are sent to the DLQ; you must manage the DLQ manually. 
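Because the DLQ is managed manually, reviewing its entries is a plain SQL read; the DLQ table name below is a placeholder for the name that the LDR job generates:

{% include_cached copy-clipboard.html %}
~~~ sql
-- Inspect unresolved writes that LDR sent to the DLQ:
SELECT * FROM {dlq_table_name} LIMIT 10;
~~~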
diff --git a/src/current/v25.1/set-up-logical-data-replication.md b/src/current/v25.1/set-up-logical-data-replication.md index fefb45cc362..e956a493a2d 100644 --- a/src/current/v25.1/set-up-logical-data-replication.md +++ b/src/current/v25.1/set-up-logical-data-replication.md @@ -29,7 +29,7 @@ If you're setting up bidirectional LDR, both clusters will act as a source and a 1. Prepare the clusters with the required settings, users, and privileges according to the LDR setup. 1. Set up [external connection(s)]({% link {{ page.version.version }}/create-external-connection.md %}) on the destination to hold the connection URI for the source. -1. Start LDR from the destination cluster with your required modes and syntax. +1. Start LDR from the destination cluster with your required syntax and options. 1. Check the status of the LDR job in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}). ## Before you begin @@ -52,15 +52,62 @@ You cannot use LDR on a table with a schema that contains: - [Partial indexes]({% link {{ page.version.version }}/partial-indexes.md %}) and [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}) - Indexes with a [virtual computed column]({% link {{ page.version.version }}/computed-columns.md %}) - Composite types in the [primary key]({% link {{ page.version.version }}/primary-key.md %}) +- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies Additionally, for the `CREATE LOGICALLY REPLICATED` syntax, you cannot use LDR on a table with a schema that contains: - [User-defined types]({% link {{ page.version.version }}/enum.md %}) -- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies For more details, refer to the LDR [Known limitations]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#known-limitations). 
-When you run LDR in [`immediate` mode](#modes), you cannot replicate a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). In [`validated` mode](#modes), foreign key constraints **must** match. All constraints are enforced at the time of SQL/application write. +LDR does not support replicating a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). + +#### Unique secondary indexes + +When the destination table includes unique [secondary indexes]({% link {{ page.version.version }}/schema-design-indexes.md %}), it can cause rows to enter the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other. + +If the application modifies the same row in both clusters, LDR resolves the conflict using [_last write wins_ (LWW)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution) conflict resolution. [`UNIQUE` constraints]({% link {{ page.version.version }}/unique.md %}) are validated locally in each cluster; therefore, if a replicated write violates a `UNIQUE` constraint on the destination cluster (possibly because a conflicting write was already applied to the row), the replicating row will be sent to the DLQ. 
+ +For example, consider a table with a unique `name` column where the following operations occur in this order in a source and destination cluster running LDR: + +On the **source cluster**: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- writes to the source table +INSERT INTO city VALUES (1, 'nyc'); -- timestamp 1 +UPDATE city SET name = 'philly' WHERE id = 1; -- timestamp 2 +INSERT INTO city VALUES (100, 'nyc'); -- timestamp 3 +~~~ + +LDR replicates the write to the **destination cluster**: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- replicates to the destination table +INSERT INTO city VALUES (100, 'nyc'); -- timestamp 4 +~~~ + +_Timestamp 5:_ The [range]({% link {{ page.version.version }}/architecture/glossary.md %}#range) containing primary key `1` on the destination cluster is unavailable for a few minutes due to a [network partition]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#network-partition). + +_Timestamp 6:_ On the destination cluster, LDR attempts to replicate the row `(1, nyc)`, but it enters the retry queue for 1 minute due to the unavailable range. LDR adds `(1, nyc)` to the DLQ table after retrying and observing the `UNIQUE` constraint violation: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- writes to the DLQ +INSERT INTO city VALUES (1, 'nyc'); -- timestamp 6 +~~~ + +_Timestamp 7:_ LDR continues to replicate writes: + +{% include_cached copy-clipboard.html %} +~~~ sql +-- replicates to the destination table +INSERT INTO city VALUES (1, 'philly'); -- timestamp 7 +~~~ + +To prevent expected DLQ entries and allow LDR to be eventually consistent, we recommend: + +- For **unidirectional** LDR, validate unique index constraints on the source cluster only. +- For **bidirectional** LDR, remove unique index constraints on both clusters. ## Step 1. Prepare the cluster @@ -152,11 +199,6 @@ You can use the `cockroach encode-uri` command to generate a connection string c In this step, you'll start the LDR stream(s) from the destination cluster. 
You can replicate one or multiple tables in a single LDR job. You cannot replicate system tables in LDR, which means that you must manually apply configurations and cluster settings, such as [row-level TTL]({% link {{ page.version.version }}/row-level-ttl.md %}) and user permissions on the destination cluster. -_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes: - -- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %} -- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %} - ### Syntax LDR streams can be started using one of the following SQL statements, depending on your requirements: @@ -207,8 +249,6 @@ Ensure you've created the table on the destination cluster with a matching schem CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name}; ~~~ -You can change the default `mode` using the `WITH mode = validated` syntax. - If you would like to add multiple tables to the LDR job, ensure that the table name in the source table list and destination table list are in the same order: {% include_cached copy-clipboard.html %} diff --git a/src/current/v25.2/cockroachdb-feature-availability.md b/src/current/v25.2/cockroachdb-feature-availability.md index f8a0074468a..a05d7798783 100644 --- a/src/current/v25.2/cockroachdb-feature-availability.md +++ b/src/current/v25.2/cockroachdb-feature-availability.md @@ -83,10 +83,6 @@ The [`VECTOR`]({% link {{ page.version.version }}/vector.md %}) data type stores [Organizing CockroachDB {{ site.data.products.cloud }} clusters using folders]({% link cockroachcloud/folders.md %}) is in preview. Folders allow you to organize and manage access to your clusters according to your organization's requirements. 
For example, you can create top-level folders for each business unit in your organization, and within those folders, organize clusters by geographic location and then by level of maturity, such as production, staging, and testing. -### Logical data replication (LDR) for CockroachDB {{ site.data.products.core }} - -**Logical data replication (LDR)** continuously replicates tables between active CockroachDB clusters. Both source and destination cluster can receive application reads and writes, with LDR enabling bidirectional replication for eventual consistency in the replicating tables. The active-active setup between clusters can provide protection against cluster, datacenter, or region failure while still achieving single-region low latency reads and writes in the individual CockroachDB clusters. Setting up LDR between a source and destination CockroachDB {{ site.data.products.core }} cluster is in preview. - ### Read on standby cluster in physical cluster replication (PCR) for CockroachDB {{ site.data.products.core }} The [`READ VIRTUAL CLUSTER`]({% link {{ page.version.version }}/create-virtual-cluster.md %}#options) option allows you to set up a PCR stream that also creates a read-only virtual cluster on the standby cluster. You can create a PCR job as per the [Set Up Physical Cluster Replication]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}) guide and then add the option to the [`CREATE VIRTUAL CLUSTER`]({% link {{ page.version.version }}/create-virtual-cluster.md %}) statement. 
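As a sketch of the PCR setup described above (the virtual cluster name and the external connection are placeholders):

{% include_cached copy-clipboard.html %}
~~~ sql
-- From the standby cluster, start PCR and also create a read-only virtual cluster:
CREATE VIRTUAL CLUSTER main FROM REPLICATION OF main ON 'external://{primary_connection}' WITH READ VIRTUAL CLUSTER;
~~~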
diff --git a/src/current/v25.2/create-logical-replication-stream.md b/src/current/v25.2/create-logical-replication-stream.md
index 5cde1b6ee71..2c835020284 100644
--- a/src/current/v25.2/create-logical-replication-stream.md
+++ b/src/current/v25.2/create-logical-replication-stream.md
@@ -5,8 +5,6 @@ toc: true
 ---
 
 {{site.data.alerts.callout_info}}
-{% include feature-phases/preview.md %}
-
 Logical data replication is only supported in CockroachDB {{ site.data.products.core }} clusters.
 {{site.data.alerts.end}}
 
@@ -66,14 +64,6 @@ Option | Description
 `cursor` | Emits any changes after the specified timestamp. LDR will not perform an initial backfill with the `cursor` option, it will stream any changes after the specified timestamp. The LDR job will encounter an error if you specify a `cursor` timestamp that is before the configured [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) window for that table. **Warning:** Apply the `cursor` option carefully to LDR streams. Using a timestamp in error could cause data loss.
 `discard` | ([**Unidirectional LDR only**]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#use-cases)) Ignore [TTL deletes]({% link {{ page.version.version }}/row-level-ttl.md %}) in an LDR stream with `discard = ttl-deletes`. **Note**: To ignore row-level TTL deletes in an LDR stream, it is necessary to set the [`ttl_disable_changefeed_replication`]({% link {{ page.version.version }}/row-level-ttl.md %}#ttl-storage-parameters) storage parameter on the source table. Refer to the [Ignore row-level TTL deletes](#ignore-row-level-ttl-deletes) example.
 `label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. Refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels).
-`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes).
-
-## LDR modes
-
-_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:
-
-- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
-- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}
 
 ## Bidirectional LDR
diff --git a/src/current/v25.2/create-logically-replicated.md b/src/current/v25.2/create-logically-replicated.md
index b1aabba6d06..7c6206286a9 100644
--- a/src/current/v25.2/create-logically-replicated.md
+++ b/src/current/v25.2/create-logically-replicated.md
@@ -5,8 +5,6 @@ toc: true
 ---
 
 {{site.data.alerts.callout_info}}
-{% include feature-phases/preview.md %}
-
 Logical data replication is only supported in CockroachDB {{ site.data.products.core }} clusters.
 {{site.data.alerts.end}}
 
@@ -81,14 +79,6 @@ Option | Description
 -------+------------
 `bidirectional on` / `unidirectional` | (**Required**) Specifies whether the LDR stream will be unidirectional or bidirectional. With `bidirectional on` specified, LDR will set up two LDR streams between the clusters. Refer to the examples for [unidirectional](#unidirectional) and [bidirectional](#bidirectional).
 `label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. For more details, refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels).
-`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes).
-
-## LDR modes
-
-_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:
-
-- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
-- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}
 
 ## Examples
 
@@ -100,7 +90,7 @@ From the destination cluster of the LDR stream, run:
 
 {% include_cached copy-clipboard.html %}
 ~~~ sql
-CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional, mode=validated;
+CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional;
 ~~~
 
 Include the following:
diff --git a/src/current/v25.2/logical-data-replication-monitoring.md b/src/current/v25.2/logical-data-replication-monitoring.md
index d939cd223d6..240251f03e0 100644
--- a/src/current/v25.2/logical-data-replication-monitoring.md
+++ b/src/current/v25.2/logical-data-replication-monitoring.md
@@ -6,8 +6,6 @@ docs_area: manage
 ---
 
 {{site.data.alerts.callout_info}}
-{% include feature-phases/preview.md %}
-
 Logical data replication is only supported in CockroachDB {{ site.data.products.core }} clusters.
 {{site.data.alerts.end}}
diff --git a/src/current/v25.2/logical-data-replication-overview.md b/src/current/v25.2/logical-data-replication-overview.md
index 07fa654be20..84d134883ee 100644
--- a/src/current/v25.2/logical-data-replication-overview.md
+++ b/src/current/v25.2/logical-data-replication-overview.md
@@ -5,8 +5,6 @@ toc: true
 ---
 
 {{site.data.alerts.callout_info}}
-{% include feature-phases/preview.md %}
-
 Logical data replication is only supported in CockroachDB {{ site.data.products.core }} clusters.
 {{site.data.alerts.end}}
 
@@ -44,7 +42,6 @@ Isolate critical application workloads from non-critical application workloads.
 - **Table-level replication**: When you initiate LDR, it will replicate all of the source table's existing data to the destination table. From then on, LDR will replicate the source table's data to the destination table to achieve eventual consistency.
 - **Last write wins conflict resolution**: LDR uses [_last write wins (LWW)_ conflict resolution]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution), which will use the latest [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp to resolve a conflict in row insertion.
 - **Dead letter queue (DLQ)**: When LDR starts, the job will create a [DLQ table]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#dead-letter-queue-dlq) with each replicating table in order to track unresolved conflicts. You can interact and manage this table like any other SQL table.
-- **Replication modes**: LDR offers different [_modes_]({% link {{ page.version.version }}/create-logical-replication-stream.md %}#ldr-modes) that apply data differently during replication, which allows you to consider optimizing for throughput or constraints during replication.
 - **Monitoring**: To [monitor]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}) LDR's initial progress, current status, and performance, you can view metrics available in the DB Console, Prometheus, and Metrics Export.
 
 ## Get started
diff --git a/src/current/v25.2/manage-logical-data-replication.md b/src/current/v25.2/manage-logical-data-replication.md
index 4256f9479eb..194a75955de 100644
--- a/src/current/v25.2/manage-logical-data-replication.md
+++ b/src/current/v25.2/manage-logical-data-replication.md
@@ -5,8 +5,6 @@ toc: true
 ---
 
 {{site.data.alerts.callout_info}}
-{% include feature-phases/preview.md %}
-
 Logical data replication is only supported in CockroachDB {{ site.data.products.core }} clusters.
 {{site.data.alerts.end}}
@@ -22,7 +20,7 @@ In LDR, conflicts are detected at both the [KV]({% link {{ page.version.version
 
 ### KV level conflicts
 
-LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp. Conflicts at the KV level are detected in both `immediate` and `validated` mode.
+LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp.
 
 Conflicts at the KV level are detected when there is either:
 
@@ -31,20 +29,11 @@ Conflicts at the KV level are detected when there is either:
 
 ### SQL level conflicts
 
-In `validated` mode, when a conflict cannot apply due to violating [constraints]({% link {{ page.version.version }}/constraints.md %}), for example, a foreign key constraint or schema constraint, it will be retried for up to a minute and then put in the [DLQ](#dead-letter-queue-dlq) if it could not be resolved.
+When a write cannot be applied due to a [constraint]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#schema-validation) violation, for example a schema constraint, LDR will send the row to the [DLQ](#dead-letter-queue-dlq).
 
 ### Dead letter queue (DLQ)
 
-When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period, which could occur if:
-
-- The destination table was dropped.
-- The destination cluster is unavailable.
-- Tables schemas do not match.
-
-In `validated` mode, rows are also sent to the DLQ when:
-
-- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies are not met where there are foreign key constraints in the schema.
-- Unique indexes and other constraints are not met.
+When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the one-minute retry period, which could occur if there is a unique index on the destination table (for more details, refer to [Unique secondary indexes]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#unique-secondary-indexes)).
 
 {{site.data.alerts.callout_info}}
 LDR will not pause when writes are sent to the DLQ; you must manage the DLQ manually.
@@ -102,7 +91,12 @@ There are some supported schema changes, which you can perform during LDR **with
 
 Allowlist schema change | Exceptions
 -------------------+-----------
 [`CREATE INDEX`]({% link {{ page.version.version }}/create-index.md %}) |
+New in v25.2: [`ALTER INDEX ... RENAME`]({% link {{ page.version.version }}/alter-index.md %}#rename-to) | N/A
+New in v25.2: [`ALTER INDEX ... NOT VISIBLE`]({% link {{ page.version.version }}/alter-index.md %}#not-visible) | N/A
 [`DROP INDEX`]({% link {{ page.version.version }}/drop-index.md %}) | N/A
+New in v25.2: [`ALTER TABLE ... ALTER COLUMN ... SET DEFAULT`]({% link {{ page.version.version }}/alter-table.md %}#alter-column) | N/A
+New in v25.2: [`ALTER TABLE ... ALTER COLUMN ... DROP DEFAULT`]({% link {{ page.version.version }}/alter-table.md %}#alter-column) | N/A
+New in v25.2: [`ALTER TABLE ... ALTER COLUMN ... SET VISIBLE`]({% link {{ page.version.version }}/alter-table.md %}#set-the-visibility-of-a-column) | N/A
 [Zone configuration]({% link {{ page.version.version }}/show-zone-configurations.md %}) changes | N/A
 [`ALTER TABLE ... CONFIGURE ZONE`]({% link {{ page.version.version }}/alter-table.md %}#configure-zone) | N/A
 [`ALTER TABLE ... SET/RESET {TTL storage parameters}`]({% link {{ page.version.version }}/row-level-ttl.md %}#ttl-storage-parameters) |
diff --git a/src/current/v25.2/set-up-logical-data-replication.md b/src/current/v25.2/set-up-logical-data-replication.md
index 4df892b4217..c4b9dedb2a1 100644
--- a/src/current/v25.2/set-up-logical-data-replication.md
+++ b/src/current/v25.2/set-up-logical-data-replication.md
@@ -5,8 +5,6 @@ toc: true
 ---
 
 {{site.data.alerts.callout_info}}
-{% include feature-phases/preview.md %}
-
 Logical data replication is only supported in CockroachDB {{ site.data.products.core }} clusters.
 {{site.data.alerts.end}}
 
@@ -39,7 +37,7 @@ If you're setting up bidirectional LDR, both clusters will act as a source and a
 1. Prepare the clusters with the required settings, users, and privileges according to the LDR setup.
 1. Set up [external connection(s)]({% link {{ page.version.version }}/create-external-connection.md %}) on the destination to hold the connection URI for the source.
-1. Start LDR from the destination cluster with your required modes and syntax.
+1. Start LDR from the destination cluster with your required syntax and options.
 1. Check the status of the LDR job in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}).
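The retry-then-DLQ behavior described in the hunk above (a conflicting write is retried for roughly a minute, then recorded in the DLQ without pausing the job) can be sketched as a toy loop. This is an illustrative model under stated assumptions — function names and the tick count are hypothetical, not CockroachDB internals.

```python
# Hedged sketch of the retry-then-DLQ flow: a write that keeps failing
# (e.g. a UNIQUE violation on the destination) is retried for a bounded
# period, then appended to a dead letter queue. The job keeps going.

RETRY_TICKS = 60  # stands in for the roughly one-minute retry period

def replicate_write(write, try_apply, dlq):
    """Retry a failing write for a bounded period, then record it in the DLQ."""
    for _ in range(RETRY_TICKS):
        if try_apply(write):
            return "applied"
    dlq.append(write)  # the LDR job does not pause; the conflict is tracked
    return "dlq"

dlq = []
assert replicate_write((1, "nyc"), lambda w: False, dlq) == "dlq"      # persistent conflict
assert replicate_write((2, "sf"), lambda w: True, dlq) == "applied"    # clean write
assert dlq == [(1, "nyc")]
```

The key design point mirrored here is that the DLQ is an ordinary table you inspect and drain yourself, rather than a condition that halts replication.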
 ## Before you begin
 
@@ -62,15 +60,62 @@ You cannot use LDR on a table with a schema that contains:
 
 - [Partial indexes]({% link {{ page.version.version }}/partial-indexes.md %}) and [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %})
 - Indexes with a [virtual computed column]({% link {{ page.version.version }}/computed-columns.md %})
 - Composite types in the [primary key]({% link {{ page.version.version }}/primary-key.md %})
+- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies
 
 Additionally, for the `CREATE LOGICALLY REPLICATED` syntax, you cannot use LDR on a table with a schema that contains:
 
 - [User-defined types]({% link {{ page.version.version }}/enum.md %})
-- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies
 
 For more details, refer to the LDR [Known limitations]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#known-limitations).
 
-When you run LDR in [`immediate` mode](#modes), you cannot replicate a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). In [`validated` mode](#modes), foreign key constraints **must** match. All constraints are enforced at the time of SQL/application write.
+#### Unique secondary indexes
+
+When the destination table includes unique [secondary indexes]({% link {{ page.version.version }}/schema-design-indexes.md %}), replicated rows can enter the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other.
+
+If the application modifies the same row in both clusters, LDR resolves the conflict using [_last write wins_ (LWW)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution) conflict resolution. [`UNIQUE` constraints]({% link {{ page.version.version }}/unique.md %}) are validated locally in each cluster; therefore, if a replicated write violates a `UNIQUE` constraint on the destination cluster (possibly because a conflicting write was already applied to the row), the replicating row will be sent to the DLQ.
+
+For example, consider a table with a unique `name` column where the following operations occur in this order in a source and destination cluster running LDR:
+
+On the **source cluster**:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- writes to the source table
+INSERT INTO city VALUES (1, 'nyc'); -- timestamp 1
+UPDATE city SET name = 'philly' WHERE id = 1; -- timestamp 2
+INSERT INTO city VALUES (100, 'nyc'); -- timestamp 3
+~~~
+
+LDR replicates the write to the **destination cluster**:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- replicates to the destination table
+INSERT INTO city VALUES (100, 'nyc'); -- timestamp 4
+~~~
+
+_Timestamp 5:_ The [range]({% link {{ page.version.version }}/architecture/glossary.md %}#range) containing primary key `1` on the destination cluster is unavailable for a few minutes due to a [network partition]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#network-partition).
+
+_Timestamp 6:_ On the destination cluster, LDR attempts to replicate the row `(1, nyc)`, but it enters the retry queue for 1 minute due to the unavailable range. LDR adds `(1, nyc)` to the DLQ table after retrying and observing the `UNIQUE` constraint violation:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- writes to the DLQ
+INSERT INTO city VALUES (1, 'nyc'); -- timestamp 6
+~~~
+
+_Timestamp 7:_ LDR continues to replicate writes:
+
+{% include_cached copy-clipboard.html %}
+~~~ sql
+-- replicates to the destination table
+INSERT INTO city VALUES (1, 'philly'); -- timestamp 7
+~~~
+
+To prevent expected DLQ entries and allow LDR to be eventually consistent, we recommend:
+
+- For **unidirectional** LDR, validate unique index constraints on the source cluster only.
+- For **bidirectional** LDR, remove unique index constraints on both clusters.
 
 ## Step 1. Prepare the cluster
 
@@ -186,11 +231,6 @@ You can use the `cockroach encode-uri` command to generate a connection string c
 
 In this step, you'll start the LDR stream(s) from the destination cluster. You can replicate one or multiple tables in a single LDR job. You cannot replicate system tables in LDR, which means that you must manually apply configurations and cluster settings, such as [row-level TTL]({% link {{ page.version.version }}/row-level-ttl.md %}) and user permissions on the destination cluster.
 
-_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:
-
-- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
-- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}
-
 Refer to one of the following sections for instructions on creating an LDR stream. For details on which syntax to use, refer to the [Syntax](#syntax) section at the beginning of this tutorial:
 
 - [`CREATE LOGICALLY REPLICATED`](#create-logically-replicated)
@@ -234,8 +274,6 @@ Ensure you've created the table on the destination cluster with a matching schem
 CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name};
 ~~~
-You can change the default `mode` using the `WITH mode = validated` syntax.
-
 If you would like to add multiple tables to the LDR job, ensure that the table name in the source table list and destination table list are in the same order:
 
 {% include_cached copy-clipboard.html %}
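The unique-secondary-index scenario added above (uniqueness checked locally on each cluster, colliding replicated rows diverted to the DLQ, later writes still applied) can be condensed into a short sketch. This is a hedged toy model under stated assumptions — the function and table shapes are hypothetical illustrations, not the actual LDR implementation.

```python
# Toy model of why a unique secondary index on the destination can send
# replicated rows to the DLQ: UNIQUE(name) is checked locally, so a row
# that was valid on the source can collide with a row the destination
# already holds. Illustrative assumption, not CockroachDB code.

def apply_replicated_row(table, dlq, row):
    """Apply a replicated (id, name) row; a local UNIQUE(name) violation goes to the DLQ."""
    rid, name = row
    if any(n == name and i != rid for i, n in table.items()):
        dlq.append(row)        # conflicting name already present locally
    else:
        table[rid] = name

dest, dlq = {100: "nyc"}, []   # destination already applied (100, nyc)
apply_replicated_row(dest, dlq, (1, "nyc"))     # violates UNIQUE(name) -> DLQ
apply_replicated_row(dest, dlq, (1, "philly"))  # later replicated write applies
assert dlq == [(1, "nyc")]
assert dest == {100: "nyc", 1: "philly"}
```

This mirrors the recommendation above: because each side validates uniqueness independently, the constraint should live only where writes originate (unidirectional) or be dropped entirely (bidirectional).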