Update LDR docs for GA & remove validated #19626

Open
wants to merge 7 commits into
base: main
8 changes: 0 additions & 8 deletions src/current/v24.3/create-logical-replication-stream.md
@@ -50,14 +50,6 @@ Option | Description
`cursor` | Emits any changes after the specified timestamp. LDR will not perform an initial backfill with the `cursor` option; instead, it will stream any changes after the specified timestamp. The LDR job will encounter an error if you specify a `cursor` timestamp that is before the configured [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) window for that table. **Warning:** Apply the `cursor` option carefully to LDR streams. Using a timestamp in error could cause data loss.
<a id="discard-ttl-deletes-option"></a>`discard` | ([**Unidirectional LDR only**]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#use-cases)) Ignore [TTL deletes]({% link {{ page.version.version }}/row-level-ttl.md %}) in an LDR stream with `discard = ttl-deletes`. **Note**: To ignore row-level TTL deletes in an LDR stream, it is necessary to set the [`ttl_disable_changefeed_replication`]({% link {{ page.version.version }}/row-level-ttl.md %}#ttl-storage-parameters) storage parameter on the source table. Refer to the [Ignore row-level TTL deletes](#ignore-row-level-ttl-deletes) example.
`label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. Refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels).
`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes).

## LDR modes

_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:

- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}

## Bidirectional LDR

1 change: 0 additions & 1 deletion src/current/v24.3/logical-data-replication-overview.md
@@ -44,7 +44,6 @@ Isolate critical application workloads from non-critical application workloads.
- **Table-level replication**: When you initiate LDR, it will replicate all of the source table's existing data to the destination table. From then on, LDR will replicate the source table's data to the destination table to achieve eventual consistency.
- **Last write wins conflict resolution**: LDR uses [_last write wins (LWW)_ conflict resolution]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution), which will use the latest [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp to resolve a conflict in row insertion.
- **Dead letter queue (DLQ)**: When LDR starts, the job will create a [DLQ table]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#dead-letter-queue-dlq) with each replicating table in order to track unresolved conflicts. You can interact with and manage this table like any other SQL table.
- **Replication modes**: LDR offers different [_modes_]({% link {{ page.version.version }}/create-logical-replication-stream.md %}#ldr-modes) that apply data differently during replication, which allows you to consider optimizing for throughput or constraints during replication.
- **Monitoring**: To [monitor]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}) LDR's initial progress, current status, and performance, you can view metrics available in the DB Console, Prometheus, and Metrics Export.

## Get started
16 changes: 5 additions & 11 deletions src/current/v24.3/manage-logical-data-replication.md
@@ -22,7 +22,7 @@ In LDR, conflicts are detected at both the [KV]({% link {{ page.version.version

### KV level conflicts

LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp. Conflicts at the KV level are detected in both `immediate` and `validated` mode.
LDR uses _last write wins (LWW)_ conflict resolution based on the [MVCC timestamp]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) of the replicating write. LDR will resolve conflicts by inserting the row with the latest MVCC timestamp.
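
As a rough illustration of last write wins, the following Python sketch keeps, for each key, whichever write carries the later timestamp. It is a simplified model under stated assumptions, not LDR's implementation: the `Write` class, the `apply_lww` function, and the integer timestamps are hypothetical, and resolving ties in favor of the existing write is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Write:
    key: int
    value: Optional[str]  # None models a delete
    mvcc_ts: int          # simplified integer stand-in for an MVCC timestamp


def apply_lww(table: Dict[int, Write], incoming: Write) -> bool:
    """Apply `incoming` only if it is newer than the stored write for its key."""
    current = table.get(incoming.key)
    if current is not None and current.mvcc_ts >= incoming.mvcc_ts:
        # Assumption: the existing write wins when it is as new or newer.
        return False
    table[incoming.key] = incoming
    return True


destination: Dict[int, Write] = {}
apply_lww(destination, Write(1, "local", mvcc_ts=5))
# A replicated write with an older timestamp loses to the local write.
applied_old = apply_lww(destination, Write(1, "replicated-old", mvcc_ts=3))
# A replicated write with a newer timestamp replaces the local write.
applied_new = apply_lww(destination, Write(1, "replicated-new", mvcc_ts=8))
```

In this model, the order in which writes arrive does not change the final value for a key; only the timestamps do.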

Conflicts at the KV level are detected when there is either:

@@ -31,20 +31,14 @@ Conflicts at the KV level are detected when there is either:

### SQL level conflicts

In `validated` mode, when a conflict cannot apply due to violating [constraints]({% link {{ page.version.version }}/constraints.md %}), for example, a foreign key constraint or schema constraint, it will be retried for up to a minute and then put in the [DLQ](#dead-letter-queue-dlq) if it could not be resolved.
When a replicated write cannot be applied because it violates [constraints]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#schema-validation), for example, a schema constraint, LDR will send the row to the [DLQ](#dead-letter-queue-dlq).

### Dead letter queue (DLQ)

When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period, which could occur if:
When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period of a minute, which could occur if:

I think this list could be updated. I think it could read:

  • Presence of Unique index on destination table (see section below)
  • (only for 24.3) Loss of Quorum of the underlying ranges of the destination table

we also prevent DROP table on LDR tables on all versions given cockroachdb/cockroach#136172

Contributor Author

Is this what you were thinking for the list (24.3) below @msbutler? lmk if I mixed things up at all...

for 24.3:

  • Presence of Unique index on destination table (see section below)
  • Loss of Quorum of the underlying ranges of the destination table

for 25.1/2

  • Presence of Unique index on destination table (see section below)

ah, did not see the updated commit. you can remove "table schemas do not match"


- The destination table was dropped.
- The destination cluster is unavailable.
- Table schemas do not match.

In `validated` mode, rows are also sent to the DLQ when:

- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies are not met where there are foreign key constraints in the schema.
- Unique indexes and other constraints are not met.
- [Loss of quorum]({% link {{ page.version.version }}/architecture/replication-layer.md %}#overview) of the underlying [ranges]({% link {{ page.version.version }}/architecture/reads-and-writes-overview.md %}#range) in the destination table.

oof. sorry i should have caught this earlier. the first 2 bullets are essentially describing the same thing, so you could delete the first one.

- There is a unique index on the destination table (for more details, refer to [Unique secondary indexes]({% link {{ page.version.version }}/set-up-logical-data-replication.md %}#unique-secondary-indexes)).

{{site.data.alerts.callout_info}}
LDR will not pause when writes are sent to the DLQ; you must manage the DLQ manually.
61 changes: 48 additions & 13 deletions src/current/v24.3/set-up-logical-data-replication.md
@@ -22,7 +22,7 @@ If you're setting up bidirectional LDR, both clusters will act as a source and a

1. Prepare the tables on each cluster with the prerequisites for starting LDR.
1. Set up an [external connection]({% link {{ page.version.version }}/create-external-connection.md %}) on cluster B (which will be the destination cluster initially) to hold the connection URI for cluster A.
1. Start LDR from cluster B with your required modes.
1. Start LDR from cluster B with your required options.
1. (Optional) Run Steps 1 to 3 again with cluster B as the source and A as the destination, which starts LDR streaming from cluster B to A.
1. Check the status of the LDR job in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}).

@@ -36,10 +36,6 @@ You'll need:
- All nodes in each cluster will need access to the Certificate Authority for the other cluster. Refer to [Step 2. Connect from the destination to the source](#step-2-connect-from-the-destination-to-the-source).
- LDR replicates at the table level, which means clusters can contain other tables that are not part of the LDR job. If both clusters are empty, create the tables that you need to replicate with **identical** schema definitions (excluding indexes) on both clusters. If one cluster already has an existing table that you'll replicate, ensure the other cluster's table definition matches. For more details on the supported schemas, refer to [Schema Validation](#schema-validation).

{% comment %}To add later, after further dev work{{site.data.alerts.callout_info}}
If you need to run LDR through a load balancer, use the load balancer IP address as the SQL advertise address on each cluster. It is important to note that using a load balancer with LDR can impair performance.
{{site.data.alerts.end}}{% endcomment %}

To create bidirectional LDR, you can complete the [optional step](#step-4-optional-set-up-bidirectional-ldr) to start the second LDR job that sends writes from the table on cluster B to the table on cluster A.

### Schema validation
@@ -52,10 +48,56 @@ You cannot use LDR on a table with a schema that contains the following:
- [Partial indexes]({% link {{ page.version.version }}/partial-indexes.md %}) and [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %})
- Indexes with a [virtual computed column]({% link {{ page.version.version }}/computed-columns.md %})
- Composite types in the [primary key]({% link {{ page.version.version }}/primary-key.md %})
- [Foreign key]({% link {{ page.version.version }}/foreign-key.md %}) dependencies

For more details, refer to the LDR [Known limitations]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#known-limitations).

When you run LDR in [`immediate` mode](#modes), you cannot replicate a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). In [`validated` mode](#modes), foreign key constraints **must** match. All constraints are enforced at the time of SQL/application write.
LDR does not support replicating a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}).

#### Unique secondary indexes

LDR cannot guarantee that the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}) will remain empty if the destination table has a unique [secondary index]({% link {{ page.version.version }}/schema-design-indexes.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other.

If the application modifies the same row in both clusters, LDR resolves the conflict using _last write wins_ (LWW) conflict resolution. [`UNIQUE` constraints]({% link {{ page.version.version }}/unique.md %}) are validated locally in each cluster. Therefore, if a replicated write violates a `UNIQUE` constraint on the destination cluster (possibly because a conflicting write was already applied to the row), the replicating row will be sent to the DLQ.

For example, consider a table with a unique `name` column where the following operations occur in this order in a source and destination cluster running LDR:

On the **source cluster**:

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO city (id, name) VALUES (1, 'nyc'); -- timestamp 1
UPDATE city SET name = 'philly' WHERE id = 1; -- timestamp 2
INSERT INTO city (id, name) VALUES (100, 'nyc'); -- timestamp 3
~~~

LDR replicates the write to the **destination cluster**:

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO city (id, name) VALUES (100, 'nyc'); -- timestamp 4
~~~

_Timestamp 5:_ The range containing primary key `1` on the destination cluster is unavailable for a few minutes due to a network partition.

_Timestamp 6:_ On the destination cluster, LDR attempts to replicate the row `(1, nyc)`, but it enters the retry queue for 1 minute due to the unavailable range. After retrying for 1 minute and observing the `UNIQUE` constraint violation, LDR adds `(1, nyc)` to the DLQ:

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO city (id, name) VALUES (1, 'nyc'); -- timestamp 6
~~~

_Timestamp 7:_ LDR continues replicating writes:

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO city (id, name) VALUES (1, 'philly'); -- timestamp 7
~~~

To prevent expected DLQ entries and allow LDR to be eventually consistent, we recommend:

- For **unidirectional** LDR, validate unique index constraints on the source cluster only.
- For **bidirectional** LDR, remove unique index constraints on both clusters.
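
The timeline in the unique secondary index example can be modeled with a small Python simulation. This is a sketch under stated assumptions rather than a description of LDR internals: the `Destination` class and its `replicate` method are hypothetical, the retry period is collapsed into a single step, and the table is assumed to have a primary key on `id` and a unique index on `name`.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (id, name)


class Destination:
    """Toy destination table: primary key on id, unique index on name."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}
        self.dlq: List[Row] = []

    def replicate(self, row: Row) -> bool:
        row_id, name = row
        # A unique violation occurs when a different row already holds the name.
        conflict = any(n == name and i != row_id for i, n in self.rows.items())
        if conflict:
            # Stands in for "retry for a minute, then send to the DLQ".
            self.dlq.append(row)
            return False
        self.rows[row_id] = name  # upsert by primary key
        return True


dest = Destination()
dest.replicate((100, "nyc"))   # timestamp 4: applied
dest.replicate((1, "nyc"))     # timestamp 6: violates the unique index
dest.replicate((1, "philly"))  # timestamp 7: applied
```

In this model the destination rows end up matching the source, yet the DLQ still holds the `(1, nyc)` write, which is the kind of expected DLQ entry that the recommendations above aim to prevent.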

## Step 1. Prepare the cluster

@@ -117,20 +159,13 @@ You can use the `cockroach encode-uri` command to generate a connection string c
In this step, you'll start the LDR job from the destination cluster. You can replicate one or multiple tables in a single LDR job. You cannot replicate system tables in LDR, which means that you must manually apply configurations and cluster settings, such as [row-level TTL]({% link {{ page.version.version }}/row-level-ttl.md %}) and user permissions on the destination cluster.

<a id="modes"></a>_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:

- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}

1. From the **destination** cluster, start LDR. Use the fully qualified table name for the source and destination tables:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE LOGICAL REPLICATION STREAM FROM TABLE {database.public.source_table_name} ON 'external://{source_external_connection}' INTO TABLE {database.public.destination_table_name};
~~~

You can change the default `mode` using the `WITH mode = validated` syntax.

If you would like to add multiple tables to the LDR job, ensure that the table name in the source table list and destination table list are in the same order:

{% include_cached copy-clipboard.html %}
8 changes: 0 additions & 8 deletions src/current/v25.1/create-logical-replication-stream.md
@@ -54,14 +54,6 @@ Option | Description
`cursor` | Emits any changes after the specified timestamp. LDR will not perform an initial backfill with the `cursor` option; instead, it will stream any changes after the specified timestamp. The LDR job will encounter an error if you specify a `cursor` timestamp that is before the configured [garbage collection]({% link {{ page.version.version }}/architecture/storage-layer.md %}#garbage-collection) window for that table. **Warning:** Apply the `cursor` option carefully to LDR streams. Using a timestamp in error could cause data loss.
<a id="discard-ttl-deletes-option"></a>`discard` | ([**Unidirectional LDR only**]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#use-cases)) Ignore [TTL deletes]({% link {{ page.version.version }}/row-level-ttl.md %}) in an LDR stream with `discard = ttl-deletes`. **Note**: To ignore row-level TTL deletes in an LDR stream, it is necessary to set the [`ttl_disable_changefeed_replication`]({% link {{ page.version.version }}/row-level-ttl.md %}#ttl-storage-parameters) storage parameter on the source table. Refer to the [Ignore row-level TTL deletes](#ignore-row-level-ttl-deletes) example.
`label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. Refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels).
`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes).

## LDR modes

_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:

- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}

## Bidirectional LDR

10 changes: 1 addition & 9 deletions src/current/v25.1/create-logically-replicated.md
@@ -55,14 +55,6 @@ Option | Description
-------+------------
`bidirectional on` / `unidirectional` | (**Required**) Specifies whether the LDR stream will be unidirectional or bidirectional. With `bidirectional on` specified, LDR will set up two LDR streams between the clusters. Refer to the examples for [unidirectional](#unidirectional) and [bidirectional](#bidirectional).
`label` | Tracks LDR metrics at the job level. Add a user-specified string with `label`. For more details, refer to [Metrics labels]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}#metrics-labels).
`mode` | Determines how LDR replicates the data to the destination cluster. Possible values: `immediate`, `validated`. For more details, refer to [LDR modes](#ldr-modes).

## LDR modes

_Modes_ determine how LDR replicates the data to the destination cluster. There are two modes:

- `immediate` (default): {% include {{ page.version.version }}/ldr/immediate-description.md %}
- `validated`: {% include {{ page.version.version }}/ldr/validated-description.md %}

## Examples

@@ -74,7 +66,7 @@ From the destination cluster of the LDR stream, run:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional, mode=validated;
CREATE LOGICALLY REPLICATED TABLE {database.public.destination_table_name} FROM TABLE {database.public.source_table_name} ON 'external://source' WITH unidirectional;
~~~

Include the following:
1 change: 0 additions & 1 deletion src/current/v25.1/logical-data-replication-overview.md
@@ -44,7 +44,6 @@ Isolate critical application workloads from non-critical application workloads.
- **Table-level replication**: When you initiate LDR, it will replicate all of the source table's existing data to the destination table. From then on, LDR will replicate the source table's data to the destination table to achieve eventual consistency.
- **Last write wins conflict resolution**: LDR uses [_last write wins (LWW)_ conflict resolution]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#conflict-resolution), which will use the latest [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp to resolve a conflict in row insertion.
- **Dead letter queue (DLQ)**: When LDR starts, the job will create a [DLQ table]({% link {{ page.version.version }}/manage-logical-data-replication.md %}#dead-letter-queue-dlq) with each replicating table in order to track unresolved conflicts. You can interact with and manage this table like any other SQL table.
- **Replication modes**: LDR offers different [_modes_]({% link {{ page.version.version }}/create-logical-replication-stream.md %}#ldr-modes) that apply data differently during replication, which allows you to consider optimizing for throughput or constraints during replication.
- **Monitoring**: To [monitor]({% link {{ page.version.version }}/logical-data-replication-monitoring.md %}) LDR's initial progress, current status, and performance, you can view metrics available in the DB Console, Prometheus, and Metrics Export.

## Get started