
Commit 11e146a

Document bootstrap_strategy (#3437)
Resolves #3296
1 parent: 71e6fb0

6 files changed (+61, -54 lines)


doc/concepts/replication/repl_architecture.rst

Lines changed: 4 additions & 5 deletions
@@ -242,9 +242,8 @@ The maximal number of replicas in a mesh is 32.
 Orphan status
 -------------
 
-During ``box.cfg()``, an instance will try to join all masters listed
+During ``box.cfg()``, an instance tries to join all nodes listed
 in :ref:`box.cfg.replication <cfg_replication-replication>`.
-If the instance does not succeed with at least
-the number of masters specified in
-:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
-then it will switch to :ref:`orphan status <internals-replication-orphan_status>`.
+If the instance does not succeed in connecting to the required number of nodes
+(see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`),
+it switches to the :ref:`orphan status <internals-replication-orphan_status>`.

doc/dev_guide/internals/replication/orphan.rst

Lines changed: 18 additions & 26 deletions
@@ -6,12 +6,11 @@ Orphan status
 
 Starting with Tarantool version 1.9, there is a change to the
 procedure when an instance joins a replica set.
-During ``box.cfg()`` the instance will try to join all masters listed
+During ``box.cfg()`` the instance tries to join all nodes listed
 in :ref:`box.cfg.replication <cfg_replication-replication>`.
-If the instance does not succeed with at least
-the number of masters specified in
-:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
-then it will switch to **orphan status**.
+If the instance does not succeed in connecting to the required number of nodes
+(see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`),
+it switches to the **orphan** status.
 While an instance is in orphan status, it is read-only.
 
 To "join" a master, a replica instance must "connect" to the
@@ -34,12 +33,10 @@ is less than or equal to the number of seconds specified in
 If ``replication_sync_lag`` is unset (nil) or set to TIMEOUT_INFINITY, then
 the replica skips the "sync" state and switches to "follow" immediately.
 
-In order to leave orphan mode you need to sync with a sufficient number
-(:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`) of
-instances. To do so, you may either:
+In order to leave orphan mode, you need to sync with a sufficient number of
+instances (see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`).
+To do so, you may either:
 
-* Set :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
-  to a lower value.
 * Reset ``box.cfg.replication`` to exclude instances that cannot be reached
   or synced with.
 * Set ``box.cfg.replication`` to ``""`` (empty string).
@@ -53,16 +50,15 @@ The following situations are possible.
 Here ``box.cfg{}`` is being called for the first time.
 A replica is joining but no replica set exists yet.
 
-1. Set status to 'orphan'.
-2. Try to connect to all nodes from ``box.cfg.replication``,
-   or to the number of nodes required by
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`.
-   Retrying up to 3 times in 30 seconds is possible because this is bootstrap,
+1. Set the status to 'orphan'.
+
+2. Try to connect to all nodes from ``box.cfg.replication``.
+   The replica tries to connect for the
   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`
-   is overridden.
+   number of seconds and retries each
+   :ref:`replication_timeout <cfg_replication-replication_timeout>` seconds if needed.
 
-3. Abort and throw an error if not connected to all nodes in ``box.cfg.replication`` or
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`.
+3. Abort and throw an error if a replica is not connected to the majority of nodes in ``box.cfg.replication``.
 
 4. This instance might be elected as the replica set 'leader'.
    Criteria for electing a leader include vclock value (largest is best),
@@ -93,13 +89,11 @@ It is being called again in order to perform recovery.
 1. Perform :ref:`recovery <internals-recovery_process>` from the last local
    snapshot and the WAL files.
 
-2. Connect to at least
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
-   nodes. If failed -- set status to 'orphan'.
-   (Attempts to sync will continue in the background and when/if they succeed
-   then 'orphan' will be changed to 'connected'.)
+2. Try to establish connections to all other nodes for the
+   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>` number of seconds.
+   Once ``replication_connect_timeout`` expires or all the connections are established, proceed to the "sync" state with all the established connections.
 
-3. If connected - sync with all connected nodes, until the difference is not more than
+3. If connected, sync with all connected nodes, until the difference is not more than
   :ref:`replication_sync_lag <cfg_replication-replication_sync_lag>` seconds.
 
 .. _replication-configuration_update:
@@ -111,8 +105,6 @@ It is being called again because some replication parameter
 or something in the replica set has changed.
 
 1. Try to connect to all nodes from ``box.cfg.replication``,
-   or to the number of nodes required by
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
   within the time period specified in
   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`.
 

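The ways to leave orphan mode listed in the hunk above can be sketched in Lua. This is a minimal sketch, assuming an instance stuck in the orphan state because one of its configured peers is unreachable; the instance URIs are hypothetical:

```lua
-- The instance is orphaned because 'replica2:3302' cannot be reached
-- (URIs are hypothetical).

-- Option 1: reset box.cfg.replication to exclude the unreachable peer.
box.cfg{replication = {'replica1:3301'}}

-- Option 2: clear the replication source list entirely.
box.cfg{replication = ""}

-- Verify: box.info.status should no longer report 'orphan'.
box.info.status
```

Both calls are dynamic, so they can be issued on a running instance; which peers to drop depends on which connections actually failed.
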
doc/how-to/replication/repl_leader_elect.rst

Lines changed: 6 additions & 7 deletions
@@ -44,13 +44,12 @@ Configuration
 * ``election_fencing_enabled`` -- switches the :ref:`leader fencing mode <repl_leader_elect_fencing>` on and off.
   For the details, refer to the :ref:`option description <cfg_replication-election_fencing_enabled>` in the configuration reference.
 
-Besides, it is important to know that
-being a leader is not the only requirement for a node to be writable.
-A leader node should have its :ref:`read_only <cfg_basic-read_only>` option set
-to ``false`` (``box.cfg{read_only = false}``),
-and its :ref:`connectivity quorum <cfg_replication-replication_connect_quorum>`
-should be satisfied (``box.cfg{replication_connect_quorum = <count>}``)
-or disabled (``box.cfg{replication_connect_quorum = 0}``).
+It is important to know that being a leader is not the only requirement for a node to be writable.
+The leader should also satisfy the following requirements:
+
+* The :ref:`read_only <cfg_basic-read_only>` option is set to ``false``.
+
+* The leader shouldn't be in the orphan state.
 
 Nothing prevents from setting the ``read_only`` option to ``true``,
 but the leader just won't be writable then. The option doesn't affect the
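The two writability requirements from the hunk above can be checked from the instance console. A minimal sketch, with illustrative return values:

```lua
-- A leader is writable only when read_only is false and the node
-- is not in the orphan state.
box.cfg{read_only = false}

box.info.ro       -- false on a writable leader
box.info.status   -- must not be 'orphan'
```
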

doc/how-to/vshard_quick.rst

Lines changed: 0 additions & 1 deletion
@@ -131,7 +131,6 @@ The configuration of a simple sharded cluster can look like this:
 
     local cfg = {
         memtx_memory = 100 * 1024 * 1024,
-        replication_connect_quorum = 0,
         bucket_count = 10000,
         rebalancer_disbalance_threshold = 10,
         rebalancer_max_receiving = 100,

doc/reference/configuration/cfg_replication.rst

Lines changed: 33 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
* :ref:`replication <cfg_replication-replication>`
22
* :ref:`replication_anon <cfg_replication-replication_anon>`
3+
* :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`
34
* :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`
45
* :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
56
* :ref:`replication_skip_conflict <cfg_replication-replication_skip_conflict>`
@@ -204,6 +205,29 @@
204205
| Environment variable: TT_REPLICATION_ANON
205206
| Dynamic: **yes**
206207
208+
209+
.. _cfg_replication-bootstrap_strategy:
210+
211+
.. confval:: bootstrap_strategy
212+
213+
Since version 2.11.
214+
Specifies a strategy used to bootstrap a :ref:`replica set <replication-bootstrap>`.
215+
The following strategies are available:
216+
217+
* ``auto``: a node doesn't boot if a half or more of other nodes in a replica set are not connected.
218+
For example, if the :ref:`replication <cfg_replication-replication>` parameter contains 2 or 3 nodes,
219+
a node requires 2 connected instances.
220+
In the case of 4 or 5 nodes, at least 3 connected instances are required.
221+
Moreover, a bootstrap leader fails to boot unless every connected node has chosen it as a bootstrap leader.
222+
223+
* ``legacy``: a node requires the :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>` number of other nodes to be connected.
224+
225+
| Type: string
226+
| Default: auto
227+
| Environment variable: TT_BOOTSTRAP_STRATEGY
228+
| Dynamic: **yes**
229+
230+
207231
.. _cfg_replication-replication_connect_timeout:
208232

209233
.. confval:: replication_connect_timeout
@@ -228,25 +252,21 @@
228252
.. confval:: replication_connect_quorum
229253

230254
Since version 1.9.0.
231-
By default a replica will try to connect to all the masters,
232-
or it will not start. (The default is recommended so that all replicas
233-
will receive the same replica set UUID.)
255+
Specifies the number of nodes to be up and running to start a replica set.
256+
Since version 2.11, this option is in effect if :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`
257+
is set to ``legacy``.
234258

235-
However, by specifying ``replication_connect_quorum = N``, where
236-
N is a number greater than or equal to zero,
237-
users can state that the replica only needs to connect to N masters.
238-
239-
This parameter has effect during bootstrap and during
259+
This parameter has effect during :ref:`bootstrap <replication-leader>` or
240260
:ref:`configuration update <replication-configuration_update>`.
241-
Setting ``replication_connect_quorum = 0`` makes Tarantool
261+
Setting ``replication_connect_quorum`` to ``0`` makes Tarantool
242262
require no immediate reconnect only in case of recovery.
243-
See :ref:`orphan status <replication-orphan_status>` for details.
263+
See :ref:`Orphan status <replication-orphan_status>` for details.
244264

245265
Example:
246266

247267
.. code-block:: lua
248268
249-
box.cfg{replication_connect_quorum=2}
269+
box.cfg { replication_connect_quorum = 2 }
250270
251271
| Type: integer
252272
| Default: null
@@ -313,9 +333,8 @@
313333
.. confval:: replication_sync_timeout
314334

315335
Since version 1.10.2.
316-
The number of seconds that a replica will wait when trying to
317-
sync with a master in a cluster,
318-
or a :ref:`quorum <cfg_replication-replication_connect_quorum>` of masters,
336+
The number of seconds that a node waits when trying to sync with
337+
other nodes in a replica set (see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`),
319338
after connecting or during :ref:`configuration update <replication-configuration_update>`.
320339
This could fail indefinitely if ``replication_sync_lag`` is smaller
321340
than network latency, or if the replica cannot keep pace with master
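The ``bootstrap_strategy`` option documented in this diff combines with ``box.cfg`` as follows. This is a minimal sketch, assuming a three-node replica set; the instance URIs are hypothetical:

```lua
-- Bootstrap with the 'auto' strategy: with 3 nodes listed in
-- 'replication', bootstrap proceeds once 2 instances are connected.
box.cfg{
    listen = 3301,
    replication = {
        'replica1:3301',
        'replica2:3302',
        'replica3:3303',
    },
    bootstrap_strategy = 'auto',
}

-- Pre-2.11 behaviour can be requested explicitly:
-- box.cfg{bootstrap_strategy = 'legacy', replication_connect_quorum = 2}
```

Here ``bootstrap_strategy = 'auto'`` is written out for clarity; it is the default since version 2.11.
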

doc/reference/reference_lua/box_cfg.rst

Lines changed: 0 additions & 1 deletion
@@ -73,7 +73,6 @@ default settings to all the parameters:
     replicaset_uuid = nil -- generated automatically
     replication = nil
     replication_anon = false
-    replication_connect_quorum = nil
     replication_connect_timeout = 30
     replication_skip_conflict = false
     replication_sync_lag = 10
