
Commit 11e146a

Document bootstrap_strategy (#3437)
Resolves #3296
1 parent: 71e6fb0

6 files changed (+61, -54 lines)


doc/concepts/replication/repl_architecture.rst

Lines changed: 4 additions & 5 deletions
@@ -242,9 +242,8 @@ The maximal number of replicas in a mesh is 32.
 Orphan status
 -------------
 
-During ``box.cfg()``, an instance will try to join all masters listed
+During ``box.cfg()``, an instance tries to join all nodes listed
 in :ref:`box.cfg.replication <cfg_replication-replication>`.
-If the instance does not succeed with at least
-the number of masters specified in
-:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
-then it will switch to :ref:`orphan status <internals-replication-orphan_status>`.
+If the instance does not succeed in connecting to the required number of nodes
+(see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`),
+it switches to the :ref:`orphan status <internals-replication-orphan_status>`.

doc/dev_guide/internals/replication/orphan.rst

Lines changed: 18 additions & 26 deletions
@@ -6,12 +6,11 @@ Orphan status
 
 Starting with Tarantool version 1.9, there is a change to the
 procedure when an instance joins a replica set.
-During ``box.cfg()`` the instance will try to join all masters listed
+During ``box.cfg()`` the instance tries to join all nodes listed
 in :ref:`box.cfg.replication <cfg_replication-replication>`.
-If the instance does not succeed with at least
-the number of masters specified in
-:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
-then it will switch to **orphan status**.
+If the instance does not succeed in connecting to the required number of nodes
+(see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`),
+it switches to the **orphan** status.
 While an instance is in orphan status, it is read-only.
 
 To "join" a master, a replica instance must "connect" to the
@@ -34,12 +33,10 @@ is less than or equal to the number of seconds specified in
 If ``replication_sync_lag`` is unset (nil) or set to TIMEOUT_INFINITY, then
 the replica skips the "sync" state and switches to "follow" immediately.
 
-In order to leave orphan mode you need to sync with a sufficient number
-(:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`) of
-instances. To do so, you may either:
+In order to leave orphan mode, you need to sync with a sufficient number of
+instances (see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`).
+To do so, you may either:
 
-* Set :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
-  to a lower value.
 * Reset ``box.cfg.replication`` to exclude instances that cannot be reached
   or synced with.
 * Set ``box.cfg.replication`` to ``""`` (empty string).
@@ -53,16 +50,15 @@ The following situations are possible.
 Here ``box.cfg{}`` is being called for the first time.
 A replica is joining but no replica set exists yet.
 
-1. Set status to 'orphan'.
-2. Try to connect to all nodes from ``box.cfg.replication``,
-   or to the number of nodes required by
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`.
-   Retrying up to 3 times in 30 seconds is possible because this is bootstrap,
+1. Set the status to 'orphan'.
+
+2. Try to connect to all nodes from ``box.cfg.replication``.
+   The replica tries to connect for the
   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`
-   is overridden.
+   number of seconds and retries each
+   :ref:`replication_timeout <cfg_replication-replication_timeout>` seconds if needed.
 
-3. Abort and throw an error if not connected to all nodes in ``box.cfg.replication`` or
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`.
+3. Abort and throw an error if a replica is not connected to the majority of nodes in ``box.cfg.replication``.
 
 4. This instance might be elected as the replica set 'leader'.
    Criteria for electing a leader include vclock value (largest is best),
@@ -93,13 +89,11 @@ It is being called again in order to perform recovery.
 1. Perform :ref:`recovery <internals-recovery_process>` from the last local
    snapshot and the WAL files.
 
-2. Connect to at least
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
-   nodes. If failed -- set status to 'orphan'.
-   (Attempts to sync will continue in the background and when/if they succeed
-   then 'orphan' will be changed to 'connected'.)
+2. Try to establish connections to all other nodes for the
+   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>` number of seconds.
+   Once ``replication_connect_timeout`` expires or all the connections are established, proceed to the "sync" state with all the established connections.
 
-3. If connected - sync with all connected nodes, until the difference is not more than
+3. If connected, sync with all connected nodes, until the difference is not more than
   :ref:`replication_sync_lag <cfg_replication-replication_sync_lag>` seconds.
 
 .. _replication-configuration_update:
@@ -111,8 +105,6 @@ It is being called again because some replication parameter
 or something in the replica set has changed.
 
 1. Try to connect to all nodes from ``box.cfg.replication``,
-   or to the number of nodes required by
-   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
   within the time period specified in
   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`.
 

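The ways to leave orphan mode listed in the hunk above can be sketched in Lua. This is a minimal sketch, assuming an instance stuck in the orphan state because one of its configured peers is unreachable; the instance URIs are hypothetical:

```lua
-- The instance is orphaned because 'replica2:3302' cannot be reached
-- (URIs are hypothetical).

-- Option 1: reset box.cfg.replication to exclude the unreachable peer.
box.cfg{replication = {'replica1:3301'}}

-- Option 2: clear the replication source list entirely.
box.cfg{replication = ""}

-- Verify: box.info.status should no longer report 'orphan'.
box.info.status
```

Both calls are dynamic, so they can be issued on a running instance; which peers to drop depends on which connections actually failed.
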
doc/how-to/replication/repl_leader_elect.rst

Lines changed: 6 additions & 7 deletions
@@ -44,13 +44,12 @@ Configuration
 * ``election_fencing_enabled`` -- switches the :ref:`leader fencing mode <repl_leader_elect_fencing>` on and off.
   For the details, refer to the :ref:`option description <cfg_replication-election_fencing_enabled>` in the configuration reference.
 
-Besides, it is important to know that
-being a leader is not the only requirement for a node to be writable.
-A leader node should have its :ref:`read_only <cfg_basic-read_only>` option set
-to ``false`` (``box.cfg{read_only = false}``),
-and its :ref:`connectivity quorum <cfg_replication-replication_connect_quorum>`
-should be satisfied (``box.cfg{replication_connect_quorum = <count>}``)
-or disabled (``box.cfg{replication_connect_quorum = 0}``).
+It is important to know that being a leader is not the only requirement for a node to be writable.
+The leader should also satisfy the following requirements:
+
+* The :ref:`read_only <cfg_basic-read_only>` option is set to ``false``.
+
+* The leader shouldn't be in the orphan state.
 
 Nothing prevents from setting the ``read_only`` option to ``true``,
 but the leader just won't be writable then. The option doesn't affect the
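The two writability requirements from the hunk above can be checked from the instance console. A minimal sketch, with illustrative return values:

```lua
-- A leader is writable only when read_only is false and the node
-- is not in the orphan state.
box.cfg{read_only = false}

box.info.ro       -- false on a writable leader
box.info.status   -- must not be 'orphan'
```
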

doc/how-to/vshard_quick.rst

Lines changed: 0 additions & 1 deletion
@@ -131,7 +131,6 @@ The configuration of a simple sharded cluster can look like this:
 
     local cfg = {
         memtx_memory = 100 * 1024 * 1024,
-        replication_connect_quorum = 0,
         bucket_count = 10000,
         rebalancer_disbalance_threshold = 10,
         rebalancer_max_receiving = 100,

doc/reference/configuration/cfg_replication.rst

Lines changed: 33 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
* :ref:`replication <cfg_replication-replication>`
22
* :ref:`replication_anon <cfg_replication-replication_anon>`
3+
* :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`
34
* :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`
45
* :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
56
* :ref:`replication_skip_conflict <cfg_replication-replication_skip_conflict>`
@@ -204,6 +205,29 @@
204205
| Environment variable: TT_REPLICATION_ANON
205206
| Dynamic: **yes**
206207
208+
209+
.. _cfg_replication-bootstrap_strategy:
210+
211+
.. confval:: bootstrap_strategy
212+
213+
Since version 2.11.
214+
Specifies a strategy used to bootstrap a :ref:`replica set <replication-bootstrap>`.
215+
The following strategies are available:
216+
217+
* ``auto``: a node doesn't boot if a half or more of other nodes in a replica set are not connected.
218+
For example, if the :ref:`replication <cfg_replication-replication>` parameter contains 2 or 3 nodes,
219+
a node requires 2 connected instances.
220+
In the case of 4 or 5 nodes, at least 3 connected instances are required.
221+
Moreover, a bootstrap leader fails to boot unless every connected node has chosen it as a bootstrap leader.
222+
223+
* ``legacy``: a node requires the :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>` number of other nodes to be connected.
224+
225+
| Type: string
226+
| Default: auto
227+
| Environment variable: TT_BOOTSTRAP_STRATEGY
228+
| Dynamic: **yes**
229+
230+
207231
.. _cfg_replication-replication_connect_timeout:
208232

209233
.. confval:: replication_connect_timeout
@@ -228,25 +252,21 @@
228252
.. confval:: replication_connect_quorum
229253

230254
Since version 1.9.0.
231-
By default a replica will try to connect to all the masters,
232-
or it will not start. (The default is recommended so that all replicas
233-
will receive the same replica set UUID.)
255+
Specifies the number of nodes to be up and running to start a replica set.
256+
Since version 2.11, this option is in effect if :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`
257+
is set to ``legacy``.
234258

235-
However, by specifying ``replication_connect_quorum = N``, where
236-
N is a number greater than or equal to zero,
237-
users can state that the replica only needs to connect to N masters.
238-
239-
This parameter has effect during bootstrap and during
259+
This parameter has effect during :ref:`bootstrap <replication-leader>` or
240260
:ref:`configuration update <replication-configuration_update>`.
241-
Setting ``replication_connect_quorum = 0`` makes Tarantool
261+
Setting ``replication_connect_quorum`` to ``0`` makes Tarantool
242262
require no immediate reconnect only in case of recovery.
243-
See :ref:`orphan status <replication-orphan_status>` for details.
263+
See :ref:`Orphan status <replication-orphan_status>` for details.
244264

245265
Example:
246266

247267
.. code-block:: lua
248268
249-
box.cfg{replication_connect_quorum=2}
269+
box.cfg { replication_connect_quorum = 2 }
250270
251271
| Type: integer
252272
| Default: null
@@ -313,9 +333,8 @@
313333
.. confval:: replication_sync_timeout
314334

315335
Since version 1.10.2.
316-
The number of seconds that a replica will wait when trying to
317-
sync with a master in a cluster,
318-
or a :ref:`quorum <cfg_replication-replication_connect_quorum>` of masters,
336+
The number of seconds that a node waits when trying to sync with
337+
other nodes in a replica set (see :ref:`bootstrap_strategy <cfg_replication-bootstrap_strategy>`),
319338
after connecting or during :ref:`configuration update <replication-configuration_update>`.
320339
This could fail indefinitely if ``replication_sync_lag`` is smaller
321340
than network latency, or if the replica cannot keep pace with master
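The ``bootstrap_strategy`` option documented in this diff combines with ``box.cfg`` as follows. This is a minimal sketch, assuming a three-node replica set; the instance URIs are hypothetical:

```lua
-- Bootstrap with the 'auto' strategy: with 3 nodes listed in
-- 'replication', bootstrap proceeds once 2 instances are connected.
box.cfg{
    listen = 3301,
    replication = {
        'replica1:3301',
        'replica2:3302',
        'replica3:3303',
    },
    bootstrap_strategy = 'auto',
}

-- Pre-2.11 behaviour can be requested explicitly:
-- box.cfg{bootstrap_strategy = 'legacy', replication_connect_quorum = 2}
```

Here ``bootstrap_strategy = 'auto'`` is written out for clarity; it is the default since version 2.11.
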

doc/reference/reference_lua/box_cfg.rst

Lines changed: 0 additions & 1 deletion
@@ -73,7 +73,6 @@ default settings to all the parameters:
     replicaset_uuid = nil -- generated automatically
     replication = nil
     replication_anon = false
-    replication_connect_quorum = nil
     replication_connect_timeout = 30
     replication_skip_conflict = false
     replication_sync_lag = 10
