From b9c2f1a41ed2c9ba437bad794509f81e4804dfd6 Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Thu, 22 Aug 2013 17:22:28 -0400
Subject: [PATCH] DOCS-426 socket exception when primary steps down

---
 source/core/replica-set-elections.txt         |  6 ++---
 source/faq/diagnostics.txt                    |  2 +-
 source/tutorial/troubleshoot-replica-sets.txt | 24 +++++++++++++++++++
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/source/core/replica-set-elections.txt b/source/core/replica-set-elections.txt
index 3307e963510..064a2920079 100644
--- a/source/core/replica-set-elections.txt
+++ b/source/core/replica-set-elections.txt
@@ -78,9 +78,9 @@ If you have a three-member replica set, where every member has one
 vote, the set can elect a primary as long as two members can connect to
 each other. If two members are unavailable, the remaining member
 remains a :term:`secondary` because it cannot connect to a
-majority of the set's members.
-
-While there is no primary, clients cannot write to the replica set.
+majority of the set's members. If the remaining member is a
+:term:`primary` and two members become unavailable, the primary steps
+down and becomes a secondary.
 
 Network Partitions
 ~~~~~~~~~~~~~~~~~~
diff --git a/source/faq/diagnostics.txt b/source/faq/diagnostics.txt
index c6458864d17..02420a380fa 100644
--- a/source/faq/diagnostics.txt
+++ b/source/faq/diagnostics.txt
@@ -31,7 +31,7 @@ following commands:
    sudo grep mongod /var/log/messages
    sudo grep score /var/log/messages
 
-.. _faq-troubleshooting:
+.. _faq-keepalive:
 
 Does TCP ``keepalive`` time affect sharded clusters and replica sets?
 ---------------------------------------------------------------------
diff --git a/source/tutorial/troubleshoot-replica-sets.txt b/source/tutorial/troubleshoot-replica-sets.txt
index 3456f46479a..d9f13d577fa 100644
--- a/source/tutorial/troubleshoot-replica-sets.txt
+++ b/source/tutorial/troubleshoot-replica-sets.txt
@@ -177,6 +177,30 @@ Consider the following example of a bidirectional test of networking:
 and firewall configuration and reconfigure your environment to allow
 these connections.
 
+Socket Exceptions when Rebooting Secondaries Simultaneously
+-----------------------------------------------------------
+
+When you reboot members of a replica set, make sure the set keeps a
+majority of its votes available. When a set's active members can no
+longer form a majority, the set's :term:`primary` steps down and becomes
+a :term:`secondary`. The former primary closes all open connections to
+client applications. Clients attempting to write to the former primary
+receive socket exceptions and "Connection reset" errors.
+
+For example, if you have a three-member replica set where every member
+has one vote, the set can elect a primary only as long as two members
+can connect to each other. If two members become unavailable and the
+remaining member is a primary, the primary steps down and becomes a
+secondary.
+
+When a majority of votes is available, the set elects a primary
+and the errors cease.
+
+To avoid errors, reboot replica set members one at a time.
+
+For more information on votes, see :doc:`/core/replica-set-elections`. For
+related information on connection errors, see :ref:`faq-keepalive`.
+
 .. _replica-set-troubleshooting-check-oplog-size:
 
 Check the Size of the Oplog