
CLI tools: check_if_node_is_quorum_critical: reduce response wait time from peers that are stopped, unreachable or down #9755

@michaelklishin


Previously discussed in #9522.

Currently rabbitmq-diagnostics check_if_node_is_quorum_critical does the following to find out if any of the queues or streams would lose their quorum if the current node is stopped:

  • List QQs with local replicas with minimum quorum
  • List streams with local replicas with minimum quorum
  • Check whether the combined list is empty

To find out if a QQ or stream has "minimum quorum", it contacts all running nodes, where "running" is defined by rabbit_nodes:list_running/0, which itself contacts other nodes with a 10s timeout.
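
The check described above can be sketched roughly as follows. This is an illustration in Python, not the RabbitMQ implementation: the ReplicatedQueue type and all function names are hypothetical. It also assumes that "minimum quorum" means the number of online replicas is at or below the majority size (member count div 2, plus 1), so that stopping one more replica would lose the quorum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicatedQueue:
    """Hypothetical stand-in for a quorum queue or stream record."""
    name: str
    members: frozenset  # names of all nodes that host a replica


def quorum_size(member_count: int) -> int:
    # Majority of the configured members (assumed definition of quorum).
    return member_count // 2 + 1


def has_minimum_quorum(queue: ReplicatedQueue, running_nodes: frozenset) -> bool:
    # Assumption: a queue is at "minimum quorum" when its online replicas
    # number no more than the majority size.
    online = queue.members & running_nodes
    return len(online) <= quorum_size(len(queue.members))


def quorum_critical_queues(local_node, queues, running_nodes):
    # Only queues with a replica on the local node are affected by
    # stopping that node.
    return [
        q.name
        for q in queues
        if local_node in q.members and has_minimum_quorum(q, running_nodes)
    ]


def is_quorum_critical(local_node, queues, running_nodes) -> bool:
    # The node is quorum-critical if the resulting list is not empty.
    return bool(quorum_critical_queues(local_node, queues, running_nodes))
```

In this sketch, running_nodes is passed in as a plain set. How that set is obtained (by contacting peers or from local state) is what determines how long the real command waits on stopped or unreachable nodes.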

By using a local snapshot of cluster members (that is, without checking with other nodes to see if they are online/reachable), the delay that down nodes add to the CLI command's response time should be significantly reduced.
