Skip to content

Commit 4927aeb

Browse files
committed
Group queue deletions on_node_down into 10 operations per transaction
When many queues are being deleted, we believe that it's faster to have fewer Mnesia transactions and therefore group 10 queue deletions into a single Mnesia transaction. This number (10) is arbitrary, we didn't try with a different number. Creating 1 Mnesia transaction for every queue deletion feels too many transaction, and having a single Mnesia transaction for all queue deletions is too few transactions. This felt like a sensible option. We cannot determine if this is a good change because rabbit_core_metrics:queue_deleted/1 takes the most time and obscures all observations. According to qcachegrind, rabbit_misc:execute_mnesia_transaction/1 takes 1.8s while rabbit_core_metrics:queue_deleted/1 takes 132s out of which ets:select/2 takes 131s. How can we optimise rabbit_core_metrics:queue_deleted/1 ? We are thinking that rather than calling ets:select/2 twice for every queue, we should call it twice for all queues that need to be deleted. We don't know whether this is possible. Alternatively, we might look into ets:first/1 & ets:next/2 to iterate over the entire table ONCE with all the queues that have been deleted. Thoughts @dcorbacho @michaelklishin ? For initial context, see #1513 Partner-in-crime: @essen
1 parent a94faec commit 4927aeb

File tree

1 file changed

+16
-18
lines changed

1 file changed

+16
-18
lines changed

src/rabbit_amqqueue.erl

Lines changed: 16 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1114,28 +1114,20 @@ maybe_clear_recoverable_node(Node,
11141114
end.
11151115

11161116
on_node_down(Node) ->
1117-
% Create 1 transaction per N queues that need to be deleted
1118-
% * 1 transaction for all queues might block everything for a really long time
1119-
% * ^^^ this is what used to happen before this change
1120-
% * 1 transaction per queue will result in too many transaction
1121-
% * ^^^ this is what happens now; it's not perfect, but it's a step in the right direction
1122-
% * ^^^ OPTIMISE THIS BEFORE MERGING ^^^
1123-
% * Maybe 1 transaction for every 10 queues that need to be deleted ?
1124-
%
11251117
% For each transaction:
11261118
% * delete all queues in the transaction
11271119
% * capture the result for every delete queue
11281120
[
11291121
rabbit_misc:execute_mnesia_tx_with_tail(
1130-
fun () -> Dels = [delete_queue(Q)],
1131-
T = rabbit_binding:process_deletions(
1122+
fun () -> QueueDeletions = [delete_queue(Queue) || Queue <- Queues],
1123+
NotifyBindingDeletions = rabbit_binding:process_deletions(
11321124
lists:foldl(fun rabbit_binding:combine_deletions/2,
1133-
rabbit_binding:new_deletions(), Dels),
1125+
rabbit_binding:new_deletions(), QueueDeletions),
11341126
?INTERNAL_USER),
11351127
fun () ->
1136-
T(),
1128+
NotifyBindingDeletions(),
11371129
lists:foreach(
1138-
fun(QName) ->
1130+
fun(Queue) ->
11391131
% When 40k queues are being deleted,
11401132
% this results in a rabbit_node_monitor function that recurses for 30 minutes,
11411133
% meaning that no information is available for the node (Management Overview doesn't update):
@@ -1154,14 +1146,20 @@ on_node_down(Node) ->
11541146
% [{file,"src/rabbit_node_monitor.erl"},{line,755}]},
11551147
% {rabbit_node_monitor,handle_info,2,
11561148
% [{file,"src/rabbit_node_monitor.erl"},{line,548}]}]}
1157-
rabbit_core_metrics:queue_deleted(QName),
1149+
rabbit_core_metrics:queue_deleted(Queue),
11581150
ok = rabbit_event:notify(queue_deleted,
1159-
[{name, QName},
1151+
[{name, Queue},
11601152
{user, ?INTERNAL_USER}])
1161-
end, [Q])
1153+
end, Queues)
11621154
end
1163-
end) || Q <- queues_to_delete_from_node_down(Node)
1164-
].
1155+
end) || Queues <- partition_queues(queues_to_delete_from_node_down(Node))
1156+
],
1157+
ok.
1158+
1159+
partition_queues([Q0,Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9 | T]) ->
1160+
[[Q0,Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9] | partition_queues(T)];
1161+
partition_queues(T) ->
1162+
[T].
11651163

11661164
delete_queue(QueueName) ->
11671165
ok = mnesia:delete({rabbit_queue, QueueName}),

0 commit comments

Comments
 (0)