Optimise HTTP API /queues endpoint #9874

kjnilsson · 2023-11-03T13:36:02Z

Listing queues with the HTTP API when there are many (1000s) of
quorum queues could be excessively slow compared to the same scenario
with classic queues.

This optimises various aspects of HTTP API queue listings.
For QQs it removes the expensive cluster wide rpcs used to get the
"online" status of each quorum queue. This was previously done before
paging and thus would perform a cluster-wide query for each quorum queue in
the vhost/system. This accounted for most of the slowness compared to
classic queues.

Secondly the query to separate the running from the down queues
consisted of two separate queries that later were combined when a single
query would have sufficed.

This commit also includes a variety of other improvements and minor
fixes discovered during testing and optimisation.

MINOR BREAKING CHANGE: quorum queues would previously only display one
of two states: running or down. Now there is a new state called minority
which is emitted when the queue has at least one member running but
cannot commit entries due to lack of quorum.

Also the quorum queue may transiently enter the down state when a node
goes down and before its elected a new leader.

kjnilsson · 2023-11-06T14:50:05Z

5k quorum queues on a 3 node dev box.

On this branch:

karl@crake:~$ time curl 'http://crake:15672/api/queues?page=25&page_size=100&name=&use_regex=false&pagination=true' -H 'authorization: Basic Z3Vlc3Q6Z3Vlc3Q=' &>/dev/null

real	0m0.080s
user	0m0.003s
sys	0m0.007s

On main:

karl@crake:~$ time curl 'http://crake:15672/api/queues?page=25&page_size=100&name=&use_regex=false&pagination=true' -H 'authorization: Basic Z3Vlc3Q6Z3Vlc3Q=' &>/dev/null

real	0m1.231s
user	0m0.000s
sys	0m0.006s

I expect on a system with a real network and some load to benefit even more.

Listing queues with the HTTP API when there are many (1000s) of quorum queues could be excessively slow compared to the same scenario with classic queues. This optimises various aspects of HTTP API queue listings. For QQs it removes the expensive cluster wide rpcs used to get the "online" status of each quorum queue. This was previously done _before_ paging and thus would perform a cluster-wide query for _each_ quorum queue in the vhost/system. This accounted for most of the slowness compared to classic queues. Secondly the query to separate the running from the down queues consisted of two separate queries that later were combined when a single query would have sufficed. This commit also includes a variety of other improvements and minor fixes discovered during testing and optimisation. MINOR BREAKING CHANGE: quorum queues would previously only display one of two states: running or down. Now there is a new state called minority which is emitted when the queue has at least one member running but cannot commit entries due to lack of quorum. Also the quorum queue may transiently enter the down state when a node goes down and before its elected a new leader.

To allow callers to specify a subset of fields they'd like.

There is no need to list all queues to check if the vhost exists.

tubemeister · 2023-11-07T13:51:42Z

Thanks, that should make a big difference. A 10s timeout is workable, a 1m timeout isn't.

Any chance of this getting a backport to 3.12?

(PS: I've been locked out of my own bugreport. I've been getting more responses there from both of you but can't respond.)

michaelklishin · 2023-11-07T20:55:01Z

There is no consensus on whether we want to backport this. #9892 is ready for review but we make no promises.

michaelklishin · 2023-11-08T14:23:54Z

It was decided that this should be a 3.13-specific change.

tubemeister · 2023-11-09T09:29:03Z

Yeah I suspected as much from the "breaking change" aspect, that makes sense.

Thanks anyway, something to look forward to in 3.13. ;-)

tubemeister · 2024-02-28T10:27:56Z

Can confirm, the long timeout when a node is offline is now fixed in 3.13.0

tubemeister · 2024-02-28T10:29:59Z

Might be useful to note this as fixed this in #9522 as well. I'd do it myself but I've been locked out of my own bugreport. ;-)

mkuratczyk · 2024-02-28T10:55:35Z

done

kjnilsson changed the title ~~Optimise mgmt HTTP API /queues endpoint~~ Optimise HTTP API /queues endpoint Nov 3, 2023

michaelklishin added backport-v3.11.x and removed backport-v3.11.x labels Nov 3, 2023

kjnilsson force-pushed the opt-mgmt-queue-listings branch from ea0adf8 to f2ab9b4 Compare November 6, 2023 14:44

kjnilsson marked this pull request as ready for review November 6, 2023 14:44

kjnilsson force-pushed the opt-mgmt-queue-listings branch from f2ab9b4 to 7935e4d Compare November 6, 2023 14:44

kjnilsson added this to the 3.13.0 milestone Nov 6, 2023

kjnilsson force-pushed the opt-mgmt-queue-listings branch from 7935e4d to 64494e6 Compare November 6, 2023 15:33

kjnilsson force-pushed the opt-mgmt-queue-listings branch from 64494e6 to c2cd60b Compare November 6, 2023 15:34

kjnilsson added 4 commits November 7, 2023 08:48

tweak

d77f937

Expose rabbit_quorum_queue:infos/2

9cf6454

To allow callers to specify a subset of fields they'd like.

HTTP API /queues optimise resource_exists

ff12d3b

There is no need to list all queues to check if the vhost exists.

use right dummy type

baff660

michaelklishin added backport-v3.11.x labels Nov 7, 2023

michaelklishin approved these changes Nov 7, 2023

View reviewed changes

michaelklishin merged commit e527720 into main Nov 7, 2023

michaelklishin deleted the opt-mgmt-queue-listings branch November 7, 2023 19:30

mergify bot mentioned this pull request Nov 7, 2023

Optimise HTTP API /queues endpoint (backport #9874) #9892

Closed

michaelklishin removed the backport-v3.11.x label Nov 7, 2023

kjnilsson mentioned this pull request Nov 8, 2023

Fix streams minority calculation #9896

Merged

michaelklishin removed the backport-v3.12.x label Nov 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optimise HTTP API /queues endpoint #9874

Optimise HTTP API /queues endpoint #9874

Uh oh!

kjnilsson commented Nov 3, 2023 •

edited

Loading

Uh oh!

kjnilsson commented Nov 6, 2023 •

edited

Loading

Uh oh!

tubemeister commented Nov 7, 2023

Uh oh!

michaelklishin commented Nov 7, 2023

Uh oh!

michaelklishin commented Nov 8, 2023

Uh oh!

tubemeister commented Nov 9, 2023

Uh oh!

tubemeister commented Feb 28, 2024

Uh oh!

tubemeister commented Feb 28, 2024

Uh oh!

mkuratczyk commented Feb 28, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Optimise HTTP API /queues endpoint #9874

Optimise HTTP API /queues endpoint #9874

Uh oh!

Conversation

kjnilsson commented Nov 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kjnilsson commented Nov 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tubemeister commented Nov 7, 2023

Uh oh!

michaelklishin commented Nov 7, 2023

Uh oh!

michaelklishin commented Nov 8, 2023

Uh oh!

tubemeister commented Nov 9, 2023

Uh oh!

tubemeister commented Feb 28, 2024

Uh oh!

tubemeister commented Feb 28, 2024

Uh oh!

mkuratczyk commented Feb 28, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

kjnilsson commented Nov 3, 2023 •

edited

Loading

kjnilsson commented Nov 6, 2023 •

edited

Loading