
Conversation


@astubbs astubbs commented Mar 17, 2022

astubbs added a commit to astubbs/parallel-consumer that referenced this pull request Mar 22, 2022
…nfluentinc#237)

The simplest scenario of this issue is:

- using PARTITION ordering
- max concurrency set to 2

- consumer is able to poll 10 messages, the first 5 for partition 0, the next 5 for partition 1
- when PC requests work from work manager, it first requests 2 work units
- work manager then ingests 2 units from its queue, but both land in the shard for partition 0
- then shard iteration begins; because the partition-order restriction is set, only a single entry is taken from the ONLY shard, the one for partition 0
- so the result is that only a single record is returned, even though there are 5 in the queue for the 2nd partition that aren't "realised"
- on the next request to the WM, 1 more unit is requested to hit the concurrency target, and this too is ingested from the 3 remaining for partition 0, while the records for partition 1 remain undiscovered, and so on; extrapolate out...
- solution: ingest the entire work queue into shards every round, which is effectively also what confluentinc#219 does

Also fixed by upcoming RR:

refactor: Queue unification confluentinc#219
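The round-by-round starvation described above can be sketched as a toy model. This is a hypothetical simplification (the class and method names are invented, not the real parallel-consumer API); records are modelled simply as their partition number:

```java
import java.util.*;

// Toy model of the per-request ingestion bug: ingest only as many units as
// requested, then (under PARTITION ordering) take at most one record per shard.
public class ShardStarvationSketch {
    static List<Integer> pollOneRound(Deque<Integer> queue,
                                      Map<Integer, Deque<Integer>> shards,
                                      int requested) {
        // Ingest only `requested` units from the head of the queue into shards.
        for (int i = 0; i < requested && !queue.isEmpty(); i++) {
            int rec = queue.poll();
            shards.computeIfAbsent(rec, p -> new ArrayDeque<>()).add(rec);
        }
        // PARTITION ordering restriction: at most one in-flight record per shard.
        List<Integer> selected = new ArrayList<>();
        for (Deque<Integer> shard : shards.values()) {
            if (!shard.isEmpty()) selected.add(shard.poll());
        }
        return selected;
    }

    public static void main(String[] args) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) queue.add(0); // first 5 records: partition 0
        for (int i = 0; i < 5; i++) queue.add(1); // next 5: partition 1
        Map<Integer, Deque<Integer>> shards = new HashMap<>();

        // Round 1: 2 units requested, but both land in the partition-0 shard,
        // so only ONE record comes back despite a concurrency target of 2.
        System.out.println(pollOneRound(queue, shards, 2)); // prints [0]
    }
}
```

Each subsequent round repeats the pattern: the top-up request is satisfied from partition 0's backlog, so partition 1 stays undiscovered until partition 0 drains, which is the starvation. Ingesting the whole queue into shards every round makes all shards visible to the iteration immediately.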
astubbs added 7 commits April 4, 2022 18:35
PriorityQueue only provides a sorted `poll`, whereas TreeSet iterates in sorted order.
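The distinction in the commit message above can be shown with a minimal demo (illustrative only, not taken from the PR's code): `PriorityQueue`'s iterator walks the backing heap array, whose layout is not sorted; only repeated `poll()` calls yield elements in priority order, whereas `TreeSet` iterates in ascending order directly.

```java
import java.util.*;

// Why swap PriorityQueue for TreeSet when a shard must be ITERATED in order:
// PriorityQueue only guarantees order via poll(), TreeSet via its iterator.
public class SortedIterationDemo {
    public static void main(String[] args) {
        List<Integer> input = List.of(5, 1, 4, 2, 3);

        // Draining a PriorityQueue via poll() yields sorted order...
        Queue<Integer> pq = new PriorityQueue<>(input);
        List<Integer> polled = new ArrayList<>();
        while (!pq.isEmpty()) polled.add(pq.poll());
        System.out.println(polled); // prints [1, 2, 3, 4, 5]
        // ...but its iterator (e.g. new ArrayList<>(pq)) carries no such
        // guarantee, as it traverses the internal heap array directly.

        // TreeSet's iterator, by contrast, is always in ascending order.
        System.out.println(new TreeSet<>(input)); // prints [1, 2, 3, 4, 5]
    }
}
```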
@astubbs astubbs force-pushed the bugs/shard-starvation branch from 4414fd7 to 21b7a1b on April 7, 2022 08:53

astubbs commented Apr 22, 2022

Redundantly fixed by:

@astubbs astubbs closed this Apr 22, 2022
@astubbs astubbs deleted the bugs/shard-starvation branch April 22, 2022 12:02
