-
Notifications
You must be signed in to change notification settings - Fork 152
Proposal for some consumer processes improvements #232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for some consumer processes improvements #232
Conversation
design-documents/asynchronous-operations/consumer-processes-improvements.md
Outdated
Show resolved
Hide resolved
design-documents/asynchronous-operations/consumer-processes-improvements.md
Outdated
Show resolved
Hide resolved
design-documents/asynchronous-operations/consumer-processes-improvements.md
Show resolved
Hide resolved
design-documents/asynchronous-operations/consumer-processes-improvements.md
Outdated
Show resolved
Hide resolved
|
||
These options should be configurable per consumer type. | ||
Some sensible defaults could be set in the [`queue_consumer.xml`](https://devdocs.magento.com/guides/v2.3/extension-dev-guide/message-queues/config-mq.html#queueconsumerxml) file for some of these options. | ||
Next to that, developers or shopowners should be able to override these values per consumer type. At least being able to override it in the `app/etc/env.php` file would be nice, but a backend interface for making these things configurable would also be very nice (but can maybe be done in a later phase?). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with this point. Will be nice to have UI for available consumers, their shedule and even about amount of processed messages
7146faa
to
653ecd4
Compare
These options should be configurable per consumer type. | ||
Some sensible defaults could be set in the [`queue_consumer.xml`](https://devdocs.magento.com/guides/v2.3/extension-dev-guide/message-queues/config-mq.html#queueconsumerxml) file for some of these options. | ||
Next to that, developers or shopowners should be able to override these values per consumer type. At least being able to override them in the `app/etc/env.php` file would be nice, but a backend interface for making these things configurable would also be very nice (but can maybe be done in a later phase?). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only-spawn-when-message-available
max-idle-time
These are good options, and the combination should work
|
||
The suggestion here would be to give deployment scripts the option to signal the running consumer proccesses to handle their current messages and then kill themselves as soon as possible. | ||
We already have the poison pill functionality which could be used here, but the problem with that one is that it will only be checked when a new message appears in the queue. If no messages appear in the queue during a deploy, and we only update the poison pill version, the consumer will still keep running because it doesn't check for the poison pill version until a new message appears. | ||
So I would suggest some kind of dummy message to be created in each queue which only purpose is for the consumers to check the poison pill version and that message then should get removed again from the queue and nothing should be done with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If 'max-idle-time' is implemented as suggested above, then this new message to each queue might not be required; since after specified max idle time, the consumer process will stop by itself; and then rerun when the new message arrives by 'consumer_runner'. So it should solve the underlying issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true on the one hand, but on the other, there maybe cases where you explicitly want to stop consumer processes at a specific time, for example during a deploy to put a new version of the code online.
I know some people stop the crontab during deploys, so when you kill consumers using this new command, and crons are disabled, no new ones will spawn until the crons are enabled again after the deploy was finished.
It "feels" more clean to do a deploy in this way.
But personally I'm fine with leaving this out if it's not wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hostep that's right, with the existing poison pill functionality the consumer would stop for the indicated code changes. And then we can follow the existing process as you mentioned disabling the crons until the deployment is finished.
@nuzil just to confirm, this proposed solution 'only-spawn-when-message-available' for 'consumers_runner' is something that can be done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving this in here for now, it doesn't sound like there is a consensus reached yet.
Let me know if I misunderstood this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me confirm and get to some consensus with @nuzil ; the overall approach seems to work fine, just want to make sure, we are all on same page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't realize we still need to solve problem 2 for people not wanting to use the max-idle-time
flag. So I've added it again and removed the dummy message part.
Is this ok now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hostep I would agree with you on this, 'max-idle-time' would be more flexible approach to dispose the currently running queue process; If its sets to '0' then it could potentially act like no waiting; so it gives flexibility to the site administrator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
Also, be aware that the proposal is for these configuration options to exist per consumer type. The new consumers_wait_for_messages
configuration option on the other hand is now implemented to act globally for all consumers, which is also not flexible enough in my opinion.
So it looks like most of it is now clear and we are on the same page, what are the next steps to move this proposal forward?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hostep I can help this move forward, although wanted to check how we are planning to implement this feature as part of ConsumersRunner job, how much refactoring this may require? wanted to do quick feasibility check, since ConsumersRunner is not dealing with messages directly as of now. Do you have some strategy in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bringing this up during the weekly architectural meeting, you explained it very well! 🙂
As for the implementation details, I haven't looked into that yet since I'm not super familiar with the code (yet). If you do want some help looking into this, let me know (I'm also available on Slack most of the time if that makes it easier to communicate).
…ion about MC-19250. Some other small improvements.
design-documents/asynchronous-operations/consumer-processes-improvements.md
Outdated
Show resolved
Hide resolved
…ll version, removed the dummy message suggestion.
5f193b7
to
6aeb6b3
Compare
Note: This is a tangent about naming, not about the proposal which I'm still digesting ... I'm just glancing through this, and I'm surprised that we have a table named C'mon, we're writing code here that we want to be simple, clear, and as unambiguous as possible, not a dramatic novel. How about just Keep it clear and simple. Words in our code and interfaces matter, and Magento can definitely improve both. |
Thanks for the feeback @keithbentrup! Totally makes sense to keep wording simple. This was just a proposal, nothing is set in stone as of yet (I hope?), it was certainly not my intention to force the naming I came up with to actually being used, that's up to the persons actually writing out the specs or implementing it (I think?) |
… and new CLI command.
thanks @hostep for all your effort, |
|
||
The problem with this option, is that every time the cronjob `consumers_runner` runs, it will spawn a new consumer process, the consumer checks if messages are available and if none found it will kill itself. So this means it will spawn unneeded processes which take up memory, live for a very short period and then disappear again. | ||
|
||
In this proposal, the flag `only-spawn-when-message-available` would run logic to check *before* the consumer process gets spawned and not *after*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into implementing only-spawn-when-message-available and it seems it would require one of the following:
- Add and implement for both queue types a function that can get a count of the queued messages.
- Just dequeue a message in StartConsumerCommand core the consumer if we get no message then don't spawn a process. If we do get a message, requeue it using the queue reject function then spawn a process to handle it.
option 2 would be easier to implement but not ideal behavior.
The mysql queue implementation for Queue::getCount($status) would be easy to implement
It would take the statuses codes from the manager:
MESSAGE_STATUS_NEW = 2;
MESSAGE_STATUS_IN_PROGRESS = 3;
MESSAGE_STATUS_COMPLETE= 4;
MESSAGE_STATUS_RETRY_REQUIRED = 5;
MESSAGE_STATUS_ERROR = 6;
MESSAGE_STATUS_TO_BE_DELETED = 7;
The Amqp would be a little harder to implement
It doesn't have good translations for those statuses, so maybe the function would just have to be
Queue::getPendingCount()
The PhpAmqpLib\Channel\AMQPChannel class does not seem to implement a way to get a count of pending messages in a channel however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Luwdo!
My bad, I forgot to mention this over here on Github (this was done in a conversation on Slack), but I created a proof of concept just for this:
hostep/magento2@d93d288...poc-consumer-improvements
The only problem with this is that it introduces a new method to an interface, which is considered backwards incompatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that getMessages and basic_get cause the message to get locked because you are claiming the message for your to process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this and that doesn't seem to be the case, the messages still remain available in the queue after those calls. But feel free to test as well 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must be something else that locks the message. I can give this a go in a development enviorment, I might replace my patch this this option instead. This is a way better option.
Just for the record, a proof of concept for the |
Problem
The message queue consumer processes can use some improvements.
As requested by @maghamed, here's an architectural proposal based on
Most of this document was written from memory from what I've seen a couple of weeks ago. Therefore there can potentially be some factual mistakes in here, if that's the case, please let me know!
Solution
See the document, it provides some potential solutions.
Requested Reviewers
@nuzil, @paliarush