Skip to content

Proposal for some consumer processes improvements #232

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 9, 2019

Conversation

hostep
Copy link
Contributor

@hostep hostep commented Aug 19, 2019

Problem

The message queue consumer processes can use some improvements.
As requested by @maghamed, here's an architectural proposal based on

Most of this document was written from memory from what I've seen a couple of weeks ago. Therefore there can potentially be some factual mistakes in here, if that's the case, please let me know!

Solution

See the document, it provides some potential solutions.

Requested Reviewers

@nuzil, @paliarush

@magento-cicd2
Copy link

magento-cicd2 commented Aug 19, 2019

CLA assistant check
All committers have signed the CLA.


These options should be configurable per consumer type.
Some sensible defaults could be set in the [`queue_consumer.xml`](https://devdocs.magento.com/guides/v2.3/extension-dev-guide/message-queues/config-mq.html#queueconsumerxml) file for some of these options.
Next to that, developers or shopowners should be able to override these values per consumer type. At least being able to override it in the `app/etc/env.php` file would be nice, but a backend interface for making these things configurable would also be very nice (but can maybe be done in a later phase?).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this point. Will be nice to have UI for available consumers, their shedule and even about amount of processed messages

@hostep hostep force-pushed the consumer-processes-improvements branch from 7146faa to 653ecd4 Compare August 25, 2019 14:28
These options should be configurable per consumer type.
Some sensible defaults could be set in the [`queue_consumer.xml`](https://devdocs.magento.com/guides/v2.3/extension-dev-guide/message-queues/config-mq.html#queueconsumerxml) file for some of these options.
Next to that, developers or shopowners should be able to override these values per consumer type. At least being able to override them in the `app/etc/env.php` file would be nice, but a backend interface for making these things configurable would also be very nice (but can maybe be done in a later phase?).

Copy link
Contributor

@tariqjawed83 tariqjawed83 Aug 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • only-spawn-when-message-available
  • max-idle-time

These are good options, and the combination should work


The suggestion here would be to give deployment scripts the option to signal the running consumer proccesses to handle their current messages and then kill themselves as soon as possible.
We already have the poison pill functionality which could be used here, but the problem with that one is that it will only be checked when a new message appears in the queue. If no messages appear in the queue during a deploy, and we only update the poison pill version, the consumer will still keep running because it doesn't check for the poison pill version until a new message appears.
So I would suggest some kind of dummy message to be created in each queue which only purpose is for the consumers to check the poison pill version and that message then should get removed again from the queue and nothing should be done with it.
Copy link
Contributor

@tariqjawed83 tariqjawed83 Aug 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 'max-idle-time' is implemented as suggested above, then this new message to each queue might not be required; since after specified max idle time, the consumer process will stop by itself; and then rerun when the new message arrives by 'consumer_runner'. So it should solve the underlying issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true on the one hand, but on the other, there maybe cases where you explicitly want to stop consumer processes at a specific time, for example during a deploy to put a new version of the code online.
I know some people stop the crontab during deploys, so when you kill consumers using this new command, and crons are disabled, no new ones will spawn until the crons are enabled again after the deploy was finished.
It "feels" more clean to do a deploy in this way.

But personally I'm fine with leaving this out if it's not wanted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hostep that's right, with the existing poison pill functionality the consumer would stop for the indicated code changes. And then we can follow the existing process as you mentioned disabling the crons until the deployment is finished.

@nuzil just to confirm, this proposed solution 'only-spawn-when-message-available' for 'consumers_runner' is something that can be done?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this in here for now, it doesn't sound like there is a consensus reached yet.
Let me know if I misunderstood this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me confirm and get to some consensus with @nuzil ; the overall approach seems to work fine, just want to make sure, we are all on same page.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't realize we still need to solve problem 2 for people not wanting to use the max-idle-time flag. So I've added it again and removed the dummy message part.

Is this ok now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hostep I would agree with you on this, 'max-idle-time' would be more flexible approach to dispose the currently running queue process; If its sets to '0' then it could potentially act like no waiting; so it gives flexibility to the site administrator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed.

Also, be aware that the proposal is for these configuration options to exist per consumer type. The new consumers_wait_for_messages configuration option on the other hand is now implemented to act globally for all consumers, which is also not flexible enough in my opinion.

So it looks like most of it is now clear and we are on the same page, what are the next steps to move this proposal forward?

Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hostep I can help this move forward, although wanted to check how we are planning to implement this feature as part of ConsumersRunner job, how much refactoring this may require? wanted to do quick feasibility check, since ConsumersRunner is not dealing with messages directly as of now. Do you have some strategy in mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tariqjawed83

Thanks for bringing this up during the weekly architectural meeting, you explained it very well! 🙂

As for the implementation details, I haven't looked into that yet since I'm not super familiar with the code (yet). If you do want some help looking into this, let me know (I'm also available on Slack most of the time if that makes it easier to communicate).

…ion about MC-19250. Some other small improvements.
…ll version, removed the dummy message suggestion.
@hostep hostep force-pushed the consumer-processes-improvements branch from 5f193b7 to 6aeb6b3 Compare September 1, 2019 15:46
@tariqjawed83 tariqjawed83 mentioned this pull request Sep 3, 2019
@keithbentrup
Copy link

Note: This is a tangent about naming, not about the proposal which I'm still digesting ...

I'm just glancing through this, and I'm surprised that we have a table named queue_poison_pill (who came up with that and what were they thinking?), and then someone was proposing reusing the poison pill name in the CLI? Thankfully, using "suicide" in the CLI was already quashed.

C'mon, we're writing code here that we want to be simple, clear, and as unambiguous as possible, not a dramatic novel. How about just queue_restart_version? A poison pill would presumably kill something, and there's no connotation that it would be "reborn" (to continue anthropomorphizing - which I don't think we should).

Keep it clear and simple. Words in our code and interfaces matter, and Magento can definitely improve both.

@hostep
Copy link
Contributor Author

hostep commented Sep 5, 2019

Thanks for the feeback @keithbentrup!

Totally makes sense to keep wording simple. This was just a proposal, nothing is set in stone as of yet (I hope?), it was certainly not my intention to force the naming I came up with to actually being used, that's up to the persons actually writing out the specs or implementing it (I think?)

@tariqjawed83
Copy link
Contributor

thanks @hostep for all your effort,

@tariqjawed83 tariqjawed83 merged commit a12415f into magento:master Sep 9, 2019

The problem with this option, is that every time the cronjob `consumers_runner` runs, it will spawn a new consumer process, the consumer checks if messages are available and if none found it will kill itself. So this means it will spawn unneeded processes which take up memory, live for a very short period and then disappear again.

In this proposal, the flag `only-spawn-when-message-available` would run logic to check *before* the consumer process gets spawned and not *after*.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into implementing only-spawn-when-message-available and it seems it would require one of the following:

  1. Add and implement for both queue types a function that can get a count of the queued messages.
  2. Just dequeue a message in StartConsumerCommand core the consumer if we get no message then don't spawn a process. If we do get a message, requeue it using the queue reject function then spawn a process to handle it.

option 2 would be easier to implement but not ideal behavior.

The mysql queue implementation for Queue::getCount($status) would be easy to implement

It would take the statuses codes from the manager:
MESSAGE_STATUS_NEW = 2;
MESSAGE_STATUS_IN_PROGRESS = 3;
MESSAGE_STATUS_COMPLETE= 4;
MESSAGE_STATUS_RETRY_REQUIRED = 5;
MESSAGE_STATUS_ERROR = 6;
MESSAGE_STATUS_TO_BE_DELETED = 7;

The Amqp would be a little harder to implement
It doesn't have good translations for those statuses, so maybe the function would just have to be
Queue::getPendingCount()
The PhpAmqpLib\Channel\AMQPChannel class does not seem to implement a way to get a count of pending messages in a channel however.

Copy link
Contributor Author

@hostep hostep Oct 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Luwdo!

My bad, I forgot to mention this over here on Github (this was done in a conversation on Slack), but I created a proof of concept just for this:
hostep/magento2@d93d288...poc-consumer-improvements

The only problem with this is that it introduces a new method to an interface, which is considered backwards incompatible.

Copy link

@Luwdo Luwdo Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that getMessages and basic_get cause the message to get locked because you are claiming the message for your to process?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this and that doesn't seem to be the case, the messages still remain available in the queue after those calls. But feel free to test as well 🙂

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must be something else that locks the message. I can give this a go in a development enviorment, I might replace my patch this this option instead. This is a way better option.

@hostep
Copy link
Contributor Author

hostep commented Oct 17, 2019

Just for the record, a proof of concept for the only-spawn-when-message-available flag is available over here: hostep/magento2@d93d288...poc-consumer-improvements

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants