Cron starts when it's already running #10650
Comments
Note to anybody working on this: ticket #10279 was closed as a duplicate of this one but contains additional details about the root cause that I had already tracked down.
@kanduvisla, thank you for your report.
Hey there. I'm working on it #SQUASHTOBERFEST
@gabrielqs-redstage Sorry, I was already working on it yesterday and have sent the PR. It was not marked as in progress because I forgot to do so while I had the ticket open on my side.
@ishakhsuvarov can you please assign this one to @diglin instead?
@gabrielqs-redstage reassigned.
Hi, does anyone know why concurrent tasks with the same job_code are not prevented by the following:
The query generated by the test was this:
It seems designed to prevent two concurrently running tasks with the same job code.
@s-hoffman this is in progress in #12497
@elvinristi, as I understand that ticket, it adds MySQL-based locking around the cron group's execution and does not explain why trySetJobUniqueStatusAtomic is failing to catch this error.
@s-hoffman I think what Magento does in $schedule->tryLockJob() is lock an individual scheduled execution run of a job, so that the specific job run is not triggered again. What it does not prevent is running the same job code (but a different scheduled execution run) at the same time, and it was never intended to do so. $schedule refers to a row in the cron_schedule table, where there are multiple rows per job code.
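To make the distinction in the comment above concrete, here is a minimal sketch of a lock that is scoped to a single schedule row. It is written in Python with an in-memory SQLite database rather than Magento's PHP and MySQL, and the table layout, the `try_lock_row` helper, and the `slow_job` job code are all assumptions made for illustration, not Magento's actual schema or code.

```python
import sqlite3

# Hypothetical, simplified stand-in for the cron_schedule table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cron_schedule ("
    "schedule_id INTEGER PRIMARY KEY, job_code TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO cron_schedule (schedule_id, job_code, status) VALUES (?, ?, ?)",
    [(1, "slow_job", "pending"), (2, "slow_job", "pending")],
)


def try_lock_row(conn, schedule_id):
    """Mark one schedule row as running; True only if this call changed it."""
    cur = conn.execute(
        "UPDATE cron_schedule SET status = 'running' "
        "WHERE schedule_id = ? AND status = 'pending'",
        (schedule_id,),
    )
    return cur.rowcount == 1


first = try_lock_row(conn, 1)   # the first scheduled run starts
repeat = try_lock_row(conn, 1)  # the same row cannot be started twice
second = try_lock_row(conn, 2)  # a different row with the same job_code still starts
print(first, repeat, second)    # True False True
```

With a row-scoped condition like this, the second schedule for the same job code starts while the first is still marked as running, which matches the behavior reported in this issue.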
@s-hoffman random thought, but does your database contain more than one schedule with the given status? I wonder if the code fails when there is already more than one entry in the database.
@paveq, it does not prevent [running the same job code (but different scheduled execution run) at the same time]? The SQL that is generated by the method is:
This clearly shows intent to check for currently running schedules, without checking scheduled time or any other timestamp field.
The comment also indicates an intent to implement locking. Also, there are integration tests in .\dev\tests\integration\testsuite\Magento\Cron\Model\ScheduleTest.php. The issue description above does include a temporal component ("But after 5 minutes, the other job"). Otherwise the tests seem to cover the scenario described above. @dmanners, I ran some tests locally using the SQL statement above and tried adding more schedules. I was unable to replicate it by adding extra jobs, though it was guesswork testing (low cost, low time investment).
@s-hoffman Indeed, it looks like this was introduced in 2.2 here: eacb702. Either it does not work, or if it does, it causes even larger issues: if we blindly check for "running" status, a crashed job will cause that job to never run again until the row is removed from the schedule table. In my opinion, MySQL-based locking that gets released when the connection closes is the better solution here, as it actually depends on the running process.
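The stale-row concern in the comment above can be sketched as follows. This is an illustrative Python and SQLite model, not the SQL Magento generates: the table layout, the `try_lock_job_code` helper, the `slow_job` job code, and the `error` status value used for cleanup are all assumptions.

```python
import sqlite3

# Hypothetical, simplified stand-in for the cron_schedule table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cron_schedule ("
    "schedule_id INTEGER PRIMARY KEY, job_code TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO cron_schedule (schedule_id, job_code, status) VALUES (?, ?, ?)",
    [(1, "slow_job", "pending"), (2, "slow_job", "pending"), (3, "slow_job", "pending")],
)


def try_lock_job_code(conn, schedule_id, job_code):
    """Start a row only if no row with the same job_code is marked running."""
    cur = conn.execute(
        "UPDATE cron_schedule SET status = 'running' "
        "WHERE schedule_id = ? AND status = 'pending' "
        "AND NOT EXISTS ("
        "  SELECT 1 FROM cron_schedule AS other"
        "  WHERE other.job_code = ? AND other.status = 'running')",
        (schedule_id, job_code),
    )
    return cur.rowcount == 1


started = try_lock_job_code(conn, 1, "slow_job")   # True: nothing is running yet
overlap = try_lock_job_code(conn, 2, "slow_job")   # False: row 1 is running

# Simulate a crash: row 1 is never moved out of 'running'.
blocked = try_lock_job_code(conn, 3, "slow_job")   # False: the stale row still blocks

# Only a manual cleanup of the stale row lets the job code run again.
conn.execute("UPDATE cron_schedule SET status = 'error' WHERE schedule_id = 1")
resumed = try_lock_job_code(conn, 2, "slow_job")   # True
print(started, overlap, blocked, resumed)          # True False False True
```

A status-based check prevents overlap but cannot tell a live job from a crashed one, which is why a lock tied to the lifetime of the running process or connection is argued for above.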
@paveq Actually, I just had an issue where some cron jobs didn't start after updating from 2.1 to 2.2, because there were old entries of failed jobs with running status in the schedule table.
@paveq, this ticket is currently labeled 'Reproduced on 2.2.x'.
@s-hoffman Yes, I can confirm we can reproduce the issue on 2.2.2 EE.
This issue also exists in CE 2.2.x. It's hilariously server-breaking.
What's the state of this? We have 2.2.2 CE running, and cron is not running correctly and needs to be fixed ASAP.
We wrote an extension to fix these bugs, speed up performance, and control the execution of tasks: https://github.com/magemojo/m2-ce-cron
@ericvhileman why did you decide to create a separate module instead of creating a PR with all your improvements?
@ihor-sviziev, I can only assume, but I have seen tickets and problems where already created pull requests were ignored for months. And as we know, major bugfixes often only get into the next major release, which happens once a year. As users working with production environments, we need solutions ASAP.
@johnny-longneck thank you for clarifying. Personally, I'd like to add these fixes directly into Magento; that will fix the issues for many more people than a separate module would. Is there a list of the issues and how they were fixed in your module? PS: I personally create a PR with all the needed changes and use the following article to apply these changes to our projects: https://twitter.com/s3lf/status/957993636058288128. As a result, these changes will be merged at some point, and we get the issue fixed in production right now. Also, if you have any issues with a long merge process, you can contact the community maintainers in Slack or just mention it in a comment; that will help speed up the process.
Yes, most people agree, @johnny-longneck: waiting on Magento to release a fix is a complete waste of time when problems exist in the present, not the next release, so we move on. Instead we get more 3rd-party stuff added in releases, which is basically an advertisement for a given vendor.
We're happy to work with the core team to get this merged into the core. We will be at Imagine in April at the hackathon and will discuss it further with the comm eng team. See 1.
I noticed that when I start a cron that's already running, it will run again. For example: I have a task that takes 15 minutes, but the cron executes it every 5 minutes. In the database the cron gets the status running and the other scheduled jobs have pending. But after 5 minutes, the other job (with the same code) also starts to run. So now I have the same job running twice!
Preconditions
Steps to reproduce
Expected result
The other jobs should wait for the first job to complete, even though they are scheduled to run every 5 minutes.
Actual result
The new job starts, ignoring the fact that it's already running...
I'm not sure if this is intended behavior or a bug. Am I expected to create my own lock / flag for this?
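For anyone who does end up adding their own lock as a workaround, one possible shape is a non-blocking lock that is released automatically when the holder closes it or the process exits. The sketch below uses Python's `fcntl.flock` on Unix as a stand-in; it is not Magento code, and the `try_acquire` helper and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile


def try_acquire(lock_path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(lock_path, "a")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


# Hypothetical lock file, one per job code.
lock_path = os.path.join(tempfile.mkdtemp(), "slow_job.lock")

first = try_acquire(lock_path)    # the first run gets the lock
second = try_acquire(lock_path)   # an overlapping run is refused and should skip
first.close()                     # finishing (or crashing) releases the lock
third = try_acquire(lock_path)    # the next run can start again
third.close()
print(first is not None, second is None, third is not None)
```

Because the operating system drops the lock when the file is closed, a crashed job does not leave a stale flag behind, unlike a status column that has to be reset by hand.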