[Feature] cron.update_mirrors - LIMIT_SIZE config parameter #16982
Comments
Worker number and queue number should help to control that but it's not direct. |
In 1.16 the mirror queue will be a proper queue and we would recommend that you use TYPE=level or TYPE=redis queue if you have a lot of mirrors. |
Sounds good. Is there a config example? I can't find anything in https://github.com/go-gitea/gitea/blob/main/custom/conf/app.example.ini |
If you're happy to run redis for all your queues it would be as simple as:

[queue]
TYPE=redis
CONN_STR=; as per docs

To specifically make the mirror queue and the pull request task queues level queues then it's:

[queue.pr_patch_checker]
TYPE=level

[queue.mirror]
TYPE=level

That should do it. |
@zeripath currently I'm running my Gitea on a mini-server with 8GB RAM alongside PostgreSQL, Memcached, Nexus, ... . And if I want to optimize my mirror process then I need Redis. In that case I could replace Memcached with Redis. That could be possible. Perhaps it would be better to reduce the number of different tools that can be used with Gitea. ;) Because not every tool can be used for every operation. And the development process would be easier. But if I set [queue] and my mirror cron runs every 24 hours, then only the 2000 oldest mirrors will be updated. Right? This means that if I have 6000 mirrors, after 3 days all mirrors will be up to date. Right? |
@somera if you don't want to use The problem with using a persistable-channel queue for the mirror queue is that if you have 2001 mirrors and 2000 are queued, the 2001st request to push to the mirror queue will block. It is this blocking that is likely the cause of your repeated issues with opened or stuck processes. Not every call to queue.Push() is async'd with |
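To illustrate the blocking described above, here is a minimal sketch using Python's standard `queue.Queue` as a stand-in for a bounded queue. It is only an analogy, not Gitea's queue implementation; the helper `push_all` and the capacity of 2000 are assumptions chosen to match the numbers in this comment.

```python
import queue


def push_all(q, items):
    """Try to push every item without blocking; return the items that did not fit.

    A plain q.put(item) would instead wait at this point until a worker
    removed something from the queue, which is the blocking described above.
    """
    rejected = []
    for item in items:
        try:
            q.put_nowait(item)
        except queue.Full:
            rejected.append(item)
    return rejected


# Hypothetical capacity of 2000, with 2001 mirrors (numbered 0..2000) to enqueue.
mirror_queue = queue.Queue(maxsize=2000)
rejected = push_all(mirror_queue, range(2001))
```

With no worker draining the queue, only the last push fails to fit; a blocking push at that point would hang until space became available.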
OK. But what about this question? I'm trying to understand this new functionality. Is this what I "wanted" in my initial post? If I set [queue] and my mirror cron runs every 24 hours, then only the 2000 oldest mirrors will be updated. Right? This means that if I have 6000 mirrors, after 3 days all mirrors will be up to date. Right? |
Ah I think I now understand what you mean - you'd prefer to limit the number of mirrors added to the queue by cron.update_mirrors. OK let me take a look at that now. |
@zeripath right. I don't want to update all 6000 mirrors in one go. This takes ~3h at the moment, and I get blocked by GitHub (too many requests in xxx minutes). I'd rather split them up: on every update mirror cron call, Gitea should update only the xxxx least recently updated mirrors. |
Add `PULL_LIMIT` and `PUSH_LIMIT` to cron.update_mirror task to limit the number of mirrors added to the queue each time the cron task is run. Fix go-gitea#16982 Signed-off-by: Andrew Thornton <[email protected]>
@zeripath thx. So if I set (perhaps in 1.16.0)
then on the update_mirrors cron only the 1000 oldest mirrors will be updated? |
Each time the update_mirrors task is run only the oldest PULL_LIMIT pull mirrors and oldest PUSH_LIMIT push mirrors will be added to the queue. If the mirror is already in the queue it will not count towards the limit. So if the task limit is 3 say and you have repos A-N waiting to be updated and in increasing staleness, if A-E are already in the queue F, G and H will be added. |
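The selection rule in the comment above can be sketched as follows. This is an illustrative model only, not Gitea's code: the function `select_for_queue` is hypothetical, and it assumes the candidates are supplied in the order in which the task considers them.

```python
def select_for_queue(candidates, already_queued, limit):
    """Return up to `limit` candidates that are not already queued.

    Candidates are visited in the given order; entries already in the
    queue are skipped and do not count towards the limit.
    """
    selected = []
    for repo in candidates:
        if len(selected) >= limit:
            break
        if repo in already_queued:
            continue
        selected.append(repo)
    return selected


# Repos A..N waiting to be updated, with A..E already in the queue.
repos = [chr(c) for c in range(ord("A"), ord("N") + 1)]
queued = set("ABCDE")
picked = select_for_queue(repos, queued, limit=3)
```

With a limit of 3, `picked` contains F, G and H, matching the example in the comment.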
@zeripath after upgrading to 1.16.0 the update mirror process isn't working like in 1.15.x anymore. See #18607. And I don't understand the new process. I did ~9000 curl API calls to Gitea:
If I repeat the curl calls I see this
in the logs. And I set this:
But Gitea is not updating all the repos. Why? When will Gitea update all the repos where the mirror.updated_unix date is older than one day? |
Description
The update_mirrors cron updates all mirrors (where updated_unix is ...) in one go. In my case I'm running it once per day, and the cron then needs ~2.5h to update all >6000 mirrors. And if Gitea updates too many repos in one go, GitHub blocks Gitea for some minutes. In this case Gitea gets a 503 HTTP error.
I would therefore prefer a config parameter where I can set the batch size of the update_mirrors cron.
In this case the update_mirrors cron would run every hour and call the update for the 50 oldest mirrors (select * from repository where is_mirror = true order by updated_unix asc limit 50). And if LIMIT_SIZE is not set, then it would get all mirrors in the right order.
Or is this now possible?
There is a MIRROR_QUEUE_LENGTH config parameter. But I didn't find where it is used in the code.
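The proposed LIMIT_SIZE behaviour could look roughly like the sketch below. It runs the query from the description against a throwaway in-memory SQLite table. The table layout, the sample rows and the helper `oldest_mirrors` are assumptions for illustration, not Gitea's schema or code.

```python
import sqlite3


def oldest_mirrors(conn, limit_size=None):
    """Return mirror repo ids ordered by updated_unix, oldest first.

    If limit_size is None (LIMIT_SIZE not set), all mirrors are returned.
    """
    sql = "SELECT id FROM repository WHERE is_mirror = 1 ORDER BY updated_unix ASC"
    params = ()
    if limit_size is not None:
        sql += " LIMIT ?"
        params = (limit_size,)
    return [row[0] for row in conn.execute(sql, params)]


# Hypothetical sample data: odd ids are mirrors, higher ids were updated earlier.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repository (id INTEGER PRIMARY KEY, is_mirror INTEGER, updated_unix INTEGER)"
)
rows = [(i, 1 if i % 2 else 0, 1000 - i) for i in range(1, 11)]
conn.executemany("INSERT INTO repository VALUES (?, ?, ?)", rows)

batch = oldest_mirrors(conn, limit_size=3)
everything = oldest_mirrors(conn)
```

Here `batch` holds only the three least recently updated mirrors, while `everything` holds all five mirrors in the same order.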