Set up alert/notification when GitHub bot errored out #241
Currently there are three things you can do. One, try to make the bot do something again. 😉 Two, if you have admin rights on the repo in question you can check the webhooks history and see if the request got dropped on the floor (we have found our bots are more reliable than GitHub, though, so it's quite possible this request got mangled by GitHub or simply dropped on the floor by GH and never sent to the bot). And three, bot deployments for Ni and Bedevere are announced on python.zulipchat.com so you can see if it correlates with a very recent deployment (which happens after every merged PR so you can always check to see if that broke something as well). Otherwise I'm not quite sure what you might be after. Some basic log of the last 20 requests and whether there was any error? The problem with that is the bot is purposefully stateless, so having to store that would be unfortunate. Maybe if someone wanted to put in the effort to have a Redis cache or something that only tracked requests that people could look at? That way if the bot went down we wouldn't care about the data loss (although that's still work).
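To make the "basic log of the last 20 requests" idea concrete, here is a minimal sketch. It is not how Ni or Bedevere work: it uses an in-process `collections.deque` as a stand-in for the Redis cache mentioned above, and the names `DeliveryRecord` and `RecentDeliveries` are hypothetical. Like the Redis idea, it deliberately accepts losing the data whenever the bot restarts.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass(frozen=True)
class DeliveryRecord:
    """One handled webhook request (hypothetical shape, for illustration)."""
    delivery_id: str
    event: str
    error: Optional[str] = None  # None means the request was handled cleanly


class RecentDeliveries:
    """Keep only the most recent `maxlen` requests; older ones are dropped."""

    def __init__(self, maxlen: int = 20) -> None:
        # A deque with maxlen discards the oldest entry on overflow.
        self._records: Deque[DeliveryRecord] = deque(maxlen=maxlen)

    def record(self, delivery_id: str, event: str,
               error: Optional[str] = None) -> None:
        self._records.append(DeliveryRecord(delivery_id, event, error))

    def snapshot(self) -> List[DeliveryRecord]:
        """Return the retained records, oldest first."""
        return list(self._records)

    def failures(self) -> List[DeliveryRecord]:
        """Return only the retained records that ended in an error."""
        return [r for r in self._records if r.error is not None]
```

Because the log is bounded and lives only in memory, the bot itself stays effectively stateless: nothing has to be persisted or migrated, and a restart simply empties the log.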
I'm not sure what might help either :) Maybe a Zulip channel that the bots post activity updates to? It'd be pretty spammy so it would need to be a dedicated channel, but would give a place to go check and see if retrying the request is likely to achieve anything.
One thing I normally do is redeliver the webhook and monitor the log in Heroku at the same time. It seems the label was applied correctly in the end, so as Brett said, it could have been a GitHub timeout issue. I think it would be a good idea to set up some sort of notification/alert whenever we receive a 500 error from the web service, so I'll update the description of this issue :)
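As a rough illustration of "alert whenever we receive a 500 error", the sketch below wraps a request handler so that any unhandled exception is passed to a notification callback before a 500 is returned. It is framework-agnostic and not taken from the bots' code: `with_error_alert`, `handler`, and `notify` are hypothetical names, and `notify` could be anything that posts a message somewhere visible.

```python
from typing import Callable, Tuple


def with_error_alert(
    handler: Callable[[dict], str],
    notify: Callable[[str], None],
) -> Callable[[dict], Tuple[int, str]]:
    """Wrap `handler` so unhandled errors trigger `notify` and yield a 500."""

    def wrapped(payload: dict) -> Tuple[int, str]:
        try:
            body = handler(payload)
        except Exception as exc:
            # Report the failure, then answer with a generic 500 response.
            notify(f"{type(exc).__name__}: {exc}")
            return 500, "Internal Server Error"
        return 200, body

    return wrapped
```

Keeping the notification behind a plain callback means the alert destination (an issue comment, a chat channel, an email) can change without touching the handler.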
Trying new things here :) Any errors arising from the miss-islington Heroku app will be posted as comments to #257.
Hah, nice :)
@Mariatta if you want access to the Knights on Heroku just let me know.
@brettcannon Sure, please give me access to it. Thanks :)
@Mariatta request sent to PSF infrastructure (turns out I don't have management rights for the app).
Thanks, Brett. I've set up the alerts. Once a day, if there is any error in one of the bots, we'll receive the notification as comments in #257, #258, and #259. For those curious about how to set up the alert, these are my steps:
I'm just trying to find ways to get these alerts for free and without needing to write additional code 😛 I could have received email notifications instead, but I figured we'd all have more visibility into the errors this way. Since we have notifications now, I'm closing this issue.
I removed the CLA not signed marker from python/cpython#6699 (comment) a couple of hours ago, and whereas the knights-who-say-ni normally respond to that near-instantaneously, on this occasion, the bot hasn't responded at all.
Is there currently a way to check whether or not the bots are up and running as expected?