-
Notifications
You must be signed in to change notification settings - Fork 832
Alertmanager: Remove outdated comment #3216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[...] wanted to see whenever this was an acceptable solution - An alternative here is that we stop the Alertmanager completely if we fail to load the last configuration and symbolise the user the configuration did not work.
I have a question: how is the new metric different then what cortex_alertmanager_config_invalid
already does?
pkg/alertmanager/multitenant.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[open question] What if we keep the metric name specular with Prometheus AlertManager, so cortex_alertmanager_config_last_reload_success_timestamp_seconds
?
pkg/alertmanager/multitenant.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below we do am.multitenantMetrics.invalidConfig.DeleteLabelValues(user)
. Shouldn't we also do it for this new metric?
@gotjosh Is there still interest in this PR? |
This was superseded by #3289 but the comments are still relevant - I'll modify the PR, as I think the comments are very relevant for upcoming work. |
Signed-off-by: gotjosh <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// 1) the user had a previous alertmanager | ||
// 2) then, submitted a non-working configuration (and we kept running the prev working config) | ||
// 3) finally, the cortex AM instance is restarted and the running version is no longer present | ||
if userAmConfig == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion(non-blocking,if-minor): Since this check has been moved from transformConfig
consider removing the transformConfig
function entirely an moving the for loop logic from the function here. transformConfig
is only called here so it should be safe to move and I think it would still be quite readable as part of a single function call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Signed-off-by: gotjosh <[email protected]>
@pracucci This should be good to go! |
What this PR does:
Adds a new metric to keep track of when was the last uploaded configuration successfully applied to the Alertmanager. Also, improves documentation.
Still needs a test but wanted to see whenever this was an acceptable solution - An alternative here is that we stop the Alertmanager completely if we fail to load the last configuration and symbolise the user the configuration did not work.
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]