Expose hashsum of the config files #2874
851840c to 9dd5145
@annanay25 if you get the chance, can you take a look at this?
Thanks for your first PR! This looks like a low-risk change, although I'm not sure what it's trying to solve. For an admin trying to see whether their Cortex instances use the same config file, I think a pre-expansion hash would make a tiny bit more sense. You make a good point about
I think I have buried that in the commit message: This allows to monitor the roll-out of a new config file across the
That was my feeling too, so I am assuming it's generally the node-specific parts that are expanded using environment variables.
I will have a look at how this can be achieved and will come back to you on this.
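To illustrate the pre-expansion versus post-expansion point discussed above, here is a minimal Go sketch. It is not code from this PR: the config snippet, the `instance_id` key, and the helpers `hashHex` and `expand` are hypothetical, and `os.Expand` merely stands in for whatever expansion Cortex actually performs. It shows that hashing the raw file gives one value for every node, while hashing after node-specific expansion gives a different value per node.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// hashHex returns the hex-encoded SHA-256 digest of b.
func hashHex(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// expand replaces ${VAR} references in raw with values from env.
// Assumption: this only imitates environment-variable expansion for the demo.
func expand(raw string, env map[string]string) string {
	return os.Expand(raw, func(key string) string { return env[key] })
}

func main() {
	// Hypothetical config file shared by all nodes.
	raw := "instance_id: ${HOSTNAME}\n"

	nodeA := expand(raw, map[string]string{"HOSTNAME": "node-a"})
	nodeB := expand(raw, map[string]string{"HOSTNAME": "node-b"})

	// The raw file is byte-identical everywhere, so its hash is too.
	fmt.Println("pre-expansion hash: ", hashHex([]byte(raw)))
	// After expansion the content, and therefore the hash, is node-specific.
	fmt.Println("post-expansion node-a:", hashHex([]byte(nodeA)))
	fmt.Println("post-expansion node-b:", hashHex([]byte(nodeB)))
}
```

Under these assumptions, a pre-expansion hash is the one that can be compared across instances to check that they were all given the same file.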
9dd5145 to 8cb8d39
- Name: "cortex_overrides_last_reload_successful",
- Help: "Whether the last config reload attempt was successful.",
+ Name: "cortex_runtime_config_last_reload_successful",
+ Help: "Whether the last runtime-config reload attempt was successful.",
I have changed that because I found this mismatch between the CLI argument and the metric name quite confusing.
I'm not sure whether this is subject to a deprecation process, or whether we should just use cortex_overrides for both.
I think it's OK to change (if you mention it in CHANGELOG.md and update existing alerts too in https://github.com/grafana/cortex-jsonnet repo) but I'm not sure if metric names are actually part of our V1 guarantees. It's not mentioned in the docs/configuration/v1-guarantees.md doc. @gouthamve what do you say?
Checked with Goutham: we don't cover metrics in the guarantees, so it's OK to rename it (with the changelog and alerts updated).
Yep, following this, let's make the changes in the cortex-mixin as well.
@simonswine As Annanay correctly mentioned, please don't forget to open a PR to update the mixin once this PR is merged, thanks! 🙏 🙏 🙏
I have updated the code to also support the runtime config. A couple of notes:
It would be great if I could get some input from one of you, @annanay25 / @pstibrany.
Try to look for usage of
I think separate metrics are preferable in this case, since these are different components of the system.
Will try.
24e42d9 to e83f01e
👋 @simonswine! First time contribution? Amazing! Looking forward to many more ❤️
Please remember to add a CHANGELOG entry. The cortex_overrides_last_reload_successful rename should be a [CHANGE], while the other changes should be an [ENHANCEMENT] (so two different entries). Also don't forget to add the PR number (look at other entries as an example).
e83f01e to 0d43f84
I think I have addressed most of the feedback. GitHub will not show my pushes for a while, as they have some problems, but once you are on 696b9bfe9, I should have addressed most concerns.
0d43f84 to 9a9d77b
Good job on your 🥇 time contribution! LGTM and looking forward to many more 🚀
As part of cortexproject/cortex#2874 the metric was renamed. This PR renames the alert accordingly and supports both the old and the new metric name.
LGTM, thank you! (I've left some non-blocking nits.)
This allows monitoring the roll-out of a new config file across the cluster and can help to detect a mismatch in active config files. It supports both start-up and runtime config. Signed-off-by: Christian Simon <[email protected]>
e0b8dd8 to 9c44169
I think most, if not all, feedback should be addressed now 🎉 The change in the jsonnet alerts has happened here: grafana/cortex-jsonnet#139
LGTM, thank you very much for your work and addressing our feedback!
Thanks for your responsive feedback. Have a great weekend 👋
This allows monitoring the roll-out of new config file versions to the various nodes of a cluster. The metric was added as part of cortexproject/cortex#2874.
What this PR does:
This adds metrics exposing the sha256 hash of the config files used (before expansion). This allows monitoring the roll-out of a new config file across the cluster and can help to detect a mismatch in active configuration files.
The metric would look like this:
Because no expansion takes place, this matches the hash of the local file:
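A minimal sketch of how such a pre-expansion hash can be computed, using only the Go standard library. This is an illustration rather than the code added by this PR: `configHash` is a hypothetical helper, and the sample bytes stand in for the contents of a real config file.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// configHash returns the hex-encoded SHA-256 digest of the raw config
// bytes, taken exactly as read from disk (no expansion applied).
func configHash(raw []byte) string {
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical file contents; a real caller would read the config file.
	raw := []byte("auth_enabled: false\n")
	fmt.Println(configHash(raw))
}
```

Because the digest is taken over the unmodified file bytes, running `sha256sum` on the same file should print the same hex string, which is what makes the value easy to check against a local copy.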
Which issue(s) this PR fixes:
No issue yet
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]