feat(reliability) Add metrics #91
Pull Request Overview
This PR adds comprehensive Prometheus metrics collection capabilities to the Pyth Observer system. The implementation includes metrics for price feeds, publisher states, API performance, alerts, and system health monitoring.
- Introduces a centralized PythObserverMetrics class with 15+ metric types covering all aspects of the observer system
- Replaces existing simple gauge metrics in dispatch.py with comprehensive metric collection and success rate tracking
- Adds instrumentation throughout the observer lifecycle to capture API request timings, error rates, and system status
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pyth_observer/metrics.py | Defines the complete metrics collection system with gauges, counters, and histograms for all observer operations |
| pyth_observer/dispatch.py | Replaces basic gauge metrics with comprehensive check execution timing and success rate tracking |
| pyth_observer/__init__.py | Adds metrics instrumentation to API calls, price feed processing, and error handling in the main observer loop |
pyth_observer/__init__.py
Outdated
metrics.loop_errors_total.labels(error_type=type(e).__name__).inc()

logger.debug("Sleeping...")
metrics.observer_ready = 0
Copilot AI, Oct 2, 2025
The metrics.observer_ready attribute is set to 0 twice in a row (lines 208 and 211), which is redundant and likely indicates a copy-paste error.
Suggested change:
- metrics.observer_ready = 0
pyth_observer/metrics.py
Outdated
self.observer_up = 1
self.observer_ready = 0
Copilot AI, Oct 2, 2025
Setting the instance attributes observer_up and observer_ready on the metrics class will not update the actual Prometheus metrics; plain assignment only rebinds the attribute. These should use the gauge's .set() method instead, e.g. self.observer_ready.set(0).
Suggested change:
- self.observer_up = 1
- self.observer_ready = 0
+ self.observer_up.set(1)
+ self.observer_ready.set(0)
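To illustrate why plain assignment does not update the exported value, here is a minimal sketch. It uses a hypothetical stand-in `FakeGauge` class rather than the real Prometheus client gauge, and a hypothetical `Metrics` holder rather than the PR's `PythObserverMetrics`; only the `.set()` method name mirrors the gauge API mentioned in the comment.

```python
class FakeGauge:
    """Hypothetical stand-in for a Prometheus gauge (not the real library class)."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        # Mutates the gauge object that an exporter would still hold a reference to.
        self.value = float(value)


class Metrics:
    """Hypothetical metrics holder, loosely modeled on the class under review."""

    def __init__(self) -> None:
        self.observer_ready = FakeGauge()


metrics = Metrics()
exported = metrics.observer_ready  # the reference an exporter or registry would keep

# Buggy: rebinds the attribute to an int; the exported gauge is untouched.
metrics.observer_ready = 1
buggy_exported_value = exported.value
attribute_is_gauge = isinstance(metrics.observer_ready, FakeGauge)

# Fixed: keep the gauge object and update it through .set().
metrics.observer_ready = exported
metrics.observer_ready.set(1)
fixed_exported_value = exported.value
```

After the buggy assignment the attribute is an `int` and the object the exporter reads still reports 0, which is the failure mode the suggestion is guarding against.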
# global states
states = []
while True:
    try:
Copilot AI, Oct 2, 2025
The states list is initialized once, outside the while loop, so it will accumulate data across iterations, leading to memory growth and incorrect metrics. Move this initialization inside the while loop.
Suggested change:
- # global states
- states = []
- while True:
-     try:
+ while True:
+     try:
+         states = []
no it won't. L107 will clear it
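The disagreement comes down to whether the list is emptied on every pass. A minimal sketch, assuming hypothetical batch data and a `states.clear()` call standing in for whatever the referenced line does (that line is not shown in this conversation):

```python
def run_loop(batches: list[list[str]], clear_each_iteration: bool) -> list[int]:
    """Return the size of `states` after each iteration."""
    states: list[str] = []  # declared once, outside the loop
    sizes: list[int] = []
    for batch in batches:  # finite stand-in for `while True`
        if clear_each_iteration:
            states.clear()  # assumed equivalent of the clearing step in the reply
        states.extend(batch)
        sizes.append(len(states))
    return sizes


batches = [["a", "b"], ["c"], ["d", "e", "f"]]
sizes_with_clear = run_loop(batches, clear_each_iteration=True)
sizes_without_clear = run_loop(batches, clear_each_iteration=False)
```

With the clearing step the list only ever holds the current batch; without it the sizes grow cumulatively, which is the scenario the bot's comment describes.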
self.alerts_active.labels(alert_type=alert_type).set(count)

if sent_alert:
    alert_type = sent_alert.split("-")[0]
Copilot AI, Oct 2, 2025
The code assumes sent_alert contains a dash character without validating it. If sent_alert doesn't contain a dash, split('-')[0] returns the entire string, which could lead to unexpected behavior. Consider adding validation or using a more robust parsing method.
Suggested change:
- alert_type = sent_alert.split("-")[0]
+ if "-" in sent_alert:
+     alert_type = sent_alert.split("-", 1)[0]
+ else:
+     alert_type = sent_alert  # or use "unknown" if preferred
Have faith.
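For reference, a small sketch of how the parsing behaves with and without a dash. The identifier strings and the helper name `alert_type_from_identifier` are hypothetical; the real alert identifier format is not shown in this conversation.

```python
def alert_type_from_identifier(sent_alert: str) -> str:
    # str.partition returns (head, sep, tail); head is the whole
    # string when "-" is absent, so no explicit check is needed.
    head, _sep, _tail = sent_alert.partition("-")
    return head


with_dash = "PublisherOfflineCheck-Crypto.BTC/USD-abc"  # hypothetical identifier
without_dash = "PublisherOfflineCheck"                  # hypothetical identifier

split_with_dash = with_dash.split("-")[0]
split_without_dash = without_dash.split("-")[0]
partition_with_dash = alert_type_from_identifier(with_dash)
partition_without_dash = alert_type_from_identifier(without_dash)
```

Both `split("-")[0]` and `partition("-")[0]` return the full string when no dash is present, so the suggested `if "-" in sent_alert` branch produces the same result as the original line; it only matters if a different fallback such as "unknown" is wanted.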
metrics.alerts_sent_total.labels(
    alert_type=info["type"],
    channel=event_type.lower().replace("event", ""),
).inc()
Copilot AI, Oct 2, 2025
The variable event_type is not defined in this scope. This will cause a NameError when this code path is executed.
umm i think it is tho
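Whichever way the scoping question is settled, the label expression itself can be checked in isolation. A minimal sketch, assuming `event_type` is a class-name string such as "DatadogEvent" (the example names below are assumptions, not taken from the diff):

```python
def channel_label(event_type: str) -> str:
    # Same expression as in the diff: lowercase, then drop every "event" substring.
    return event_type.lower().replace("event", "")


def channel_label_suffix_only(event_type: str) -> str:
    # Alternative (Python 3.9+): strip only a trailing "event".
    return event_type.lower().removesuffix("event")


datadog = channel_label("DatadogEvent")
log = channel_label("LogEvent")
all_removed = channel_label("EventBridgeEvent")
suffix_removed = channel_label_suffix_only("EventBridgeEvent")
```

Note that `replace` removes every occurrence of "event", not just the suffix, which only matters if an event name contains the word elsewhere.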