bplatak
Contributor

@bplatak bplatak commented Oct 2, 2025

Add metrics

@bplatak bplatak requested a review from Copilot October 2, 2025 15:11
@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds comprehensive Prometheus metrics collection capabilities to the Pyth Observer system. The implementation includes metrics for price feeds, publisher states, API performance, alerts, and system health monitoring.

  • Introduces a centralized PythObserverMetrics class with 15+ metric types covering all aspects of the observer system
  • Replaces existing simple gauge metrics in dispatch.py with comprehensive metric collection and success rate tracking
  • Adds instrumentation throughout the observer lifecycle to capture API request timings, error rates, and system status

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `pyth_observer/metrics.py` | Defines the complete metrics collection system with gauges, counters, and histograms for all observer operations |
| `pyth_observer/dispatch.py` | Replaces basic gauge metrics with comprehensive check execution timing and success rate tracking |
| `pyth_observer/__init__.py` | Adds metrics instrumentation to API calls, price feed processing, and error handling in the main observer loop |


metrics.loop_errors_total.labels(error_type=type(e).__name__).inc()

logger.debug("Sleeping...")
metrics.observer_ready = 0
Copilot AI Oct 2, 2025

The metrics.observer_ready is set to 0 twice in a row (lines 208 and 211), which is redundant and likely indicates a copy-paste error.

Suggested change
- metrics.observer_ready = 0

Copilot uses AI. Check for mistakes.

Comment on lines 153 to 154
self.observer_up = 1
self.observer_ready = 0
Copilot AI Oct 2, 2025

Setting instance attributes observer_up and observer_ready on the metrics class will not update the actual Prometheus metrics. These should use the gauge's .set() method instead: self.observer_ready.set(0)

Suggested change
- self.observer_up = 1
- self.observer_ready = 0
+ self.observer_up.set(1)
+ self.observer_ready.set(0)

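To illustrate the point about `.set()`, here is a minimal sketch using only the standard library. `FakeGauge` and `Metrics` are hypothetical stand-ins, not the classes from this PR or from any Prometheus client; the stand-in models nothing beyond a `.set()` method that stores a value.

```python
class FakeGauge:
    """Hypothetical stand-in for a gauge; it only models .set()."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = float(value)


class Metrics:
    """Hypothetical holder with one gauge attribute."""

    def __init__(self) -> None:
        self.observer_ready = FakeGauge()


# Case 1: plain assignment rebinds the attribute to an int.
rebound = Metrics()
gauge_before = rebound.observer_ready  # the object an exporter would read
rebound.observer_ready = 1
assert gauge_before.value == 0.0  # the gauge itself never changed
assert isinstance(rebound.observer_ready, int)

# Case 2: calling .set() updates the gauge in place.
updated = Metrics()
gauge_after = updated.observer_ready
updated.observer_ready.set(1)
assert gauge_after.value == 1.0
assert isinstance(updated.observer_ready, FakeGauge)
```

In the first case any later `.set()` call on the attribute would also fail, because the attribute is now an `int`.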

Comment on lines +83 to 86
# global states
states = []
while True:
    try:
Copilot AI Oct 2, 2025

The states list is initialized inside the while loop scope but declared outside it. This will cause the list to accumulate data across iterations, leading to memory growth and incorrect metrics. Move this initialization inside the while loop.

Suggested change
- # global states
- states = []
- while True:
-     try:
+ while True:
+     try:
+         states = []

Contributor Author

no it won't. L107 will clear it
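A small sketch of the disagreement, under the assumption that the loop body resets the list on every pass. The thread does not show how the reset is done, so `states.clear()` and the function `run_iterations` are illustrative guesses, with a `for` loop standing in for `while True`.

```python
def run_iterations(batches, reset):
    """Return the length of `states` after each loop pass."""
    states = []  # declared once, outside the loop
    sizes = []
    for batch in batches:  # stands in for `while True`
        if reset:
            states.clear()  # assumed per-iteration reset
        states.extend(batch)
        sizes.append(len(states))
    return sizes


batches = [[1, 2], [3], [4, 5, 6]]
with_reset = run_iterations(batches, reset=True)
without_reset = run_iterations(batches, reset=False)
assert with_reset == [2, 1, 3]      # no growth across passes
assert without_reset == [2, 3, 6]   # accumulates, as the review warned
```

If the list is cleared on each pass, declaring it outside the loop does not by itself cause growth.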

self.alerts_active.labels(alert_type=alert_type).set(count)

if sent_alert:
    alert_type = sent_alert.split("-")[0]
Copilot AI Oct 2, 2025

The code assumes sent_alert contains a dash character without validation. If sent_alert doesn't contain a dash, split('-')[0] will return the entire string, but this could lead to unexpected behavior. Consider adding validation or using a more robust parsing method.

Suggested change
- alert_type = sent_alert.split("-")[0]
+ if "-" in sent_alert:
+     alert_type = sent_alert.split("-", 1)[0]
+ else:
+     alert_type = sent_alert  # or use "unknown" if preferred

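The behaviour in question can be checked directly. The sketch below shows what `split("-")[0]` returns with and without a dash, plus one possible stricter variant built on `str.partition`. The identifier strings and the helper `alert_type_of` are made up for illustration; the real format of `sent_alert` is not shown in this thread.

```python
# split("-")[0] never raises: without a dash it returns the whole string.
assert "price_stall-BTC".split("-")[0] == "price_stall"
assert "nodash".split("-")[0] == "nodash"


def alert_type_of(sent_alert: str) -> str:
    """Return the text before the first dash, or "unknown" if there is none."""
    head, sep, _rest = sent_alert.partition("-")
    return head if sep else "unknown"


assert alert_type_of("price_stall-BTC") == "price_stall"
assert alert_type_of("a-b-c") == "a"
assert alert_type_of("nodash") == "unknown"
```

Whether a dashless identifier should map to itself or to a fixed fallback label depends on how the label is used downstream.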

Have faith.

Comment on lines +235 to +238
metrics.alerts_sent_total.labels(
    alert_type=info["type"],
    channel=event_type.lower().replace("event", ""),
).inc()
Copilot AI Oct 2, 2025

The variable event_type is not defined in this scope. This will cause a NameError when this code path is executed.

Contributor Author

umm i think it is tho
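Separately from whether the name is in scope, here is a sketch of what the `channel` label expression produces. The event type names are assumed for illustration, and both helper functions are hypothetical. The second variant strips only a trailing "event" and is shown for comparison.

```python
def channel_label(event_type: str) -> str:
    """Same expression as in the diff: lower-case, then drop every "event"."""
    return event_type.lower().replace("event", "")


def channel_label_suffix_only(event_type: str) -> str:
    """Alternative: drop "event" only when it is the trailing suffix."""
    name = event_type.lower()
    return name[: -len("event")] if name.endswith("event") else name


assert channel_label("DatadogEvent") == "datadog"
assert channel_label("LogEvent") == "log"

# The two variants differ when "event" also appears earlier in the name.
assert channel_label("EventBridgeEvent") == "bridge"
assert channel_label_suffix_only("EventBridgeEvent") == "eventbridge"
```

For names that contain "event" only as a suffix, both variants give the same label.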

@bplatak bplatak merged commit c5ecb0f into main Oct 2, 2025
3 checks passed
@bplatak bplatak deleted the feat/reliability/add-metrics-1 branch October 2, 2025 15:22