-
-
Notifications
You must be signed in to change notification settings - Fork 137
add indexer to cluster info and metrics #1281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add indexer to cluster info and metrics #1281
Conversation
add indexer server info in cluster info api response add indexer server metrics in cluster metrics api response works only with enterprise build
WalkthroughThe changes enhance the fetching of cluster information and metrics by implementing concurrency in the Changes
Suggested labels
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (4)
🧰 Additional context used🧬 Code Definitions (1)src/handlers/http/cluster/utils.rs (1)
⏰ Context from checks skipped due to timeout of 90000ms (10)
🔇 Additional comments (23)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/metrics/prom_utils.rs (1)
159-215
: Revise error message to avoid referencing “ingestor info” generically.
When mapping the error at lines 213-214, you still log “failed to get ingestor info.” Since this function is now generic and may be called on any server implementingMetadata
, consider updating the message to “failed to get server info” to avoid confusion.- error!("Fatal: failed to get ingestor info: {:?}", err); + error!("Fatal: failed to get server info: {:?}", err);src/handlers/http/cluster/mod.rs (2)
580-637
: Check for potential invalid domain panic.
Currently, a malformed domain inserver.domain_name()
causes an immediate.expect("should always be a valid url")
at line 587. If domain names might ever be user-configured or come from external sources, consider error handling rather than panicking to improve system resilience.
639-657
: Parallel fetching is well-designed.
Usingstream::iter(...).buffer_unordered(servers_len)
achieves concurrency without an explicit limit. If the cluster can grow large, you may want to impose a cap to avoid saturating resources, but that may not be a concern in smaller deployments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/handlers/http/cluster/mod.rs
(4 hunks)src/handlers/http/modal/mod.rs
(1 hunks)src/metrics/prom_utils.rs
(5 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
src/metrics/prom_utils.rs (3)
src/handlers/http/modal/mod.rs (5)
new
(214-234)new
(389-409)domain_name
(553-553)domain_name
(558-560)domain_name
(568-570)src/parseable/mod.rs (1)
new
(140-162)src/metadata.rs (1)
new
(95-130)
src/handlers/http/cluster/mod.rs (2)
src/handlers/http/modal/mod.rs (5)
new
(214-234)new
(389-409)domain_name
(553-553)domain_name
(558-560)domain_name
(568-570)src/handlers/http/cluster/utils.rs (5)
new
(35-47)new
(61-77)new
(92-110)new
(122-129)check_liveness
(172-192)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Build Default aarch64-unknown-linux-gnu
- GitHub Check: Build Default aarch64-apple-darwin
- GitHub Check: Build Default x86_64-pc-windows-msvc
- GitHub Check: Quest Smoke and Load Tests for Standalone deployments
- GitHub Check: Build Kafka x86_64-unknown-linux-gnu
- GitHub Check: Build Kafka aarch64-apple-darwin
- GitHub Check: Build Default x86_64-apple-darwin
- GitHub Check: Quest Smoke and Load Tests for Distributed deployments
- GitHub Check: Build Default x86_64-unknown-linux-gnu
- GitHub Check: coverage
🔇 Additional comments (9)
src/handlers/http/modal/mod.rs (1)
552-575
: Add optional docstrings for clarity.
Your newly introducedMetadata
trait and its implementations are straightforward and correct. For future maintainability, consider adding brief Rust doc comments on each method (e.g./// Returns the domain name for this server
) to clarify usage. Otherwise, the implementation looks good.src/metrics/prom_utils.rs (2)
21-21
: No issues with the new import.
UsingMetadata
in this file is appropriate given the updated generic approach.
223-261
: Sufficient for handling about API requests.
This generic function cleanly uses the trait methods for URL construction and authorization header. Error handling is logical and consistent with the rest of the codebase.src/handlers/http/cluster/mod.rs (6)
21-21
: No concerns with concurrency imports.
Imports fromfutures
are properly used for parallel operations.
55-55
: Trait usage is appropriate.
ReferencingMetadata
alongsideIndexerMetadata
andIngestorMetadata
aligns well with your new generic approach.
544-578
: Concurrency and error mapping look good.
Usingfuture::join
to retrieve both ingestor and indexer metadata concurrently is a clean approach. Errors are correctly surfaced, and combining results into one vector is efficient. Consider whether partial successes should be allowed if one node fails, but that depends on your product requirements.
752-808
: Graceful handling of metrics fetch.
The pattern of returningOk(None)
when a server is down or unreachable is user-friendly and robust. The fallback logging warns about unreachable servers without throwing an immediate error. This approach appears consistent with the rest of the cluster logic.
810-833
: Good approach to accumulate server metrics concurrently.
Your approach to collect partial results while ignoring unreachable servers is appropriate. Errors are properly surfaced in the event of an unexpected failure.
835-876
: Efficient concurrency for cluster-wide metrics.
Fetching ingestor and indexer metrics concurrently and then combining them is well structured. The code is clear and easy to maintain.
add indexer server info in cluster info api response
add indexer server metrics in cluster metrics api response
works only with enterprise build
Summary by CodeRabbit
New Features
node_type
field to provide context about the type of cluster node.Refactor