
add indexer to cluster info and metrics #1281


Merged

Conversation


@nikhilsinhaparseable nikhilsinhaparseable commented Mar 29, 2025

Add indexer server info to the cluster info API response.
Add indexer server metrics to the cluster metrics API response.

Works only with the enterprise build.

Summary by CodeRabbit

  • New Features

    • Enabled concurrent fetching of node information and metrics for quicker data retrieval.
    • Introduced a metadata abstraction to unify access to properties across different node types.
    • Added a node_type field to provide context about the type of cluster node.
  • Refactor

    • Reworked data fetching and error handling to streamline processing of cluster information and metrics.
    • Enhanced metadata handling in the metrics implementation for improved flexibility.


coderabbitai bot commented Mar 29, 2025

Walkthrough

The changes enhance the fetching of cluster information and metrics by implementing concurrency in the get_cluster_info and fetch_cluster_metrics functions. This allows simultaneous retrieval of ingestor and indexer data. New helper functions are introduced for fetching node-level data, along with improved error handling. A new Metadata trait is created and implemented for existing metadata types, standardizing access to domain and token information. Additionally, the metrics module is updated to utilize this trait, allowing for a more generic approach to processing Prometheus samples and API responses.

Changes

  • src/handlers/http/cluster/mod.rs — Refactored get_cluster_info and fetch_cluster_metrics to concurrently fetch ingestor and indexer data; added helper functions (fetch_node_info, fetch_nodes_info, fetch_node_metrics, fetch_nodes_metrics) with enhanced error mapping and logging.
  • src/handlers/http/modal/mod.rs — Introduced a new Metadata trait with methods domain_name, token, and node_type; implemented it for IngestorMetadata and IndexerMetadata to provide a unified interface for metadata access.
  • src/metrics/prom_utils.rs — Updated from_prometheus_samples and from_about_api_response to accept a generic Metadata type instead of a specific type, replacing direct field access with trait method calls for domain and token retrieval.
  • src/handlers/http/cluster/utils.rs — Added a new node_type field to the ClusterInfo struct; updated the constructor to accept node_type as a parameter.

Suggested labels

for next release

Suggested reviewers

  • parmesant

Poem

I'm a bunny in the code, hopping through the lines,
Fetching data concurrently, where efficiency shines.
Metadata traits and metrics now dance in sync,
As errors get handled with just a quick wink.
With every joyful jump and debug on the run,
I nibble on changes 'til the work is done!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 819dd98 and e42b4f7.

📒 Files selected for processing (4)
  • src/handlers/http/cluster/mod.rs (4 hunks)
  • src/handlers/http/cluster/utils.rs (3 hunks)
  • src/handlers/http/modal/mod.rs (1 hunks)
  • src/metrics/prom_utils.rs (8 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
src/handlers/http/cluster/utils.rs (1)
src/handlers/http/modal/mod.rs (3)
  • node_type (555-555)
  • node_type (566-568)
  • node_type (579-581)
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: coverage
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
🔇 Additional comments (23)
src/handlers/http/cluster/utils.rs (3)

58-58: Good addition of node type field to ClusterInfo struct

Adding the node_type field to the ClusterInfo struct will enable distinguishing between ingestor and indexer nodes, which is necessary for the enterprise build enhancements mentioned in the PR objectives.


69-69: Appropriate constructor parameter addition

The node_type parameter is correctly added to the new method signature to support the new field in the struct.


78-78: Proper field initialization

The node_type field is properly initialized by converting the string slice parameter to a String using to_string(), which is consistent with the pattern used for other string fields in this constructor.
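The comments above can be condensed into a minimal sketch of the ClusterInfo change; the other fields shown here (domain_name, status) are illustrative stand-ins, not the struct's actual full field list.

```rust
// Sketch of the reviewed change: ClusterInfo gains a node_type field,
// and the constructor converts the &str parameter with to_string(),
// matching the pattern used for the other string fields.
#[derive(Debug)]
struct ClusterInfo {
    domain_name: String, // illustrative stand-in field
    status: String,      // illustrative stand-in field
    node_type: String,   // new field: "ingestor" or "indexer"
}

impl ClusterInfo {
    fn new(domain_name: &str, status: &str, node_type: &str) -> Self {
        ClusterInfo {
            domain_name: domain_name.to_string(),
            status: status.to_string(),
            node_type: node_type.to_string(),
        }
    }
}
```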

src/handlers/http/modal/mod.rs (3)

552-556: Well-designed Metadata trait

The new Metadata trait is well-designed with three essential methods for accessing common properties needed for API operations. This abstraction will allow code to work with both ingestor and indexer nodes through a unified interface.


558-569: Clean implementation for IngestorMetadata

The Metadata trait implementation for IngestorMetadata is clean and straightforward, correctly returning references to the existing fields and hard-coding the node type as "ingestor".


571-582: Consistent implementation for IndexerMetadata

The Metadata trait implementation for IndexerMetadata mirrors the implementation for IngestorMetadata, providing a consistent interface. The node type is correctly hard-coded as "indexer".

src/metrics/prom_utils.rs (9)

21-21: Updated import for new Metadata trait

Correctly importing the new Metadata trait for use in this file.


40-40: Added node_type field to Metrics struct

Adding the node_type field to the Metrics struct allows for storing and distinguishing metrics from different node types, which is essential for the enterprise build functionality.


76-77: Default implementation for node_type

The default implementation correctly sets "ingestor" as the default node type, which maintains backward compatibility.


97-101: Updated constructor to support node type

The new method now correctly accepts and initializes the node_type field, allowing for explicit specification of the node type when creating a Metrics instance.


162-169: Generalized method using trait bounds

Excellent refactoring of from_prometheus_samples to use generics with T: Metadata trait bound, allowing it to work with both ingestor and indexer metadata. The code properly uses the trait methods to access the domain name and node type.


216-221: Updated error handling for generic metadata

The error handling has been updated to use more generic terminology ("server info" instead of "ingestor info"), which is appropriate given the function now works with multiple node types.


229-231: Generalized API response handling

The from_about_api_response method has been refactored to use the same generic approach with the Metadata trait, maintaining consistency with the other methods.


234-235: Trait method usage instead of direct field access

The code now correctly uses trait methods (domain_name() and token()) instead of direct field access, making it more flexible and aligned with the trait-based abstraction.

Also applies to: 243-244


261-263: Updated error messages

The error messages have been updated to use more generic terminology ("server" instead of "ingestor"), which is appropriate given the function now works with multiple node types.

Also applies to: 265-267

src/handlers/http/cluster/mod.rs (8)

21-21: Added futures imports for concurrency

Good addition of futures imports to support the new concurrent processing of ingestor and indexer data.


55-55: Updated import to include Metadata trait

The import has been updated to include the new Metadata trait, which is necessary for the generic implementations in this file.


545-578: Improved get_cluster_info with concurrent fetching

Excellent refactoring of get_cluster_info to fetch ingestor and indexer metadata concurrently, which improves performance. The error handling is robust, and the results are properly combined into a single response.


581-638: Well-implemented fetch_node_info helper function

The new fetch_node_info helper function cleanly encapsulates the logic for retrieving information about a single node, using the Metadata trait to work with both ingestor and indexer nodes. The error handling is thorough and the code properly passes the node type to the ClusterInfo constructor.


641-658: Efficient parallel fetch_nodes_info implementation

The fetch_nodes_info function efficiently fetches information for multiple nodes in parallel using Rust's async streams. The implementation is robust with proper error handling and result collection.
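The fan-out/collect shape described above can be sketched without the async machinery; this dependency-free version uses OS threads in place of the PR's stream::iter(..).buffer_unordered(n), and fetch_one is a placeholder for the per-node HTTP call.

```rust
use std::thread;

// Placeholder for an HTTP round trip to `domain`.
fn fetch_one(domain: String) -> Result<String, String> {
    Ok(format!("info from {domain}"))
}

// Fan out one worker per node, then gather results in order; collecting
// into Result<Vec<_>, _> aborts on the first error, mirroring how the
// reviewed function surfaces a failed node.
fn fetch_all(domains: Vec<String>) -> Result<Vec<String>, String> {
    let handles: Vec<_> = domains
        .into_iter()
        .map(|d| thread::spawn(move || fetch_one(d)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Note that threads make the sketch self-contained; the real implementation's buffer_unordered additionally caps in-flight requests, which is the knob the later review comment suggests tuning for large clusters.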


754-807: Well-designed fetch_node_metrics helper function

The fetch_node_metrics function follows the same pattern as fetch_node_info, providing a clean implementation for retrieving metrics from a single node with proper error handling and liveness checking.


810-832: Efficient parallel fetch_nodes_metrics implementation

The fetch_nodes_metrics function efficiently processes multiple nodes in parallel, with proper handling of results including cases where metrics couldn't be fetched from some nodes.


835-875: Comprehensive fetch_cluster_metrics refactoring

The fetch_cluster_metrics function has been comprehensively refactored to fetch both ingestor and indexer metrics concurrently and combine them into a single result. The implementation is clean with proper error handling throughout.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
src/metrics/prom_utils.rs (1)

159-215: Revise error message to avoid referencing “ingestor info” generically.
When mapping the error at lines 213-214, you still log “failed to get ingestor info.” Since this function is now generic and may be called on any server implementing Metadata, consider updating the message to “failed to get server info” to avoid confusion.

- error!("Fatal: failed to get ingestor info: {:?}", err);
+ error!("Fatal: failed to get server info: {:?}", err);
src/handlers/http/cluster/mod.rs (2)

580-637: Check for potential invalid domain panic.
Currently, a malformed domain in server.domain_name() causes an immediate .expect("should always be a valid url") at line 587. If domain names might ever be user-configured or come from external sources, consider error handling rather than panicking to improve system resilience.
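The suggested fix can be sketched as below; parse_base_url is a hypothetical stand-in for the real URL-parsing call behind the .expect() the review flags, shown only to illustrate propagating the error with `?` instead of panicking.

```rust
// Hypothetical validator standing in for the actual URL parse; returns
// Err for a malformed domain instead of panicking.
fn parse_base_url(domain: &str) -> Result<String, String> {
    if domain.starts_with("http://") || domain.starts_with("https://") {
        Ok(domain.trim_end_matches('/').to_string())
    } else {
        Err(format!("invalid domain: {domain}"))
    }
}

// `?` surfaces the error to the caller, as the review recommends for
// domains that may be user-configured or come from external sources.
fn build_info_url(domain: &str) -> Result<String, String> {
    let base = parse_base_url(domain)?;
    Ok(format!("{base}/api/v1/about"))
}
```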


639-657: Parallel fetching is well-designed.
Using stream::iter(...).buffer_unordered(servers_len) achieves concurrency without an explicit limit. If the cluster can grow large, you may want to impose a cap to avoid saturating resources, but that may not be a concern in smaller deployments.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7adcf12 and e1f5c52.

📒 Files selected for processing (3)
  • src/handlers/http/cluster/mod.rs (4 hunks)
  • src/handlers/http/modal/mod.rs (1 hunks)
  • src/metrics/prom_utils.rs (5 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
src/metrics/prom_utils.rs (3)
src/handlers/http/modal/mod.rs (5)
  • new (214-234)
  • new (389-409)
  • domain_name (553-553)
  • domain_name (558-560)
  • domain_name (568-570)
src/parseable/mod.rs (1)
  • new (140-162)
src/metadata.rs (1)
  • new (95-130)
src/handlers/http/cluster/mod.rs (2)
src/handlers/http/modal/mod.rs (5)
  • new (214-234)
  • new (389-409)
  • domain_name (553-553)
  • domain_name (558-560)
  • domain_name (568-570)
src/handlers/http/cluster/utils.rs (5)
  • new (35-47)
  • new (61-77)
  • new (92-110)
  • new (122-129)
  • check_liveness (172-192)
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: coverage
🔇 Additional comments (9)
src/handlers/http/modal/mod.rs (1)

552-575: Add optional docstrings for clarity.
Your newly introduced Metadata trait and its implementations are straightforward and correct. For future maintainability, consider adding brief Rust doc comments on each method (e.g. /// Returns the domain name for this server) to clarify usage. Otherwise, the implementation looks good.

src/metrics/prom_utils.rs (2)

21-21: No issues with the new import.
Using Metadata in this file is appropriate given the updated generic approach.


223-261: Sufficient for handling about API requests.
This generic function cleanly uses the trait methods for URL construction and authorization header. Error handling is logical and consistent with the rest of the codebase.

src/handlers/http/cluster/mod.rs (6)

21-21: No concerns with concurrency imports.
Imports from futures are properly used for parallel operations.


55-55: Trait usage is appropriate.
Referencing Metadata alongside IndexerMetadata and IngestorMetadata aligns well with your new generic approach.


544-578: Concurrency and error mapping look good.
Using future::join to retrieve both ingestor and indexer metadata concurrently is a clean approach. Errors are correctly surfaced, and combining results into one vector is efficient. Consider whether partial successes should be allowed if one node fails, but that depends on your product requirements.


752-808: Graceful handling of metrics fetch.
The pattern of returning Ok(None) when a server is down or unreachable is user-friendly and robust. The fallback logging warns about unreachable servers without throwing an immediate error. This approach appears consistent with the rest of the cluster logic.
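The Ok(None) liveness pattern praised above can be sketched as follows; is_live and the Metrics struct here are illustrative stand-ins for the real liveness probe and metrics type.

```rust
#[derive(Debug, PartialEq)]
struct Metrics {
    node: String,
}

// Placeholder for the real HTTP liveness probe.
fn is_live(domain: &str) -> bool {
    !domain.is_empty()
}

// An unreachable node yields Ok(None) rather than Err, so one dead node
// does not fail the whole cluster-metrics call; only unexpected failures
// would surface as errors.
fn fetch_node_metrics(domain: &str) -> Result<Option<Metrics>, String> {
    if !is_live(domain) {
        return Ok(None);
    }
    Ok(Some(Metrics {
        node: domain.to_string(),
    }))
}
```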


810-833: Good approach to accumulate server metrics concurrently.
Your approach to collect partial results while ignoring unreachable servers is appropriate. Errors are properly surfaced in the event of an unexpected failure.


835-876: Efficient concurrency for cluster-wide metrics.
Fetching ingestor and indexer metrics concurrently and then combining them is well structured. The code is clear and easy to maintain.

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 29, 2025
coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 29, 2025
@nitisht nitisht merged commit 382f480 into parseablehq:main Mar 31, 2025
14 of 16 checks passed
@nikhilsinhaparseable nikhilsinhaparseable deleted the indexer-cluster-metrics branch March 31, 2025 07:57