Conversation

@jmcarp (Contributor) commented Nov 7, 2025

We use qorb as the connection pooler for ClickHouse within oximeter, and inherit qorb's default pool size (`max_slots`) of 16. We may be saturating that pool for some OxQL workloads, such as the proposed otel receiver, which runs many small queries in parallel against ClickHouse. This patch makes the pool size configurable so that we can adjust it if necessary.

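For context, a minimal sketch of what the knob might look like on the config side, assuming a serde-deserialized struct; the struct name, derive, and `Option<usize>` type are illustrative, and only the `max_slots` field name comes from the patch below:

```rust
use serde::Deserialize;

/// Illustrative config section for the timeseries database. Only the
/// `max_slots` field is taken from the patch; the struct name and type
/// are assumptions for the sake of the example.
#[derive(Debug, Deserialize)]
struct TimeseriesDbConfig {
    /// Maximum number of slots in the qorb connection pool for
    /// ClickHouse. When unset, qorb's default of 16 applies.
    max_slots: Option<usize>,
}
```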
@jmcarp requested a review from bnaecker November 7, 2025 17:07
```rust
};
let mut timeseries_policy = Policy::default();
// Start from qorb's defaults and override only the configured cap, so
// the other policy fields keep their default values.
if let Some(max_slots) = config.pkg.timeseries_db.max_slots {
    timeseries_policy.max_slots = max_slots;
}
```
Collaborator:

This is the maximum across all backends. Is that what you want to cap? Or a maximum for each backend?

Contributor (Author):

Here's what I was thinking. I think I have queries queueing up when I run the otel receiver against oximeter, and I wanted to see if increasing the connection pool size might help. I don't have a good mental model of qorb: are there multiple backends when we're managing a connection pool for a single database instance, as we are here? For my particular use case, I'm interested in bumping whichever cap is throttling my queries, but maybe we should add knobs for both caps for generality.

As an aside, is there a simple way to check whether we're saturating either the policy or backend's max connections?

Collaborator:

We just recently switched back to single-node ClickHouse, in which case there is one backend and so the total cap on slots and the count per backend are the same. When we go back to multinode, we probably want to configure this on a per-backend basis.
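To make the distinction concrete, a purely illustrative sketch; these are not qorb's actual types, and the field names here are assumptions:

```rust
/// Illustrative only: the two caps under discussion.
struct PoolCaps {
    /// Total claims allowed across every backend in the pool; this is
    /// what `Policy::max_slots` bounds.
    total: usize,
    /// Claims allowed against any single backend.
    per_backend: usize,
}

fn main() {
    // Single-node ClickHouse: one backend, so the two caps coincide.
    let single = PoolCaps { total: 16, per_backend: 16 };
    // Hypothetical multinode: a total cap of 16 no longer pins down
    // how many connections any one replica may receive.
    let multi = PoolCaps { total: 16, per_backend: 16 / 3 };
    assert_eq!(single.total, single.per_backend);
    assert!(multi.per_backend < multi.total);
}
```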

> As an aside, is there a simple way to check whether we're saturating either the policy or backend's max connections?

I'd probably use the USDT probes to do this. For example, if there is substantial time between `claim-start` and `claim-done`, then the connections are all in use, since we're spending time queued. You could also use `handle-claimed` and `handle-returned` to estimate the spare capacity in the pool over time.
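If the probes aren't convenient, a rough application-side check is to time the claim at the call site. A minimal fragment, assuming a `pool` variable whose async `claim()` matches qorb's, and a hypothetical 10 ms threshold:

```rust
use std::time::{Duration, Instant};

// Time spent inside `claim()` is time spent queued for a connection;
// consistently long waits mean all slots are in use. This mirrors the
// claim-start / claim-done probe pair described above.
let start = Instant::now();
let handle = pool.claim().await?;
if start.elapsed() > Duration::from_millis(10) {
    eprintln!("claim waited {:?}; pool may be saturated", start.elapsed());
}
// Dropping `handle` returns the slot to the pool, which is what the
// handle-claimed / handle-returned probe pair observes.
```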
