chore: add is_container_env to telemetry MCP-2 #298

fmenezes · 2025-06-13T14:32:09Z

is_container_env should be true / false

Copilot

Pull Request Overview

This PR adds support for detecting whether the application is running in a containerized environment by introducing the is_container_env telemetry property. Key changes include updating telemetry types, modifying the Telemetry class to asynchronously retrieve container environment status via file checks and environment variables, and updating tests to properly await asynchronous telemetry common properties.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
tests/unit/telemetry.test.ts	Updated tests to await asynchronous verification of common properties and buffering state.
tests/integration/telemetry.test.ts	Adjusted telemetry creation to include the new getContainerEnv and updated asynchronous checks.
src/telemetry/types.ts	Added is_container_env to the Telemetry CommonProperties type.
src/telemetry/telemetry.ts	Modified Telemetry to support container environment detection and asynchronous common properties.

Copilot · 2025-06-13T18:04:52Z

src/telemetry/telemetry.ts

+    public async getAsyncCommonProperties(): Promise<CommonProperties> {
+        return {
+            ...this.getCommonProperties(),
+            is_container_env: (await this.containerEnvPromise) ? "true" : "false",


[nitpick] Consider returning a boolean for is_container_env in getAsyncCommonProperties (e.g., true/false) instead of a string to more directly reflect its boolean nature, if this change is compatible with TelemetryBoolSet.

Suggested change

-            is_container_env: (await this.containerEnvPromise) ? "true" : "false",
+            is_container_env: await this.containerEnvPromise,

Not compatible: TelemetryBoolSet expects the strings "true" or "false".

nirinchev · 2025-06-13T21:39:32Z

src/telemetry/telemetry.ts

+    for (const file of ["/.dockerenv", "/run/.containerenv", "/var/run/.containerenv"]) {
+        const exists = await fileExists(file);
+        if (exists) {
+            return true;
+        }
+    }
+    return !!process.env.container;


This is fairly minor as this is clearly not on a hot path, but would it make sense to first check for the env variable since that is cheapest? And then for the files, we can check them in parallel rather than 1 by 1.
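A minimal sketch of the ordering suggested above: check the environment variable first, then probe the marker files in parallel. The names `detectContainerEnv` and `fileExists` and the injectable parameters are assumptions for illustration, not the code that was merged.

```typescript
import { access } from "fs/promises";

// Hypothetical helper: resolves to true if the path is accessible, false otherwise.
async function fileExists(path: string): Promise<boolean> {
    try {
        await access(path);
        return true;
    } catch {
        return false;
    }
}

// Hypothetical detector. `env` and `exists` are injectable so the logic can be
// exercised without touching the real environment or file system.
async function detectContainerEnv(
    env: Record<string, string | undefined> = process.env,
    exists: (path: string) => Promise<boolean> = fileExists
): Promise<boolean> {
    // Cheapest check first: reading an env variable needs no I/O.
    if (env.container) {
        return true;
    }

    // Start all file checks at once and wait for every result.
    const markers = ["/.dockerenv", "/run/.containerenv", "/var/run/.containerenv"];
    const results = await Promise.all(markers.map((file) => exists(file)));
    return results.some(Boolean);
}
```

Because `fileExists` never rejects, `Promise.all` is sufficient here; `Promise.allSettled` would only matter if the individual checks could throw.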

nirinchev · 2025-06-13T21:49:14Z

src/telemetry/telemetry.ts

    /** Resolves when the device ID is retrieved or timeout occurs */
+    private bufferingEvents: number = 2;


This name no longer makes much sense. How about pendingPropertyPromises or something similar?

nirinchev · 2025-06-13T22:08:56Z

src/telemetry/telemetry.ts

@@ -117,6 +163,14 @@ export class Telemetry {
        };
    }

+    public async getAsyncCommonProperties(): Promise<CommonProperties> {


I don't have super strong feelings here, but I'm not a huge fan of making methods public just for the sake of tests. While I agree it's convenient to test the internals here, this and isBufferingEvents are implementation details, so they shouldn't be made public.

gagik · 2025-06-16T12:21:41Z

src/telemetry/telemetry.ts

+    if (process.env.container) {
+        return true;
+    }
+    for (const file of ["/.dockerenv", "/run/.containerenv", "/var/run/.containerenv"]) {


do the / paths work on windows?

gagik · 2025-06-16T12:22:47Z

src/telemetry/telemetry.ts

        return instance;
    }

-    private async start(): Promise<void> {
+    private start(): void {


It's usually cleaner to use an async function than to wrap promises in finally-s etc. I'd revert this.

idea here was to start the promises and not wait for them to finish

what would be the benefit of that? we're already spawning an async function without waiting for it to finish (with void above) so this isn't blocking the main thread.

We'd want to buffer events beforehand so we can make sure telemetry stuff we send is with device_id resolved if possible

gagik · 2025-06-16T12:23:14Z

src/telemetry/telemetry.ts

        if (!this.isTelemetryEnabled()) {
            return;
        }
+
        this.deviceIdPromise = getDeviceId({


you are probably looking for Promise.all (or Promise.allSettled)?

gagik · 2025-06-16T12:23:46Z

src/telemetry/telemetry.ts

+    private deviceIdPromise: Promise<string> | undefined;
+    private containerEnvPromise: Promise<boolean> | undefined;


We can probably turn this into a single setup promise with Promise.all or something, see below:

gagik

I think we can manage the multiple promises better. Likely having 1 setup promise that combines the other promises and just awaiting that instead of having a counter. Though maybe I'm missing some bit of the implementation here.

fmenezes · 2025-06-16T12:31:36Z

I think we can manage the multiple promises better. Likely having 1 setup promise that combines the other promises and just awaiting that instead of having a counter. Though maybe I'm missing some bit of the implementation here.

I did not know either; it was before my time. But I just checked the code and it should be ok just to wait on both promises, it should be near instant.

gagik · 2025-06-16T12:41:57Z

should be ok just to wait on both promises

yeah we can keep the async start + buffering like before and just add the second await into it with Promise.all before setting isBuffering to false

gagik · 2025-06-16T13:46:16Z

src/telemetry/telemetry.ts

            mcp_client_version: this.session.agentRunner?.version,
            mcp_client_name: this.session.agentRunner?.name,
            session_id: this.session.sessionId,
            config_atlas_auth: this.session.apiClient.hasCredentials() ? "true" : "false",
            config_connection_string: this.userConfig.connectionString ? "true" : "false",
+            is_container_env: (await this.containerEnvPromise) ? "true" : "false",
+            device_id: await this.deviceIdPromise,


I don't think we should be awaiting here; an explicit unawaited + buffering with an async start like it was before is easier to reason about. It just needs another await for the container check inside start.

Otherwise there's potentially hundreds of floating functions awaiting device_id or whatever else. We can assume in worst case scenario this isn't instant (and could never even resolve)

On the device ID there is a timeout, and on the container env it is just fs/promises, so we can be sure they will resolve. These promises are private. I think the buffering logic only makes sense if we are tacking on finally or something; to me it feels like an unneeded complication. Our current timeout is 3 seconds, no reason to be extra defensive.

it's fair that we don't need to be extra defensive but either way we're still essentially going to end up buffering these events.

The difference is that with awaiting here the "buffering" would happen through a bunch of async functions in parallel in memory stack waiting on these promises as opposed to more explicit buffering where we store the functions we want to execute ourselves and execute them in order after resolving these promises in one place.

With awaiting in common properties approach we'd, for example, not have any guarantees that these functions end up resolving in the same order. Which we don't need to care about, but it's nice to have better idea of how async execution is going to happen and have 1 point of logic where we resolve all the async issues.

From discussions with @nirinchev, we also want to create a shared telemetry service component across devtools, so we would not want to deviate too much unless there are good reasons for it (this setup is modeled after mongosh and Compass telemetry).

gagik · 2025-06-16T13:48:11Z

src/telemetry/telemetry.ts

-        this.commonProperties.device_id = await this.deviceIdPromise;
-
-        this.isBufferingEvents = false;
+        this.containerEnvPromise = this.getContainerEnv();


Suggested change

-        this.containerEnvPromise = this.getContainerEnv();
+        const [deviceId, isContainerEnv] = await Promise.all([this.deviceIdPromise, this.getContainerEnv()]);
+        this.commonProperties.device_id = deviceId;
+        this.commonProperties.is_container_env = isContainerEnv ? "true" : "false";

something like this

With this approach start would take up to 3s, which I think is acceptable; will change.

It will have identical behavior to awaiting inside getCommonProperties. We do not await the start() function anywhere, so it's essentially the same as declaring a combined promise: we instantly initialize the telemetry and start buffering events when emit is run.

In both cases we'd instantly be able to emit events, and in both cases if device ID takes 3 seconds, the eventual emission of events will take 3 seconds.

But with start there's only 1 point where we deal with asynchronous logic, and in the other case we have a bunch of less predictable asynchronous functions listening on 2 different promises.
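The explicit-buffering structure described in this thread can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's Telemetry class: `BufferedTelemetry`, `TelemetryEvent`, the `sent` array (standing in for the network call), and the injected `getDeviceId` / `getContainerEnv` functions are all hypothetical.

```typescript
type TelemetryEvent = { name: string };
type CommonProps = { device_id?: string; is_container_env?: "true" | "false" };
type SentEvent = TelemetryEvent & { properties: CommonProps };

class BufferedTelemetry {
    private isBufferingEvents = true;
    private readonly buffer: TelemetryEvent[] = [];
    private readonly commonProperties: CommonProps = {};
    private readonly getDeviceId: () => Promise<string>;
    private readonly getContainerEnv: () => Promise<boolean>;
    /** Stand-in for the network call: records what would be sent, in order. */
    public readonly sent: SentEvent[] = [];

    private constructor(getDeviceId: () => Promise<string>, getContainerEnv: () => Promise<boolean>) {
        this.getDeviceId = getDeviceId;
        this.getContainerEnv = getContainerEnv;
    }

    public static create(
        getDeviceId: () => Promise<string>,
        getContainerEnv: () => Promise<boolean>
    ): BufferedTelemetry {
        const instance = new BufferedTelemetry(getDeviceId, getContainerEnv);
        // Deliberately not awaited: creation stays synchronous and startup is not delayed.
        void instance.setup();
        return instance;
    }

    // The single place where asynchronous resolution happens.
    private async setup(): Promise<void> {
        try {
            const [deviceId, isContainerEnv] = await Promise.all([this.getDeviceId(), this.getContainerEnv()]);
            this.commonProperties.device_id = deviceId;
            this.commonProperties.is_container_env = isContainerEnv ? "true" : "false";
        } catch {
            // Sketch assumption: on failure, events are sent without the extra properties.
        } finally {
            this.isBufferingEvents = false;
            this.emitEvents([]); // flush whatever was buffered during setup
        }
    }

    public emitEvents(events: TelemetryEvent[]): void {
        if (this.isBufferingEvents) {
            this.buffer.push(...events);
            return;
        }
        // Buffered events go out first, so emission order is preserved.
        const pending = this.buffer.splice(0, this.buffer.length);
        for (const event of [...pending, ...events]) {
            this.sent.push({ ...event, properties: { ...this.commonProperties } });
        }
    }
}
```

Callers can emit immediately after `create`; events are held until both values resolve and are then released in the order they were emitted.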

gagik · 2025-06-16T20:24:30Z

src/telemetry/telemetry.ts


-        void instance.start();
+        await instance.start();


Suggested change

-        await instance.start();
+        void instance.start();

This was an important detail in the existing implementation: start was not getting awaited before. This is why the behavior is equivalent to what was trying to be accomplished before with skipping the promise. Telemetry was already not waiting for the device ID resolution.

So the create function does not need to be asynchronous and can be left as is.

3s is nothing to wait, I think the code looks much better to wait on promises instead of "ignoring" them, I can move to last

adding potentially 3 seconds to the MCP server startup time for no real functional benefit is absolutely non-trivial. We shouldn't be slowing down anything in our logic for the sake of telemetry. This is also not ignoring a promise, this is explicitly spawning a parallel async operation here and the void keyword exists for that reason.

If you want, you can separate the refactor bits of this PR into its own thing and we can have a larger discussion about those changes but in places like mongosh we absolutely do care about startup time so when consolidating telemetry we'd likely end up with the original structure eventually.

I see no reason to think 3 seconds is negligible, but I'm changing the logic to send wait for 3 seconds asynchronously not blocking main flow

I'm changing the logic to send wait for 3 seconds asynchronously not blocking main flow

not sure what you mean

gagik · 2025-06-16T20:32:58Z

src/telemetry/telemetry.ts

+    private deviceId: string | undefined;
+    private containerEnv: boolean | undefined;


why not have them inside commonProperties so it easier to see which fields in the state are used solely for common properties?

commonProperties is a function not a member

good point, maybe then we can scope it into machineMetadata or machineInfo here? so we can mix the things we have from the global constant and things we resolve later on in initialization.

or it can be just these 2 and then we can expand that and the constant, that works too.
We might have more fields like this in the future and we wouldn't want to keep adding fields in root level.

gagik · 2025-06-16T20:33:50Z

src/index.ts

@@ -20,7 +20,7 @@ try {
        version: packageInfo.version,
    });

-    const telemetry = Telemetry.create(session, config);
+    const telemetry = await Telemetry.create(session, config);


Suggested change

-    const telemetry = await Telemetry.create(session, config);
+    const telemetry = Telemetry.create(session, config);

The creation logic does not need to be turned async.

I'd say the main change needed is to add this.commonProperties.is_container_env initialization into start (possibly rename it to setup as that naming seems confusing).

gagik · 2025-06-17T10:43:36Z

tests/unit/telemetry.test.ts


-                    await telemetry.deviceIdPromise;
+                    await delay(5000); // Wait for timeout


this test will now take 5 seconds longer. if we use jest.advanceTimersByTime we don't have to actually wait 5 seconds.

gagik

Sorry for a lot of back and forth on this. I see this as a very valuable discussion on how we'd want to structure devtools' telemetry in general, so I feel quite strongly about getting that right, especially if those changes deviate from the norm so far (the original structure matches that of Compass + mongosh and has been through technical design reviews). Thanks for the work with that.

I'm still very much in favor of the original structure as it is easier to track and define what state the telemetry is in.

gagik · 2025-06-19T13:29:11Z

src/telemetry/telemetry.ts

-            config_connection_string: this.userConfig.connectionString ? "true" : "false",
-        };
+    private async getCommonProperties(): Promise<CommonProperties> {
+        if (!this.cachedCommonProperties) {


This is still basically identical to isBufferingEvents as we'd be indirectly buffering by spawning async operations and awaiting them.

I can see some value in the idea that we'd only start trying to resolve the device ID etc. when there's an emission, but this will be problematic and harder to reason about:

if there's 2 emitEvents one after another, we'd have 2 processes at the same time trying to set to cachedCommonProperties.

I still think the responsibility for a 1-time resolution of machine metadata should be inside the setup function and not emit or getCommonProperties.

getCommonProperties happens only once and resolves during the first flush. I've seen this approach many times on backends before. Alternatively, we can rename getCommonProperties to setup; does that make sense to you?

getCommonProperties happens only once

but getCommonProperties gets called by emitEvents every time right?

First emitEvents call would begin the async process but it wouldn't immediately set this.cachedCommonProperties.

For the next emitEvents call, !this.cachedCommonProperties is still true and it'd run the async functions afterwards again.

We could check for this.cachedCommonProperties == undefined; and set this.cachedCommonProperties = {}; before doing async work to prevent this.

then we'd prevent the async clash
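One way to avoid the clash described above is to cache the promise itself rather than the resolved value, so the assignment happens synchronously before any await. This is a variant of the guard proposed in the thread, shown as a sketch; `LazyCommonProperties` and its injected resolver are hypothetical names, not the project's code.

```typescript
type CommonProperties = { device_id: string; is_container_env: "true" | "false" };

class LazyCommonProperties {
    private cached: Promise<CommonProperties> | undefined;
    private readonly resolveProperties: () => Promise<CommonProperties>;

    constructor(resolveProperties: () => Promise<CommonProperties>) {
        this.resolveProperties = resolveProperties;
    }

    public get(): Promise<CommonProperties> {
        if (this.cached === undefined) {
            // Assigned synchronously, before anything is awaited, so a second
            // caller arriving while resolution is in flight reuses this promise.
            this.cached = this.resolveProperties();
        }
        return this.cached;
    }
}
```

With this shape, two back-to-back `get()` calls trigger a single resolution. One caveat of the sketch: a rejected promise stays cached, so the resolver should handle its own errors or the cache should be cleared on failure.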

we can rename getCommonProperties to setup

yeah, that actually does make it much better in my view. then I do see some of the general value of starting this only when an emission happens, if that is the intended improvement.

gagik · 2025-06-19T13:32:49Z

tests/unit/telemetry.test.ts

@@ -16,6 +16,9 @@ const MockApiClient = ApiClient as jest.MockedClass<typeof ApiClient>;
 jest.mock("../../src/telemetry/eventCache.js");
 const MockEventCache = EventCache as jest.MockedClass<typeof EventCache>;

+const nextTick = () => new Promise((resolve) => process.nextTick(resolve));


I have never seen nextTick used in tests before, is there a difference between this and not doing anything?

Yes, the reason why this is needed is so once we emit an event the flushing process makes a fetch to api and that does not resolve immediately.

got it, could we add a test where there's 2 or more telemetry.emitEvents calls one after the other and check that all works as expected?

gagik · 2025-06-19T13:37:10Z

src/telemetry/telemetry.ts

+            return;
+        }
+
+        if (this.flushing) {


isn't this identical to buffering? could we structure it as buffering?

I disagree with buffering as a terminology here, if we are emptying the cache after sending events over network it means we are no longer accumulating but rather releasing resources.

The only term I've seen used to resonate with that purpose is flush, similar to flush logs after a buffer is too big for instance.

we are no longer accumulating but rather releasing resources

Well, we are releasing already buffered resources but we're buffering all future resources. The state of the boolean, as it is used, is to me more relevant to the future resources than the past ones, i.e. we use it largely to determine whether or not to call this.eventCache.appendEvents(events ?? []);. But I don't have strong feelings about this; it can also be isFlushing.

We do actually have a concept of flush along with buffering in https://github.com/mongodb-js/mongosh/blob/main/packages/logging/src/logging-and-telemetry.ts#L129

gagik

I don't want to drag this on longer as I am OOO for next week, thank you for all the iterations with this, so will remove the request changes as not to block this getting out.

I think we're quite aligned now. The only distinction is the preference of whether we should set up telemetry right after initialization or after the first emitEvents call. I still believe in the former, but I can mark this as subjective and not something I have strong feelings about.

I only have a concern about how this would work with multiple parallel emitEvents calls so maybe some tests there would be good.

gagik · 2025-06-19T15:32:13Z

src/telemetry/telemetry.ts

+        if (!this.cachedCommonProperties) {
+            let deviceId: string | undefined;
+            try {
+                deviceId = await getDeviceId({


nit, feel free to ignore: it'd still be good to do a Promise.all so we can initialize things in parallel. I know this is messier structure but would let us be much more

fmenezes added 7 commits June 13, 2025 15:31

MCP-2: add is_container_env to telemetry

a480c94

fix: tests

126994c

fix: tests

bf204bb

fix: tests

3d91b40

fix: tests

fe64649

fix: cleanup

f73c208

fix: check

671bbeb

fmenezes marked this pull request as ready for review June 13, 2025 18:01

Copilot AI review requested due to automatic review settings June 13, 2025 18:01

fmenezes requested a review from a team as a code owner June 13, 2025 18:01

fmenezes changed the title ~~MCP-2: add is_container_env to telemetry~~ chore: [MCP-2] add is_container_env to telemetry Jun 13, 2025

fix: cleanup

c2a1246

fmenezes requested a review from Copilot June 13, 2025 18:04

Copilot AI reviewed Jun 13, 2025

himanshusinghs reviewed Jun 13, 2025

nirinchev reviewed Jun 13, 2025

fmenezes added 4 commits June 16, 2025 11:06

fix: address comments

678beee

fix: address comments

8d53eea

fix: merge getCommonProperties and getAsyncCommonProperties

95b0b06

fix: tests

911fdcb

gagik reviewed Jun 16, 2025

fix: parallel promises

4703628

gagik reviewed Jun 16, 2025

gagik requested changes Jun 16, 2025

fix: remove counters

931816f

fmenezes requested a review from gagik June 16, 2025 13:36

gagik reviewed Jun 16, 2025

fmenezes added 2 commits June 16, 2025 17:38

fix: address comments

0b60fe2

fix: address comments

6ba2346

fmenezes requested a review from gagik June 16, 2025 16:40

gagik reviewed Jun 16, 2025

gagik reviewed Jun 17, 2025

move async into closer to sending

f4bfbf0

fmenezes requested a review from gagik June 19, 2025 12:41

fmenezes added 5 commits June 19, 2025 13:42

fix

3424e0f

fix close

83fd1d9

fix test

7e3cdb1

fix test

c67877e

fix lint

814ccb9

gagik reviewed Jun 19, 2025

gagik approved these changes Jun 19, 2025

gagik changed the title ~~chore: [MCP-2] add is_container_env to telemetry~~ chore: add is_container_env to telemetry MCP-2 Jun 19, 2025

fmenezes added 2 commits June 19, 2025 18:37

parallel and test

1dba800

fix: lint

a0c208c

fmenezes enabled auto-merge (squash) June 19, 2025 17:39

fmenezes merged commit 96c8f62 into main Jun 19, 2025
18 checks passed

fmenezes deleted the MCP-2 branch June 19, 2025 17:46

fmenezes added a commit that referenced this pull request Jun 30, 2025

Revert "chore: add is_container_env to telemetry MCP-2 (#298)"

3241c4a

This reverts commit 96c8f62.

This was referenced Jun 30, 2025

revert: rollback "chore: add is_container_env to telemetry MCP-2 #330

Merged

chore: reinstate telemetry/docker change after revert MCP-49 #339

Merged
