Skip to content

Allow passing a custom random number generator to trace.RandomIdGenerator #4571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

agcom
Copy link

@agcom agcom commented May 7, 2025

Description

I am using OpenTelemetry to trace gRPC calls in my distributed machine learning application, and I fixed the global random seed in all replicas via random.seed(42) to make the application's result reproducible; but, it causes duplicate trace and span ids rendering the traces corrupt.

This PR adds the possibility to create/use an instance of trace.RandomIdGenerator that utilizes a custom random number generator instance (e.g. random.Random()); It preserves backward compatibility and previous users expectations by defaulting to the global random instance as before.

Fixes #4376.

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Ran the existing unit tests.

To reproduce my use-case, write a Python script with the following pseudo-algorithm:

  • Fix the global random seed.
  • Instrument the gRPC server and client (using the default id generator), exporting to an OpenTelemetry Collector instance which in-turn optionally exports them to a Jaeger instance.
  • Spin up a gRPC server with a simple NOP function.
  • Every x seconds, randomly choose a peer (assume running multiple instances of the script with pre-determined addresses) and call its function.
  • Run multiple instances of the script.
  • Find the duplicate ids in the output traces (can be done through the Jaeger UI which visibly warns about duplicate ids).

Does This PR Require a Contrib Repo Change?

  • Yes.
  • No.

Checklist:

  • Followed the style guidelines of this project.
  • Change-logs have been updated.
  • Unit tests have been added.
  • Documentation has been updated.

Copy link

linux-foundation-easycla bot commented May 7, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@agcom agcom changed the title Allow passing a custom random number generator to RandomIdGenerator Allow passing a custom random number generator to trace.RandomIdGenerator May 7, 2025
@agcom agcom marked this pull request as ready for review May 7, 2025 20:09
@agcom agcom requested a review from a team as a code owner May 7, 2025 20:09
@agcom
Copy link
Author

agcom commented May 7, 2025

Ideally, the default RNG should be a fresh Random() (which picks a high-entropy seed through combining /dev/urandom, the current time, and the process identifier) and not the global random instance; but, that would be a breaking change/update, affecting the previous codes and users expectations relying on that behavior (unlikely, but yet possible).

@agcom
Copy link
Author

agcom commented May 7, 2025

I am not sure of what unit tests to add if they are mandatory for this change. In the tests/trace/test_trace.py file, there are lines where an instance of RandomIdGenerator is created and used to generate a few ids; I could change them all to RandomIdGenerator(Random()).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When the random seed is set, it causes duplicate traceId and spanId.
2 participants