botocore: add basic tracing for bedrock InvokeAgent #3404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

stanek-michal wants to merge 3 commits into open-telemetry:main from stanek-michal:invoke-agent

stanek-michal commented Apr 3, 2025

Description

Add basic support for InvokeAgent API. For now only returnControl function invocation, non-streaming mode. Assumes the agent was created beforehand in Bedrock platform - proper Agent creation/deletion for testcases will be added in another PR.

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

tox -e py310-test-instrumentation-botocore

Does This PR Require a Core Repo Change?

Yes. - Link to PR:
No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

Followed the style guidelines of this project
[] Changelogs have been updated
Unit tests have been added
Documentation has been updated

linux-foundation-easycla bot commented Apr 3, 2025 •

edited

Loading

The committers listed above are authorized under a signed CLA.

✅ login: stanek-michal / name: Michal Stanek (4c4c419, 360d048, 01521d5)

stanek-michal force-pushed the invoke-agent branch from 3ca64a2 to a193a11 Compare

April 3, 2025 03:56

xrmx reviewed

View reviewed changes

Contributor

xrmx left a comment •

edited

Loading

Haven't looked at the actual implementation yet, in the meantime a few random comments. Also it looks like you don't have pre-commit setup, see CONTRIBUTING.

instrumentation/opentelemetry-instrumentation-botocore/test-requirements-0.txt Outdated Show resolved Hide resolved

instrumentation/opentelemetry-instrumentation-botocore/test-requirements-1.txt Outdated Show resolved Hide resolved

instrumentation/opentelemetry-instrumentation-botocore/tests/bedrock_utils.py Outdated Show resolved Hide resolved

instrumentation/opentelemetry-instrumentation-botocore/tests/test_botocore_bedrock.py Outdated

+              def test_invoke_agent_with_completion(
+                  span_exporter,
+                  metric_reader,
+                  bedrock_runtime_client,

Contributor

xrmx Apr 3, 2025

Do you still need the bedrock runtime client?

Author

stanek-michal Apr 17, 2025

no, though the create_agent scaffolding will need it still (for s3 and such) but it will be done elsewhere

instrumentation/opentelemetry-instrumentation-botocore/tests/test_botocore_bedrock.py Outdated Show resolved Hide resolved

...y-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/__init__.py Outdated Show resolved Hide resolved

...on/opentelemetry-instrumentation-botocore/examples/bedrock-runtime/zero-code/invoke_agent.py Outdated

+                      self.client = boto3.client('bedrock-agent-runtime')
+                      # Replace these with your actual agent IDs
+                      self.agent_id = "TFMZVIWXR7"

Contributor

xrmx Apr 3, 2025

Let's use env vars then

Author

stanek-michal Apr 17, 2025

changed it on both standalone demo and tests
(so tests now need export TOX_OVERRIDE=testenv.pass_env=AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,AWS_SESSION_TOKEN,AWS_DEFAULT_REGION,BEDROCK_AGENT_ID,BEDROCK_AGENT_ALIAS_ID

Contributor

xrmx commented Apr 3, 2025

@stanek-michal please sign the CLA!

estolfo reviewed

View reviewed changes

...ry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock.py

+                              event_stream = result["completion"]
+                              # Drain the stream so we can instrument AND keep events
+                              all_events = list(event_stream)

estolfo Apr 7, 2025

Is there a reason we don't want to follow the same approach as we do with InvokeModelWithResponseStream?
It looks like we use a response stream wrapper that is an iterator.
I wonder what memory implications draining the stream all at once would have?

Author

stanek-michal Apr 17, 2025

I thought it was simpler but you're right, apart of memory use I'm worried about introducing additional latency. For enableTrace=False there is just one event in my tests, but for enabled tracing there can be quite a few of them, so it matters. I'll change it to a wrapper but not sure if in this PR or next one with create agent

...ry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock.py

+                              all_events = list(event_stream)
+                              # A replay generator so user code can still iterate
+                              result["completion"] = _replay_events(all_events)

estolfo Apr 7, 2025

if we use a wrapper that's an iterator, as mentioned above, I'm assuming we wouldn't need this line

xrmx reviewed

View reviewed changes

...ry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock.py Outdated Show resolved Hide resolved

stanek-michal added 2 commits

April 17, 2025 13:20


          Add initial version of invoke_agent tracing instrumentation

360d048


          Add standalone invoke_agent instrumentation demo

01521d5

stanek-michal force-pushed the invoke-agent branch from a193a11 to f4e876d Compare

April 17, 2025 11:20


          Address review comments

4c4c419

stanek-michal force-pushed the invoke-agent branch from f4e876d to 4c4c419 Compare

April 17, 2025 11:24

xrmx reviewed

View reviewed changes

...ry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock.py

-                              span.update_name(f"{operation_name} {request_model}")
+                          if self._call_context.operation == "InvokeAgent":
+                              if operation_name:
+                                  span.update_name(f"{operation_name}")

Contributor

xrmx Apr 28, 2025

 The **span name** SHOULD be `invoke_agent {gen_ai.agent.name}` if `gen_ai.agent.name` is readily available.

Do we have the agent name at hand?

Contributor

SrdjanLL Apr 28, 2025

It looks like InvokeAgent doesn't specify the agent name in request or response, just agentAlias. Seems like a mandatory step to set up an alias in order to call InvokeAgent API (according to my understanding of 1 and 2):

To deploy your agent, you must create an alias. During alias creation, Amazon Bedrock creates a version of your agent automatically. The alias points to this newly created version. Alternatively, you can point the alias to a previously created version of your agent. Then, you configure your application to make API calls to that alias.

Since it's a required request param, I think we can use it here.

...ry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock.py

+                                  api_input = input_item["apiInvocationInput"]
+                                  action_group = api_input.get("actionGroup")
+                                  span.set_attribute(GEN_AI_TOOL_NAME, action_group)
+                                  span.set_attribute(GEN_AI_TOOL_TYPE, "extension")

Contributor

xrmx Apr 28, 2025

Is this called directly by the agent though? From the examples here it looks like these are still called from client side https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_InvokeAgent.html#API_agent-runtime_InvokeAgent_ResponseSyntax

Semconv:

      Extension: A tool executed on the agent-side to directly call external APIs, bridging the gap between the agent and real-world systems.
        Agent-side operations involve actions that are performed by the agent on the server or within the agent's controlled environment.

Author

stanek-michal Apr 30, 2025

apiInvocation is untested for this PR but that's correct - if there is returnControl that means client executes the tool, not the Agent, so in this case it should be function not extension (but overall Agent has the capability to call lambdas on its own if it's configured for it)

SrdjanLL reviewed

View reviewed changes

...ry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock.py

-                              span.update_name(f"{operation_name} {request_model}")
+                          if self._call_context.operation == "InvokeAgent":
+                              if operation_name:
+                                  span.update_name(f"{operation_name}")

Contributor

SrdjanLL Apr 28, 2025

It looks like InvokeAgent doesn't specify the agent name in request or response, just agentAlias. Seems like a mandatory step to set up an alias in order to call InvokeAgent API (according to my understanding of 1 and 2):

To deploy your agent, you must create an alias. During alias creation, Amazon Bedrock creates a version of your agent automatically. The alias points to this newly created version. Alternatively, you can point the alias to a previously created version of your agent. Then, you configure your application to make API calls to that alias.

Since it's a required request param, I think we can use it here.

...y-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/__init__.py

		@@ -32,6 +32,9 @@ def loader():


		_KNOWN_EXTENSIONS = {
		"bedrock-agent-runtime": _lazy_load(

Contributor

SrdjanLL Apr 28, 2025

Proposal that I don't think is needed for this PR, but more of a future suggestion (and up for discussion):

It might be worth building a separate extension for bedrock-agent-runtime. So far, there seems to be a separate extension for every distinct botocore client (dynamodb, lambda, etc.) and bedrock-runtime is separate from bedrock-agent-runtime. Perhaps it's worth following that pattern and avoid some of the branching logic that's likely to grow as more Bedrock APIs get added.

Furthermore, to introduce agent creation/deletion means adding bedrock-agent client as well, which may again result in another extension or combining it with bedrock-agent-runtime and having one dedicated just to agents.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet