feat: (Agent): Finalize Bidirectional Agent class #12

mehtarac · 2025-10-24T14:59:41Z

Summary

This PR introduces a clean agent-driven architecture for bidirectional streaming, transforming complex manual PyAudio coordination into simple, reusable patterns. The architecture separates core agent logic from hardware IO channels, with the agent as the primary interface managing IO channel lifecycle and enabling extensible multi-modal capabilities.

BidirectionalIO Interface:

Protocol:

class BidirectionalIO:
    async def input_channel(self) -> dict:
        # Read input data from the IO channel source
        ...

    async def output_channel(self, event: dict) -> None:
        # Process output event from the model through the IO channel
        ...

    def cleanup(self) -> None:
        # Clean up IO channel resources
        ...

Audio Implementation:

from typing import Optional

class AudioIO(BidirectionalIO):
    def __init__(self, audio_config: Optional[dict] = None) -> None:
        # Configuration via dict: sample_rates, chunk_size, devices, channels
        ...

    async def input_channel(self) -> dict:
        # Read audio from microphone
        ...

    async def output_channel(self, event: dict) -> None:
        # Handle audio events with direct stream writing
        ...

    def cleanup(self) -> None:
        # Clean up audio resources
        ...

Purpose:

Pure hardware abstraction layer - bridges PyAudio (microphone/speakers) ↔ BidirectionalAgent
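
As a rough illustration of the interface above, an in-memory channel for tests might look like the following. `QueueIO` and its queue-based design are hypothetical and not part of this PR; only the three method names (`input_channel`, `output_channel`, `cleanup`) come from the description.

```python
import asyncio


class QueueIO:
    """Hypothetical in-memory IO channel exposing the three interface methods."""

    def __init__(self) -> None:
        self.inbox: asyncio.Queue = asyncio.Queue()  # events waiting to be read
        self.outbox: list = []                       # events produced by the model
        self.closed = False

    async def input_channel(self) -> dict:
        # Read the next input event from the source (here: a queue).
        return await self.inbox.get()

    async def output_channel(self, event: dict) -> None:
        # Handle an output event from the model (here: just record it).
        self.outbox.append(event)

    def cleanup(self) -> None:
        # Nothing to release for an in-memory channel; only mark it closed.
        self.closed = True


async def demo() -> QueueIO:
    io = QueueIO()
    await io.inbox.put({"text": "hello"})
    event = await io.input_channel()
    await io.output_channel({"echo": event["text"]})
    io.cleanup()
    return io


io = asyncio.run(demo())
```

Because the channel holds no reference to an agent, it can be exercised on its own, which is the independence property listed under API Simplifications.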

Before: (200+ lines)

# Required 4 separate async functions with shared context
async def play(context): ...  # ~60 lines - PyAudio speaker + queue processing
async def record(context): ...  # ~30 lines - PyAudio microphone setup
async def receive(agent, context): ...  # ~40 lines - Event handling
async def send(agent, context): ...  # ~20 lines - Audio sending

# Complex manual coordination
context = {"active": True, "audio_in": Queue(), "audio_out": Queue(), "interrupted": False}
await asyncio.gather(play(context), record(context), receive(agent, context), send(agent, context))

After: (2-3 lines)

Primary Pattern: Agent-Driven with IO Channels (Recommended)

# IO channels passed to run method
audio = AudioIO(audio_config={"input_sample_rate": 16000})
agent = BidirectionalAgent(model=model, tools=[calculator])
await agent.run(io_channels=[audio])  # IO channels passed to run method

Context Manager Pattern: Guaranteed Cleanup

# Automatic resource management with context managers
audio = AudioIO(audio_config={"input_sample_rate": 16000})
async with BidirectionalAgent(model=model, tools=[calculator]) as agent:
    await agent.run(io_channels=[audio])

Custom Transport Pattern: Maximum Flexibility

# Direct transport control when needed
agent = BidirectionalAgent(model=model, tools=[calculator])
await agent.run(io_channels=[(sender_function, receiver_function)])

Agent-Driven Design:

BidirectionalAgent is the primary interface users interact with
AudioIO is pure hardware abstraction implementing BidirectionalIO protocol
IO channels passed to run() method, not constructor

API Simplifications:

BidirectionalIO Protocol - Clean interface with input_channel() and output_channel() methods
Type aliases - BidirectionalInput for cleaner signatures with modern | syntax
IO channel independence - AudioIO no longer depends on agent instance
Flexible run method - Accepts list of BidirectionalIO channels or transport tuples

Future Roadmap

This agent-driven architecture with BidirectionalIO abstraction enables:

VideoIO for real-time video processing
WebSocketIO for remote clients
SensorIO for IoT and sensor data
Custom transport layers via the standardized BidirectionalIO protocol


pgrayy · 2025-10-24T17:51:57Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

            await stop_bidirectional_connection(self._session)
            self._session = None

    def _validate_active_session(self) -> None:


Can we build this into Session? Then in the agent init, we can do something like:

self._session = Session()

And in agent start, we can do:

self._session.start()

From here, session itself can raise a "not started" error if we try to use a member method that requires activation (e.g., send_interrupt).

Also, would we be able to rename to something like self._loop_connection to avoid overloading session?

I think this will work. I am going to leave this comment open and make the change once we merge in the agent class and the bidirectional event loop, so that this trivial change can be made on top. Currently both PRs are open and the loops are in different states.
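
A minimal sketch of the suggestion in this thread, assuming a session object that tracks its own activation state. `Session`, `SessionNotStartedError`, and the body of `send_interrupt` are hypothetical names used for illustration, not the code in this PR.

```python
class SessionNotStartedError(RuntimeError):
    """Hypothetical error raised when a session is used before start()."""


class Session:
    """Hypothetical session that guards its own members instead of the agent doing so."""

    def __init__(self) -> None:
        self._active = False

    def start(self) -> None:
        self._active = True

    def stop(self) -> None:
        self._active = False

    def _require_active(self) -> None:
        if not self._active:
            raise SessionNotStartedError("session not started; call start() first")

    def send_interrupt(self) -> str:
        # Any member that needs a live session checks the state itself.
        self._require_active()
        return "interrupt sent"


session = Session()
try:
    session.send_interrupt()
    raised = False
except SessionNotStartedError:
    raised = True

session.start()
result = session.send_interrupt()
```

With the guard inside the session, the agent would no longer need a separate `_validate_active_session` helper.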


mkmeral · 2025-11-06T15:11:15Z

For the interface: first, I'll assume we won't have audio config as part of the interface, as it's related to the audio adapter.

For methods, given that we have a class, do we need a method that returns a callable (e.g. create_input)? Why don't we just have send/receive methods as part of the interface?


awsarron · 2025-11-06T15:14:02Z

src/strands/experimental/bidirectional_streaming/tests/test_bidi.py

+    async with BidirectionalAgent(model=model, tools=[calculator]) as agent:
+        print("New BidirectionalAgent Experience")
+        print("Try asking: 'What is 25 times 8?' or 'Calculate the square root of 144'")
+        await agent.connect()


This looks good, but it's difficult for me to understand what consumers will actually experience. It would be great to see more full-fledged examples, beyond a simple toy example, to truly understand the devex:

Integrated with a client UI

One component as part of a larger system (e.g. in-vehicle assistant)

Integrated with non-bidi agents

Real-time interrupts

Listening for activation phrases - "Hey Alexa"


pgrayy · 2025-11-07T19:13:07Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

@@ -1,244 +1,136 @@
 """Bidirectional Agent for real-time streaming conversations.


I personally think Bidi is an acceptable shorthand and so I would suggest every package, class, method, and variable use that naming convention. Examples:

src/strands/experimental/bidi/agent/agent.py

class BidiAgent

The benefit here is that customers have less to type. I also notice this is a convention used out in the wild. Examples:

https://grpc.io/docs/what-is-grpc/core-concepts/

https://developers.googleblog.com/en/beyond-request-response-architecting-real-time-bidirectional-streaming-multi-agent-system/

Side quest: Would you be able to come up with a distinguishing name for normal agents? I have been calling them converse stream agents so I can talk "bidi agent" vs "converse stream agent". It would be nice to have something shorter though. You can then share the name with the team so going forward it is easier for everyone to talk about.


pgrayy · 2025-11-07T19:49:27Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

-        Unified method for sending text, audio, and image input to the model during
-        an active conversation session.
-        
+        logger.debug("Conversation start - initializing connection")


Nit: I would recommend having an LLM adjust the formatting of the logs to follow the guidelines outlined in https://github.com/strands-agents/sdk-python/blob/main/STYLE_GUIDE.md.

Will apply these changes in a follow-up PR


pgrayy · 2025-11-07T20:02:41Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

-from ..event_loop.bidirectional_event_loop import start_bidirectional_connection, stop_bidirectional_connection
+from ....types.tools import ToolResult, ToolUse, AgentTool
+
+from ..event_loop.bidirectional_event_loop import BidirectionalAgentLoop


Let's go ahead and make BidiAgentLoop private (i.e., call it _BidiAgentLoop).


Will do this in a follow-up commit in the PR here: #10


mkmeral · 2025-11-09T13:25:02Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

@@ -1,244 +1,136 @@
 """Bidirectional Agent for real-time streaming conversations.


I personally don't like the "bidi" shorthand in the repo for customer-facing things. That said, I'd like to avoid the bidirectional name in most places, unless it will be used together with other agents/packages and requires some clarification. I think this only applies to the bidi agent for now.

The rest are under the bidi namespace, and if customers want more explanatory names they can just use import aliases.

mkmeral · 2025-11-09T13:25:35Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

-from ..event_loop.bidirectional_event_loop import start_bidirectional_connection, stop_bidirectional_connection
+from ....types.tools import ToolResult, ToolUse, AgentTool
+
+from ..event_loop.bidirectional_event_loop import BidirectionalAgentLoop


+1 It might be nice to see what we have private vs public in all bidi code


pgrayy · 2025-11-10T00:31:23Z

src/strands/experimental/bidirectional_streaming/__init__.py

    "InterruptionDetectedEvent",
    "BidirectionalStreamEvent",
    "VoiceActivityEvent",
    "UsageMetricsEvent",


I would say let's not expose the Events top level. For now it is good to encourage users to import from the subpackage to avoid confusion: we have both stream events and hook events, so this helps users distinguish them. Also, it would be good to be consistent with what we expose top level for the uni agent.

So this was a discussion I had with @zastrowm

I think we want to start exposing these events, for a couple of reasons. First, TS consistency, as they will also do that.

It also allows us to expose more complicated events that can be used in different places. For example, the error event was a problem: if you want the event (they were dicts) to be serializable, you cannot include the exception object. With typed dicts, we can have our cake and eat it too :D

The class below is technically still a dict (so it can be sent over websockets, used with json.dumps, etc.), but it also gives customers the ability to access exceptions directly if they want to raise a new exception from this one. Essentially, exposing events allows us to do this.

Additionally, we can start to expose the typed dicts in agent and multi-agent, as they'd be backwards compatible: they extend dicts, so they are dicts.

class ErrorEvent(TypedEvent):
    """Error occurred during the session.

    Stores the full Exception object as an instance attribute for debugging while
    keeping the event dict JSON-serializable. The exception can be accessed via
    the `error` property for re-raising or type-based error handling.

    Parameters:
        error: The exception that occurred.
        details: Optional additional error information.
    """

    def __init__(
        self,
        error: Exception,
        details: Optional[Dict[str, Any]] = None,
    ):
        # Store serializable data in dict (for JSON serialization)
        super().__init__(
            {
                "type": "bidirectional_error",
                "message": str(error),
                "code": type(error).__name__,
                "details": details,
            }
        )
        # Store exception as instance attribute (not serialized)
        self._error = error

    @property
    def error(self) -> Exception:
        """The original exception that occurred.

        Can be used for re-raising or type-based error handling.
        """
        return self._error
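
A small usage sketch of the idea described in this comment: the event serializes like any dict, while the original exception stays reachable for re-raising. `TypedEvent` is stubbed here as a plain `dict` subclass, which is an assumption, since the real base class is not shown in this thread.

```python
import json
from typing import Any, Dict, Optional


class TypedEvent(dict):
    """Stand-in for the SDK base class; assumed here to be a plain dict subclass."""


class ErrorEvent(TypedEvent):
    def __init__(self, error: Exception, details: Optional[Dict[str, Any]] = None):
        super().__init__(
            {
                "type": "bidirectional_error",
                "message": str(error),
                "code": type(error).__name__,
                "details": details,
            }
        )
        self._error = error  # kept off the dict, so it is never serialized

    @property
    def error(self) -> Exception:
        return self._error


event = ErrorEvent(ValueError("bad sample rate"))
payload = json.dumps(event)      # works because the event is still a dict
restored = json.loads(payload)   # the exception object is not in the payload

try:
    # Chain a new exception from the original one carried by the event.
    raise RuntimeError("stream failed") from event.error
except RuntimeError as exc:
    cause = exc.__cause__
```

The serialized payload carries only the message and type name, so a remote client sees a plain JSON object while local code can still branch on the exception type.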

Things to take into consideration:

In exposing typed-events, I would expect all events coming out of BiDi to be typed, not a subset. Is that the case?

For consistency, UniDi events should also be typed, not just BiDi. Open question whether that's at the same time or as a follow-up

If we start publishing typed events, the entire class + shape should be bar-raised, not just the dict emitted. In the past we haven't cared about the concrete members (or class naming, or init members) because they weren't public apis - if we're making them public they should be bar-raised

In exposing typed-events, I would expect all events coming out of BiDi to be typed, not a subset. Is that the case?

It should be the case, I will double check

For consistency, UniDi events should also be typed, not just BiDi. Open question whether that's at the same time or as a follow-up

Yes, I agree. I'd prefer a follow-up, so as not to overburden ourselves right before re:Invent. That said, we should also do a PoC to make sure such a change won't break existing customers. I'd expect it won't, but it's better to make sure.

If we start publishing typed events, the entire class + shape should be bar-raised, not just the dict emitted. In the past we haven't cared about the concrete members (or class naming, or init members) because they weren't public apis - if we're making them public they should be bar-raised

That makes sense. I will work on a doc showing which events/types we have.

pgrayy · 2025-11-10T00:33:02Z

src/strands/experimental/bidirectional_streaming/__init__.py

@@ -1,8 +1,11 @@
 """Bidirectional streaming package."""

 # Main components - Primary user interface


Follow up: Per discussion, Rachit is going to apply the Bidi/bidi prefix in a follow-up PR.

pgrayy · 2025-11-10T00:38:54Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

@@ -1,244 +1,136 @@
 """Bidirectional Agent for real-time streaming conversations.


I think it is good to have the prefix since customers are going to be importing the components top-level:

from strands.experimental.bidi.models import GeminiBidiModel
# vs
from strands.experimental.bidirectional.models import GeminiModel

The second import is longer.

The second import has a less descriptive name that'll be used in the customer code.

In Python, it is conventional for people to use shorthand names (e.g., import numpy as np). Here we just go ahead and shorten on the customer's behalf.

bidi is a commonly accepted shorthand for bidirectional.

For these reasons, I feel pretty strongly about utilizing the Bidi/bidi prefix.

pgrayy · 2025-11-10T00:43:54Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

-        model: BidirectionalModel,
-        tools: list | None = None,
+        model: BidirectionalModel| str | None = None,
+        tools: list[str| AgentTool| ToolProvider]| None = None,


Nit: list[...]| vs list[...] |


pgrayy · 2025-11-10T01:04:03Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

-        Args:
-            sender: Async callable that sends events to the client (e.g., websocket.send_json).
-            receiver: Async callable that receives events from the client (e.g., websocket.receive_json).
+    async def run(self, io_channels: list[BidiIO | tuple[Callable, Callable]]) -> None:


Follow up: Let's try to define a more specific Callable type for the tuple.
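
One possible shape for the more specific type, sketched under the assumption that events are plain dicts. The alias names `BidiSender`, `BidiReceiver`, and `BidiTransport` are hypothetical; the sender/receiver roles follow the removed docstring above (send events to the client, receive events from the client).

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

# Hypothetical aliases; the event payload type is assumed to be a plain dict.
BidiSender = Callable[[dict[str, Any]], Awaitable[None]]   # e.g. websocket.send_json
BidiReceiver = Callable[[], Awaitable[dict[str, Any]]]     # e.g. websocket.receive_json
BidiTransport = tuple[BidiSender, BidiReceiver]


async def demo() -> list[dict[str, Any]]:
    sent: list[dict[str, Any]] = []

    async def sender(event: dict[str, Any]) -> None:
        sent.append(event)

    async def receiver() -> dict[str, Any]:
        return {"text": "hi"}

    # A type checker can now verify both callables against the aliases.
    transport: BidiTransport = (sender, receiver)
    send, receive = transport
    await send(await receive())
    return sent


sent_events = asyncio.run(demo())
```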

pgrayy · 2025-11-10T01:04:37Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

+        if not io_channels:
+            raise ValueError("io_channels parameter cannot be empty. Provide either an IO channel or (sender, receiver) tuple.")
+
+        transport = io_channels[0]


We can do a for loop, correct? Also, let's use io or io_channel everywhere in place of transport and adapter.

pgrayy · 2025-11-10T01:07:22Z

src/strands/experimental/bidirectional_streaming/agent/agent.py

+
+    async def _run_with_transport(
+        self,
+        transport: BidiIO | tuple[Callable, Callable],


Let's convert the tuple to a BidiIO so that we don't need to build conditions around the type in this method.
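
A sketch of how the tuple could be wrapped so later code handles only one shape. `CallableIO` and `normalize` are hypothetical helpers, not names from this PR, and the sketch assumes the channel interface described in the summary (`input_channel`, `output_channel`, `cleanup`).

```python
import asyncio
from collections.abc import Awaitable, Callable


class CallableIO:
    """Hypothetical adapter giving a (sender, receiver) pair the IO-channel shape."""

    def __init__(
        self,
        sender: Callable[[dict], Awaitable[None]],
        receiver: Callable[[], Awaitable[dict]],
    ) -> None:
        self._sender = sender
        self._receiver = receiver

    async def input_channel(self) -> dict:
        return await self._receiver()

    async def output_channel(self, event: dict) -> None:
        await self._sender(event)

    def cleanup(self) -> None:
        # The caller owns the underlying transport, so nothing is released here.
        pass


def normalize(io_channels: list) -> list:
    """Wrap tuples so downstream code only ever sees IO-channel objects."""
    if not io_channels:
        raise ValueError("io_channels parameter cannot be empty.")
    return [CallableIO(*io) if isinstance(io, tuple) else io for io in io_channels]


async def demo() -> list:
    sent: list = []

    async def sender(event: dict) -> None:
        sent.append(event)

    async def receiver() -> dict:
        return {"text": "ping"}

    # Every entry is handled the same way, with no type checks in the loop body.
    for io in normalize([(sender, receiver)]):
        await io.output_channel(await io.input_channel())
        io.cleanup()
    return sent


echoed = asyncio.run(demo())
```

Normalizing once at the entry point also makes it natural to iterate over every channel rather than reading only the first one.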

pgrayy · 2025-11-10T01:10:01Z

src/strands/experimental/bidirectional_streaming/io/audio.py

+                - input_channels (int): Input channels (default: 1)
+                - output_channels (int): Output channels (default: 1)
+        """
+        if pyaudio is None:


Looks like we don't need this check anymore since we import pyaudio top-level.

you're right -- will remove

feat: (Agent): Finalize Bidirectional Agent class

883f6fc

mehtarac had a problem deploying to auto-approve October 24, 2025 14:59 — with GitHub Actions Failure

JackYPCOnline reviewed Oct 24, 2025

src/strands/tools/caller.py (resolved)

pgrayy reviewed Oct 24, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (resolved)

feat: (Agent): Finalize Bidirectional Agent class

23d8da8

mehtarac had a problem deploying to auto-approve October 28, 2025 13:48 — with GitHub Actions Failure

pgrayy reviewed Oct 28, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (outdated, resolved)

mkmeral reviewed Oct 28, 2025

src/strands/tools/caller.py (resolved)

Unshure reviewed Oct 28, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (resolved)

Merge branch 'main' into bar_raise_agent

1bccf9b

github-actions bot added the size/m label Nov 3, 2025

mehtarac had a problem deploying to auto-approve November 3, 2025 15:09 — with GitHub Actions Failure

feat: (Agent): Finalize Bidirectional Agent class

863f04f

mehtarac had a problem deploying to auto-approve November 4, 2025 15:01 — with GitHub Actions Failure

github-actions bot added size/l and removed size/m labels Nov 4, 2025

Unshure reviewed Nov 4, 2025

src/strands/experimental/bidirectional_streaming/tests/optimized_example.py (outdated, resolved)

src/strands/experimental/bidirectional_streaming/tests/optimized_example.py (outdated, resolved)

mkmeral reviewed Nov 4, 2025

src/strands/experimental/bidirectional_streaming/tests/optimized_example.py (outdated, resolved)

src/strands/experimental/bidirectional_streaming/io/audio.py (resolved)

Unshure reviewed Nov 4, 2025

src/strands/experimental/bidirectional_streaming/adapters/audio_adapter.py (outdated, resolved)

mehtarac added 3 commits November 5, 2025 12:23

update dev experience and change verbage from session to connection

a91e41b

rename validate_connection

1e9d185

remove hooks and otel parameters from constructor for focused implementation. Will be added later when implementation is added.

6d8c355

github-actions bot added size/l and removed size/l labels Nov 5, 2025

mehtarac had a problem deploying to auto-approve November 5, 2025 18:33 — with GitHub Actions Failure

hatch fmt --formatter

c5328e0

github-actions bot added size/l and removed size/l labels Nov 5, 2025

github-actions bot added the size/l label Nov 6, 2025

Update imports

2a2861b

github-actions bot added size/l and removed size/l labels Nov 6, 2025

mehtarac had a problem deploying to auto-approve November 6, 2025 15:00 — with GitHub Actions Failure

awsarron reviewed Nov 6, 2025

src/strands/experimental/bidirectional_streaming/tests/test_bidi.py (resolved)

mkmeral reviewed Nov 6, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (outdated, resolved)

src/strands/experimental/bidirectional_streaming/agent/agent.py (outdated, resolved)

awsarron reviewed Nov 6, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (resolved)

awsarron reviewed Nov 6, 2025

src/strands/experimental/bidirectional_streaming/tests/test_bidi.py (resolved)

Unshure reviewed Nov 6, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (outdated, resolved)

src/strands/experimental/bidirectional_streaming/tests/test_bidi.py (outdated, resolved)

awsarron reviewed Nov 6, 2025

pgrayy reviewed Nov 6, 2025

src/strands/experimental/bidirectional_streaming/agent/agent.py (outdated, resolved)

Update implementation based on bar-raising

0a63829

- Remove adapter from constructor
- Implement BidirectionalIO interface
- Add adapter to the run() method

github-actions bot added size/xl and removed size/l labels Nov 7, 2025

mehtarac had a problem deploying to auto-approve November 7, 2025 16:11 — with GitHub Actions Failure

pgrayy reviewed Nov 7, 2025

src/strands/experimental/bidirectional_streaming/io/audio.py (resolved)

mkmeral reviewed Nov 9, 2025

mehtarac added 3 commits November 9, 2025 16:55

Updates: make ToolCaller private, minor updates based on PR comments

8d9a298

Update: file names, locations, and ToolCaller class name

73416d7

Update method names imports for io.py and audio.py and their dependencies

a49273b

github-actions bot added size/xl and removed size/xl labels Nov 9, 2025

mehtarac had a problem deploying to auto-approve November 9, 2025 22:23 — with GitHub Actions Failure

pgrayy reviewed Nov 10, 2025

mehtarac mentioned this pull request Nov 10, 2025

Event Types #20

Merged

pgrayy approved these changes Nov 10, 2025

mehtarac merged commit fd11282 into main Nov 10, 2025
1 of 13 checks passed
