feat(pipecat-sdk): add speech-to-speech model support (Gemini Live) #680
base: main
Summary:
- Speech-to-speech support: auto-detect audio frames and inject memories to system prompt for native audio models (Gemini Live, etc.)
- Fix memory bloating: replace memories each turn using XML tags instead of accumulating
- Add temporal context: show recency on search results ([2d ago], [15 Jan])
- New inject_mode param: auto (default), system, or user
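The "replace instead of accumulate" behaviour can be sketched roughly as below. This is a minimal illustration, not the PR's code: `MEMORY_TAG_START` is a name that appears in the review, but its value here, the `MEMORY_TAG_END` constant, and the `replace_memory_block` helper are all assumptions made for the example.

```python
# Hypothetical tag values; the real constants in the SDK may differ.
MEMORY_TAG_START = "<user_memories>"
MEMORY_TAG_END = "</user_memories>"


def replace_memory_block(system_prompt: str, memory_text: str) -> str:
    """Return system_prompt with exactly one tagged memory block.

    If a complete tagged block already exists it is overwritten,
    so the prompt does not grow from turn to turn.
    """
    block = f"{MEMORY_TAG_START}\n{memory_text}\n{MEMORY_TAG_END}"

    start = system_prompt.find(MEMORY_TAG_START)
    end = -1
    if start != -1:
        # Only look for the closing tag after the opening tag.
        end = system_prompt.find(MEMORY_TAG_END, start + len(MEMORY_TAG_START))

    if start != -1 and end != -1:
        end += len(MEMORY_TAG_END)
        return system_prompt[:start] + block + system_prompt[end:]

    # No complete block yet: append one, separated from existing text.
    separator = "\n\n" if system_prompt else ""
    return f"{system_prompt}{separator}{block}"


prompt = replace_memory_block("You are helpful.", "likes tea")
prompt = replace_memory_block(prompt, "likes coffee")
print(prompt)
```

Requiring both the opening and the closing tag before replacing means a stray opening tag in ordinary text is left alone and a fresh block is appended instead.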
Deploying with
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | supermemory-app | dcb95c2 | Commit Preview URL Branch Preview URL | Jan 18 2026, 03:22 AM |
PR Review: Speech-to-Speech Support for Pipecat SDK

Summary
This PR adds intelligent speech-to-speech model support (e.g., Gemini Live) to the Pipecat SDK with three key improvements:

Code Quality ✅
Strengths:
Minor Issues:

Security Review ✅
No security issues found. Good patterns:

Performance Review ✅
Good performance patterns:

Logic & Correctness Review

Critical Issue 🚨 - Potential Bug: In

    if isinstance(frame, InputAudioRawFrame):
        if not self._audio_frames_detected:
            self._audio_frames_detected = True
        await self.push_frame(frame, direction)
        return

Problem: Once
Suggested Fix:

Minor Issue - Edge Case: In

    for i in range(len(messages) - 1, -1, -1):
        msg = messages[i]
        if msg.get("role") == "user" and MEMORY_TAG_START in msg.get("content", ""):
            messages.pop(i)
            break

Problem: If a user legitimately includes
Suggested Fix: Use the full

Temporal Context Feature Review ✅
The
Platform compatibility note: The

    return f"{dt.day} {dt.strftime('%b')}"

Test Coverage 📋
According to the instructions, this package requires pytest tests. However, the
Recommendation: Add tests for:

Documentation Review ✅
The documentation updates in

Breaking Changes

    -    """Format deduplicated memories into a text string for injection."""
    +    """Format deduplicated memories into a text string for injection.
    +    Search results include temporal context (e.g., '3d ago') from updatedAt.
    +    """
         sections = []

         static = memories["static"]

Bug: The format_relative_time function uses the %-d strftime directive, which is not supported on Windows, causing date formatting to fail silently and omit temporal context.
Severity: HIGH
Suggested Fix
Replace the platform-specific %-d directive with a cross-platform compatible method. One approach is to use dt.day to get the day as an integer and then format it into the string, for example: dt.strftime(f"{dt.day} %b").
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: packages/pipecat-sdk-python/src/supermemory_pipecat/utils.py#L44-L47
Potential issue: The `format_relative_time` function uses the `%-d` strftime directive
to format dates. This directive is a POSIX-specific extension and is not supported on
Windows, which uses `%#d` instead. When this code runs on a Windows machine, `strftime`
will raise a `ValueError`. A broad `except Exception` block catches this error and
returns an empty string, causing the temporal context (e.g., `[15 Jan]`) to be silently
omitted from memory search results for Windows users. This degrades the feature without
any warning.
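
A portable version of the date formatting might look like the sketch below. It is an illustration under stated assumptions, not the code in `utils.py`: the signature, the `now` parameter, the seven-day cutoff, and the "today" label are hypothetical; only the `[2d ago]` and `[15 Jan]` output styles and the `dt.day` approach come from this PR.

```python
from datetime import datetime, timezone
from typing import Optional


def format_relative_time(iso_timestamp: str, now: Optional[datetime] = None) -> str:
    """Return '3d ago' for recent timestamps and '15 Jan' for older ones.

    Avoids the POSIX-only '%-d' directive by reading dt.day directly,
    so the result is the same on Windows, Linux, and macOS.
    """
    try:
        # Accept a trailing 'Z' as UTC for older Python versions.
        dt = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        # Unparseable or non-string input: no temporal context.
        return ""

    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    now = now or datetime.now(timezone.utc)

    days = (now - dt).days
    if days == 0:
        return "today"  # hypothetical label for same-day results
    if 0 < days < 7:  # hypothetical cutoff between relative and absolute
        return f"{days}d ago"

    # dt.day is an int, so no zero padding; '%b' is the abbreviated month
    # name and depends on the current locale.
    return f"{dt.day} {dt.strftime('%b')}"


reference = datetime(2026, 1, 18, 3, 22, tzinfo=timezone.utc)
print(format_relative_time("2026-01-15T00:00:00Z", now=reference))
print(format_relative_time("2025-12-05T00:00:00Z", now=reference))
```

Catching only `ValueError` and `AttributeError` around the parse step, rather than a broad `except Exception` around the whole function, keeps a formatting bug like the `%-d` one from being swallowed silently.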

Added native speech-to-speech model support
Summary:
Docs update