feat: add press_key tool for keyboard input #246
Add WebP format to screenshot tool, providing superior compression compared to JPEG. Also fixes bug where saveTemporaryFile always saved files with .png extension regardless of format.
Add press_key tool that supports single keys and key combinations with modifiers.
Features:
- Single key press (e.g., "Enter", "Escape", "Tab")
- Key combinations with modifiers (e.g., "Control+A", "Control+Shift+T")
- Edge case handling (e.g., "Control++" for plus key with modifier)
Implementation:
- Added splitKeyCombo() helper function to parse key combinations
- Handles modifier keys: Control, Shift, Alt, Meta
- Presses modifiers in order, releases in reverse order
- Includes comprehensive tests for all scenarios
- Updated documentation with 27 tools (was 26)
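The parsing and ordering rules described in this commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: the body of splitKeyCombo() is an assumption about how such a helper might work, and planKeyEvents() and the KeyEvent type are hypothetical names introduced here to make the press/release order visible without a real browser.

```typescript
// Hypothetical event record used only for this illustration.
type KeyEvent = { type: 'down' | 'up' | 'press'; key: string };

// Assumed behavior: split on '+', but treat a '+' that appears where a key
// name is expected as the plus key itself (so "Control++" is Control and "+").
function splitKeyCombo(combo: string): string[] {
  const keys: string[] = [];
  let current = '';
  for (let i = 0; i < combo.length; i++) {
    const ch = combo.charAt(i);
    if (ch === '+' && current !== '') {
      // A '+' after a non-empty token is a separator.
      keys.push(current);
      current = '';
    } else {
      // Any other character, or a '+' with no pending token, is part of a key name.
      current += ch;
    }
  }
  keys.push(current);
  return keys;
}

// Hypothetical helper: list the events a press_key call would issue.
// Modifiers go down in the order given, the final key is pressed,
// then the modifiers are released in reverse order.
function planKeyEvents(combo: string): KeyEvent[] {
  const tokens = splitKeyCombo(combo);
  const key = tokens[tokens.length - 1] ?? '';
  const modifiers = tokens.slice(0, -1);
  const events: KeyEvent[] = [];
  for (const m of modifiers) {
    events.push({ type: 'down', key: m });
  }
  events.push({ type: 'press', key });
  for (const m of modifiers.slice().reverse()) {
    events.push({ type: 'up', key: m });
  }
  return events;
}
```

Under these assumptions, splitKeyCombo('Control++') yields ['Control', '+'], and planKeyEvents('Control+Shift+T') yields down Control, down Shift, press T, up Shift, up Control. The sketch does not validate that the leading tokens are actually Control, Shift, Alt, or Meta.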
thanks for the PR! left a few comments
- Move splitKeyCombo to src/third_party/playwright/keyboard.ts
- Add Playwright LICENSE and README with source attribution
- Replace @ts-expect-error with proper KeyInput type casts
- Update press_key description to clarify when to use vs fill()
- Standardize input tool schema descriptions using uidSchema
Updated to address feedback: moved keyboard utilities to third_party/playwright with proper attribution, replaced type suppressions with casts, and clarified when to use press_key vs fill().
schema: {
from_uid: z.string().describe('The uid of the element to drag'),
to_uid: z.string().describe('The uid of the element to drop into'),
from_uid: z.string().describe('Element uid to drag'),
please revert changes to the descriptions of unrelated tools. Does it make sense to extract the uid schema? I would suggest we revert this part as well.
Summary
Adds a new press_key tool that enables keyboard input automation, supporting both single keys and key combinations with modifiers.
Features
Implementation
splitKeyCombo() helper function to parse key combinations
Test Results
All tests passing:
Test Plan