[BUG] Function Key Handling For Stagehand Agent #686
base: main
PR Summary
Fixed key combination handling in the Stagehand agent to properly manage CTRL+A and DEL sequences when clearing input fields, particularly for OpenAI's computer-use-preview model.
- Added special case in /lib/handlers/agentHandler.ts to handle meta key combinations by pressing keys simultaneously
- Modified key mapping to use Meta on macOS and Control on other platforms
- Fixed issue where CTRL+A was incorrectly typing extra 'A' character in input fields
- LIMITATION: Implementation only works on Linux and macOS, not Windows
- IMPORTANT: Ensure users are aware of platform-specific behavior for key combinations
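The platform-dependent mapping described in the bullets above can be sketched as a small pure function. This is only an illustration under stated assumptions: `toPlaywrightModifier` is a hypothetical helper that does not appear in the PR, it matches key names exactly rather than with `includes`, and the platform is passed in as a string (in Node this would typically be `process.platform`) so the logic can be exercised for any OS.

```typescript
// Hypothetical helper (not from the PR): map a model-emitted modifier name
// to a Playwright key name for the given platform string.
function toPlaywrightModifier(key: string, platform: string): string | null {
  const upper = key.toUpperCase();
  if (upper === "CTRL" || upper === "CONTROL") {
    // On macOS the "select all" style shortcuts use Command (Meta);
    // on every other platform they use Control.
    return platform === "darwin" ? "Meta" : "Control";
  }
  if (upper === "CMD" || upper === "COMMAND" || upper === "META") {
    return "Meta";
  }
  // Not a modifier: the caller falls back to its normal key conversion.
  return null;
}

console.log(toPlaywrightModifier("CTRL", "darwin")); // Meta
console.log(toPlaywrightModifier("CTRL", "win32")); // Control
console.log(toPlaywrightModifier("a", "linux")); // null
```

Passing the platform as an argument, rather than reading it inside the function, keeps the Windows branch testable from a macOS or Linux machine.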
1 file(s) reviewed, 2 comment(s)
lib/handlers/agentHandler.ts
Outdated
const playwrightKeys = keys.map((key) => {
  if (key.includes("CTRL")) return "Meta";
  if (key.includes("CMD") || key.includes("COMMAND"))
    return "Meta";
  return this.convertKeyName(key);
logic: Always mapping CTRL to Meta will break Windows functionality. Need to handle Windows platform separately instead of ignoring it.
- const playwrightKeys = keys.map((key) => {
-   if (key.includes("CTRL")) return "Meta";
-   if (key.includes("CMD") || key.includes("COMMAND"))
-     return "Meta";
-   return this.convertKeyName(key);
+ const playwrightKeys = keys.map((key) => {
+   if (key.includes("CTRL")) return process.platform === "darwin" ? "Meta" : "Control";
+   if (key.includes("CMD") || key.includes("COMMAND"))
+     return "Meta";
+   return this.convertKeyName(key);
lib/handlers/agentHandler.ts
Outdated
// Press all keys down in sequence
for (const key of playwrightKeys) {
  await this.stagehandPage.page.keyboard.down(key);
}

// Small delay to ensure the combination is registered
await new Promise((resolve) => setTimeout(resolve, 100));

// Release all keys in reverse order
for (const key of playwrightKeys.reverse()) {
  await this.stagehandPage.page.keyboard.up(key);
}
style: Key release order should match press order for consistent behavior. Currently pressing in forward order but releasing in reverse.
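To make the two release orders discussed in this comment concrete, here is a minimal sketch that supports both. Everything named here is an assumption for illustration: `KeyboardLike` is a hypothetical interface covering only the `down` and `up` methods used in the snippet above, `pressChord` and `recordingKeyboard` are not part of the PR, and the 100 ms delay is omitted. The sketch also copies the array before reversing it, since `Array.prototype.reverse()` reverses in place and would otherwise reorder the caller's array.

```typescript
// Hypothetical minimal interface: just the two keyboard methods used here.
interface KeyboardLike {
  down(key: string): Promise<void>;
  up(key: string): Promise<void>;
}

// Hold every key down, then release them in reverse or in press order.
async function pressChord(
  keyboard: KeyboardLike,
  keys: string[],
  releaseInReverse: boolean = true,
): Promise<void> {
  for (const key of keys) {
    await keyboard.down(key);
  }
  // Copy first: reverse() mutates the array it is called on.
  const releaseOrder = releaseInReverse ? [...keys].reverse() : keys;
  for (const key of releaseOrder) {
    await keyboard.up(key);
  }
}

// Fake keyboard that records events so the order can be inspected.
function recordingKeyboard(): { keyboard: KeyboardLike; events: string[] } {
  const events: string[] = [];
  const keyboard: KeyboardLike = {
    async down(key: string): Promise<void> {
      events.push(`down:${key}`);
    },
    async up(key: string): Promise<void> {
      events.push(`up:${key}`);
    },
  };
  return { keyboard, events };
}
```

With `["Control", "a"]` and the default reverse release, the recorded sequence is `down:Control`, `down:a`, `up:a`, `up:Control`; passing `false` as the third argument releases `Control` first instead.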
why
OpenAI's computer-use-preview model tends to clear an input field before typing into it by sending CTRL+A followed by DEL. However, the agent handler does not handle the meta keys properly, so an extra 'A' ends up typed in the field.
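The failure mode can be illustrated with a toy model. This is a simplified simulation, not Stagehand or Playwright code: `ToyField`, `pressSeparately`, and `pressTogether` are hypothetical names, and the model assumes the extra character appears because each key is pressed and released on its own instead of being held as a combination.

```typescript
// Toy text field (hypothetical): tracks held modifiers and a "select all" flag.
class ToyField {
  text: string;
  allSelected = false;
  private held = new Set<string>();

  constructor(text: string) {
    this.text = text;
  }

  down(key: string): void {
    if (key === "Control" || key === "Meta") {
      this.held.add(key);
      return;
    }
    const modifierHeld = this.held.size > 0;
    if (modifierHeld && key.toLowerCase() === "a") {
      this.allSelected = true; // modifier + A selects everything
      return;
    }
    if (key === "Delete") {
      if (this.allSelected) {
        this.text = "";
        this.allSelected = false;
      }
      return;
    }
    if (key.length === 1 && !modifierHeld) {
      // A plain character replaces the selection or is appended.
      this.text = this.allSelected ? key : this.text + key;
      this.allSelected = false;
    }
  }

  up(key: string): void {
    this.held.delete(key);
  }
}

// Each key is pressed and released before the next one starts.
function pressSeparately(field: ToyField, keys: string[]): void {
  for (const key of keys) {
    field.down(key);
    field.up(key);
  }
}

// All keys are held down together, then released in reverse order.
function pressTogether(field: ToyField, keys: string[]): void {
  for (const key of keys) field.down(key);
  for (const key of [...keys].reverse()) field.up(key);
}

const broken = new ToyField("hello");
pressSeparately(broken, ["Control", "a"]);
pressSeparately(broken, ["Delete"]);
console.log(JSON.stringify(broken.text)); // "helloa"

const fixed = new ToyField("hello");
pressTogether(fixed, ["Control", "a"]);
pressTogether(fixed, ["Delete"]);
console.log(JSON.stringify(fixed.text)); // ""
```

In the first run the modifier is already released when `a` arrives, so `a` is typed as text and nothing is selected for DEL to remove. In the second run the combination selects the whole field and DEL clears it.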
what changed
Added a special case in the agent handler for key arrays that include a meta key.
caveats
[IMPORTANT] This PR completely ignores Windows machines and only works on Linux and macOS.