Skip to content

Conversation

@tsmithsz
Copy link
Contributor

@tsmithsz tsmithsz commented Jul 29, 2025

Problem

A critical security vulnerability was discovered where attackers can exploit ASCII smuggling to bypass frontend input validation. Users can send prompts that appear empty in the UI but contain hidden instructions encoded using Unicode tag characters (U+E0000-U+E007F range). These manipulated prompts bypass frontend checks and are processed by the backend LLM

Solution

Implemented input sanitization to detect and remove Unicode tag characters and other potentially dangerous invisible characters before processing user input. The sanitization:

  • Removes Unicode tag characters (U+E0000-U+E007F) commonly used in ASCII smuggling

  • Strip other invisible/control characters that could be used for similar attacks

Testing

Screen.Recording.2025-07-29.at.4.35.03.PM.mov
Untitled.mov

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tsmithsz tsmithsz requested a review from a team as a code owner July 29, 2025 23:38
@tsmithsz tsmithsz closed this Jul 29, 2025
@tsmithsz tsmithsz reopened this Jul 30, 2025
@codecov-commenter

This comment was marked as outdated.

@tsmithsz tsmithsz merged commit bf8a1e6 into aws:main Jul 30, 2025
6 checks passed
@tsmithsz tsmithsz deleted the malicious-prompt branch August 1, 2025 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants