Token Smuggling

Token smuggling uses Unicode tricks (zero-width characters, homoglyphs, bidirectional text markers, and other non-visible or deceptive characters) to hide malicious instructions within seemingly normal text. The hidden or disguised content goes unnoticed by human reviewers but may still be processed by AI models.

How it works

Unicode includes many characters that are invisible when rendered (zero-width spaces, soft hyphens, bidirectional markers) or visually near-identical to common characters (Cyrillic "а", U+0430, vs Latin "a", U+0061). Attackers exploit this by:

Inserting invisible characters that carry semantic meaning when processed by a model
Replacing visible characters with homoglyphs that bypass exact-match detection
Using bidirectional markers to make text display differently than its logical order
Embedding instructions in non-visible Unicode ranges that models may still process
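
The gap between what is displayed and what is stored can be seen by listing code points. The following is a minimal sketch using Python's standard unicodedata module; the helper name describe and the sample strings are illustrative assumptions, not part of any particular tool.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


latin = "admin"
spoofed = "\u0430dmin"   # Cyrillic small a (U+0430) in place of Latin a
hidden = "ad\u200bmin"   # zero-width space inserted after "ad"

print(latin == spoofed)          # False, although both render alike
print(len(latin), len(hidden))   # 5 6
print(describe(spoofed)[0])      # U+0430 CYRILLIC SMALL LETTER A
print(describe(hidden)[2])       # U+200B ZERO WIDTH SPACE
```

All three strings look the same in most fonts, yet none of them compares equal to another.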

Example

The following is an educational example of a prompt injection payload. It is shown here to help defenders understand and recognize the pattern.

Please review the attached quarterly report.

[This text appears normal, but between these words are
zero-width Unicode characters (U+200B, U+200C, U+200D,
U+FEFF) that encode hidden instructions. A model
processing the raw text may interpret the hidden content
while a human reader sees only the visible text.]

Looking forward to your feedback on the numbers.

In a real attack, the bracketed note would not be there: the zero-width characters would sit invisibly between the visible words, and the email would appear to be a normal business message.
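
A defender can make such characters visible before reviewing a message. The sketch below assumes that replacing every character in Unicode general category Cf (format) with a visible marker is acceptable for review purposes; the function name reveal_invisible and the sample message are hypothetical.

```python
import unicodedata


def reveal_invisible(text: str) -> str:
    """Replace each format-category (Cf) character with a visible <U+XXXX> marker."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)


# Sample message with zero-width characters placed between visible words.
message = "Please review\u200b\u200c\u200d the attached\ufeff quarterly report."
print(reveal_invisible(message))
# Please review<U+200B><U+200C><U+200D> the attached<U+FEFF> quarterly report.
```

Category Cf also covers soft hyphens, bidirectional controls, and tag characters, so the same helper exposes those as well.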

Variations

Zero-width injection: Encoding instructions as binary, using zero-width space (U+200B) and zero-width non-joiner (U+200C) as the two bit values.
Homoglyph substitution: Replacing Latin characters with visually identical Cyrillic, Greek, or mathematical symbols to evade keyword matching (for example, "іgnоre" written with Cyrillic і, U+0456, and о, U+043E, in place of Latin i and o).
Bidirectional text: Using bidirectional control characters such as the right-to-left mark (U+200F), right-to-left embedding (U+202B), or right-to-left override (U+202E) to make text display in a different order than its logical sequence.
Invisible text: Using invisible format characters such as the word joiner and invisible operators (U+2060-U+2064) or the tag characters (U+E0001-U+E007F), which mirror ASCII and can therefore spell out hidden text.
Combining characters: Stacking Unicode combining marks to create visual noise that hides injected content.
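
The variations above can each be flagged with a simple heuristic. The following sketch is one possible approach, not a complete detector: the code point sets, the MAX_MARKS threshold, and the names script_of and classify are assumptions made for illustration, and the script check approximates a character's script from the first word of its Unicode name rather than using the real Script property.

```python
import unicodedata

ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0xFEFF}
# Bidirectional marks, embeddings, overrides, and isolates.
BIDI = {0x061C, 0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A)}
SCRIPTS = {"LATIN", "CYRILLIC", "GREEK", "MATHEMATICAL"}
MAX_MARKS = 2  # assumed threshold: longer runs of combining marks are flagged


def script_of(ch: str) -> str:
    """Approximate a letter's script from the first word of its Unicode name."""
    if not ch.isalpha():
        return ""
    first = unicodedata.name(ch, "").split(" ")[0]
    return first if first in SCRIPTS else ""


def classify(text: str) -> set[str]:
    """Return the names of the smuggling variations the text appears to use."""
    found = set()
    run = 0  # length of the current run of combining marks
    for ch in text:
        cp = ord(ch)
        if cp in ZERO_WIDTH:
            found.add("zero-width")
        elif cp in BIDI:
            found.add("bidirectional")
        elif 0x2060 <= cp <= 0x2064 or 0xE0000 <= cp <= 0xE007F:
            found.add("invisible")
        run = run + 1 if unicodedata.category(ch) in ("Mn", "Me") else 0
        if run > MAX_MARKS:
            found.add("combining")
    # A single word that mixes scripts suggests homoglyph substitution.
    for word in text.split():
        scripts = {script_of(ch) for ch in word} - {""}
        if len(scripts) > 1:
            found.add("homoglyph")
    return found


print(sorted(classify("ign\u043ere all pr\u200bior text")))  # ['homoglyph', 'zero-width']
print(classify("Looking forward to your feedback."))          # set()
```

Legitimate multilingual text can trigger these heuristics, so the result is better treated as a signal for review than as a verdict.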

Real-world impact

Token smuggling enables:

Completely invisible payloads that pass human review of email content
Bypassing any text-based detection system that operates on visible characters only
Evading copy-paste verification: invisible characters stay unseen when a human selects, copies, and pastes the visible text, so a manual check of the pasted text does not expose them
Creating emails that look entirely benign but carry hidden injection payloads
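
The detection bypass can be illustrated with a deliberately naive filter. This is a sketch under stated assumptions: BLOCKLIST and naive_filter are hypothetical stand-ins for any detector that matches phrases verbatim.

```python
# Hypothetical blocklist for a verbatim, case-insensitive phrase filter.
BLOCKLIST = ("ignore previous instructions",)


def naive_filter(text: str) -> bool:
    """Return True if any blocklisted phrase appears verbatim in the text."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKLIST)


plain = "Ignore previous instructions"
zero_width = "Ig\u200bnore previous instructions"   # zero-width space inside the word
homoglyph = "Ign\u043ere previous instructions"     # Cyrillic o (U+043E)

print(naive_filter(plain))       # True
print(naive_filter(zero_width))  # False
print(naive_filter(homoglyph))   # False
```

All three strings render the same way to a reader, but only the first is caught.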

Mitigation strategies

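Unicode normalization: Apply Unicode normalization (NFC or NFKC), then explicitly strip zero-width characters, bidirectional markers, and other non-visible Unicode from email content before processing. Normalization alone does not remove these characters; NFKC additionally folds compatibility forms such as fullwidth and mathematical letters to their plain equivalents.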
Character allowlisting: Only permit characters from expected Unicode ranges (basic Latin, common punctuation) and flag or remove unexpected characters.
Homoglyph detection: Use confusable detection (Unicode Technical Standard #39, Unicode Security Mechanisms) to identify characters that visually resemble, but differ from, the expected code points.
Raw-text inspection: Analyze the raw byte content of emails, not just the rendered display, to catch hidden characters.
Encoding consistency checks: Flag messages where the character encoding or Unicode composition is inconsistent with normal email content, for example mixed scripts within a single word or an unusually high share of format characters.
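
Several of these mitigations can be combined in a small preprocessing step. The sketch below covers normalization with stripping, allowlisting, and raw-byte inspection using only the Python standard library. The ALLOWED set, the RAW_MARKERS table, and the function names are assumptions chosen for illustration; a real deployment would tune the allowlist to its expected languages.

```python
import unicodedata

# Assumed allowlist: printable ASCII plus tab and newline.
ALLOWED = set(range(0x20, 0x7F)) | {0x09, 0x0A}

# UTF-8 byte sequences of a few invisible characters, for raw-byte inspection.
RAW_MARKERS = {
    b"\xe2\x80\x8b": "U+200B",
    b"\xe2\x80\x8c": "U+200C",
    b"\xe2\x80\x8d": "U+200D",
    b"\xef\xbb\xbf": "U+FEFF",
    b"\xe2\x80\xae": "U+202E",
}


def sanitize(text: str) -> str:
    """NFKC-normalize, then drop format (Cf) characters such as zero-width and bidi marks."""
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in normalized if unicodedata.category(ch) != "Cf")


def disallowed(text: str) -> list[str]:
    """List code points outside the allowlist, in order of first appearance."""
    seen = []
    for ch in text:
        label = f"U+{ord(ch):04X}"
        if ord(ch) not in ALLOWED and label not in seen:
            seen.append(label)
    return seen


def scan_raw(raw: bytes) -> list[str]:
    """Report which known invisible-character byte sequences occur in the raw bytes."""
    return [name for seq, name in RAW_MARKERS.items() if seq in raw]


raw = "Please re\u200bview the \uff52eport\u202e.".encode("utf-8")
text = raw.decode("utf-8")

print(scan_raw(raw))               # ['U+200B', 'U+202E']
print(disallowed(text))            # ['U+200B', 'U+FF52', 'U+202E']
print(sanitize(text))              # Please review the report.
print(disallowed(sanitize(text)))  # []
```

Stripping every Cf character is a blunt choice: zero-width joiners and non-joiners have legitimate uses in some scripts and in emoji sequences, so flagging rather than removing may suit multilingual mail better. NFKC does not map Cyrillic or Greek lookalikes to Latin, which is why the allowlist check is kept as a separate step.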