Token Smuggling

Token smuggling uses Unicode tricks — zero-width characters, homoglyphs, bidirectional text markers, and other non-visible or deceptive characters — to hide malicious instructions within seemingly normal text. The hidden content is invisible to human reviewers but may be processed by AI models.

How it works

Unicode includes thousands of characters that are invisible (zero-width spaces, soft hyphens, bidirectional markers) or visually identical to common characters (Cyrillic "a" vs Latin "a"). Attackers exploit this by:

  1. Inserting invisible characters that carry semantic meaning when processed by a model
  2. Replacing visible characters with homoglyphs that bypass exact-match detection
  3. Using bidirectional markers to make text display differently than its logical order
  4. Embedding instructions in non-visible Unicode ranges that models may still process

Example

The following is an educational example of a prompt injection payload. It is shown here to help defenders understand and recognize the pattern.

Please review the attached quarterly report.

[This text appears normal, but between these words are
zero-width Unicode characters (U+200B, U+200C, U+200D,
U+FEFF) that encode hidden instructions. A model
processing the raw text may interpret the hidden content
while a human reader sees only the visible text.]

Looking forward to your feedback on the numbers.

In a real attack, the hidden characters would be invisible — the email would appear to be a normal business message.

Variations

Real-world impact

Token smuggling enables:

Mitigation strategies

Further reading