Amp Code: Invisible Prompt Injection Fixed by Sourcegraph

Posted on Aug 16, 2025

#llm #agents #month of ai bugs

In this post we will look at Amp, a coding agent from Sourcegraph. The other day we discussed how invisible instructions impact Google Jules.

Turns out that many client applications are vulnerable to these kinds of attacks when they use models that support invisible instructions, like Claude.

Invisible Unicode Tag Characters Interpreted as Instructions

We have talked about hidden prompt injections quite a bit in the past, and so I’m keeping this short.

Amp was vulnerable to interpreting invisible Unicode Tag characters as instructions. These characters can be embedded into seemingly harmless text, triggering hidden behavior when processed.

An end-to-end exploit shows how these characters can be used to:

Trigger a command to search and dump an environment variable (grep).
Encode the environment variable into a URL query parameter.
Invoke the read_web_page or trigger image rendering via markdown syntax to exfiltrate the secret.

Here is a graphic that shows the individual steps:

Notice how I added the text “Do this now:” as visible preamble, interestingly this helped quite a bit in making it more reliable.

Also, the reason the grep command looks a little cryptic is because when directly asking for an API key, models often refuse to include the secret into a URL. By being a little bit more vague, e.g. searching for .e* and ANT we still get the desired result of finding the ANTHROPIC_API_KEY that the demo leaks.

Vulnerable Models

I wasn’t sure which model Amp is using, but I’m assuming it’s Claude or Gemini as those two are pretty good in interpreting hidden Unicode Tag characters as instructions. Grok does it also. OpenAI removes these characters.

All major model vendors are aware of this behavior, I reported to each of them that their models interpret to humans invisible Unicode Tag code points as instructions long time ago - most of them 16+ months ago. Originally, this was discovered by Riley Goodside with ChatGPT by the way. And OpenAI seems the only vendor that mitigated it at the model core/API layer.

If you build an application using LLMs, make sure to consider and mitigate this attack vector.

Recommendations

Here are some of the recommendations provided to the Amp team, in case it’s useful for your own applications or reports:

Strip or neutralize Unicode Tag characters before processing any input
Add visual and technical safeguards against invisible prompts
Include automated detection of suspicious Unicode usage in prompt injection monitors
Human-in-the-loop before navigating to untrusted third-party domains
Mitigate downstream data exfiltration vulnerabilities

Responsible Disclosure

After reaching out to Sourcegraph on June 14, 2025, the vulnerability was quickly addressed, so make sure to run the latest version. As far as I can tell it’s now just sanitizing the input, which seems a simple and effective fix.

Kudos to the Amp team for addressing this promptly.

Conclusion

This post showed how an adversary can use invisible Unicode Tag characters as instructions to hijack an agent and invoke capabilities.

Many models continue to, and are even getting better at, interpreting hidden Unicode Tag code points as instructions. Hence it is important to raise awareness as it has now become a common AI application security vulnerability and application developers need to apply mitigations.

Cheers.

Embrace The Red