GitHub Copilot Custom Instructions and Risks

GitHub Copilot has the capability to be augmented with custom instructions coming from the current repo, via the .github/copilot-instructions.md file.

copilot instructions

Pillar Security recently highlighted the risks associated with rules files. Their post discusses custom Cursor rules in ./cursor/rules ending in .mdc.

If you watch the demos, you’ll notice that they also have a GitHub Copilot demo which uses the GitHub specific copilot-instructions.md file.

GitHub Copilot Custom Instructions File

I’ve also been experimenting with the .github/copilot-instructions.md file a bit recently.

Here’s a demonstration I’ve been using: Github Instructions Contents

Now, if you ask Copilot to explain some code it adds these custom instructions to the prompt context, leading to the following inference result: Github Instructions

Luckily, GitHub mitigated the common 0-click image rendering data leakage issue last year, and for hyperlinks there is a confirmation dialog when clicking, unless the destination domain is the trusted sites list of VS Code.

This screenshot shows the mitigation for navigating to random phishing sites in action: Github Instructions - Default Hyperlink Mitigation

Developers might still be tricked, and there are other attack avenues… like adding backdoor code!

Backdooring Instructions

The most significant issue is that a single sneaky line in the instructions file can add backdoor code.

Here is a demo on how this works in Go, check this out:

Github Instructions - Add Code

So, Greetings from Jia Tan, I guess.

Pillar Security does a great job in highlighting additional threats and considerations, such as hidden Unicode characters.

It’s important to remain aware of what information is added to the prompt context, and the introduction of additional features by vendors can catch users and developers by surprise. Stay alert for any modifications to your custom instruction file.

And remember that Anthropic never mitigated usage of hidden Unicode Tags, and Claude is interpreting such hidden characters as instructions.

References