Adversarial Prompting: Tutorial and Lab

To learn more about Prompt Engineering and Prompt Injections I put together this tutorial + lab for myself. It is as a Jupyter Notebook to experiement and play around with this novel attack technique, learn and experiment.

The examples reach from simple prompt engineering scenarios, such as changing the output message to a specific text, to more complex adversarial prompt challenges such as JSON object injection, HTML injection/XSS, overwriting mail recipients or orders of an OrderBot and also data exfiltration.

The Colab Notebook is located here.

Its basic, but really fun to play around with.

Tutorial + Lab Walkthrough

Additionally, I recorded this guided explanation of Prompt Engineering Techniques and Prompt Injection challenges to continue raising awareness of this rising problem.

Outline of the video

  • Intro & Setup
  • Summarizations and Extractions
  • GPT 3.5. Turbo vs GPT 4 5:55
  • Inference and JSON Object Injection
  • HTML/XSS + Data Exfiltration Scenarios
  • Expansion Prompts

A quick reminder on why this attack is so powerful:

Bypassing Input Validation

Attack payloads are natural language. This means there are lots of creative ways an adversary can inject malicious data that bypass input filters and web application firewalls.

Leveraging the Power of AI for Exploitation

Depending on scenario attacks can include JSON object injections, HTML injection, Cross Site Scripting, overwriting orders of an order chat bot and even data exfiltration (and many others) all with the power of AI and LLMs.

Thanks.