Image to Prompt Injection with Google Bard

One prompt injection scenario that I, and others, have wondered about is the risk that comes with chatbots being able to analyze images.

Could this ability open up the way for Indirect Prompt Injection attacks?

Recently, Google added the ability to upload and analyze images with Bard. It turns out that it is indeed possible to add instructions to an image and have Bard follow them.

Here is a demonstration picture doing a Rickroll:

[Image: AI Prompt Injection - Rickroll]

And this is the result after asking Bard to describe the image:

[Image: AI Prompt Injection - Rickroll]

It will be interesting to explore how well text can be hidden in an image while still causing an injection, and whether there are other places, such as metadata, from which text is extracted.

References

  • Base images (bot and banana pictures) generated with Bing Chat (then modified by me)
  • Initial tweet showing proof-of-concept