Prompt injection: what it is and why ChatGPT is vulnerable

With the expression prompt injection refers to a form of cyber attack that exploits the capabilities of a natural language processing system (NLP, Natural Language Processing) in order to insert unauthorized commands or instructions into the input data.

According to Johann Rehbergerexpert researcher in security issues, ChatGPT Plus would be exposed to phenomena of prompt injection e data exfiltration. In other words, an attacker can inoculate an intentionally designed prompt to elicit responses that put personal data and security at risk. confidential information of other subjects.

ChatGPT leverage OpenAI’s artificial intelligence and generative models to write working programming code. The code thus generated is executed on the cloud, on OpenAI servers, within a protected environment (sandbox).

The ChatGPT Plus sandbox is exposed to prompt injection attacks

As Rehberger explained, the environment sandboxed on which its ChatGPT operation is based, is in fact vulnerable to attacks prompt injection e data exfiltration. This last expression means that a malicious user can acquire data referable to other people’s subjects, such as login credentialsauthorization tokens and other information that should remain secret.

ChatGPT Plus allows the loading of any type of file: just click on the icon that represents a small paperclip. For each chat session, the chatbot creates a new one virtual machine Ubuntu Linux with a home directory called /home/sandbox. The user’s personal files are stored in the folder instead /mnt/data.

Although ChatGPT Plus does not offer a terminal window actual, you can specify how prompt the only command ls /mnt/data to get the list of files present in the user’s folder.

The fact of the matter is that by pasting the URL of a web page containing “ad hoc” instructions, an attacker can extract the user’s personal data from the folder /mnt/data and receive them with a simple request transmitted via URL.

How the prompt injection attack works via a web page

A concrete example? Suppose you have passed a file to ChatGPT Plus env_vars.txt containing API keys and passwords. The attacker could craft a malicious web page containing instructions aimed at convert file data present in the folder/mnt/datatherefore also included env_vars.txtin one format URL-encoded to request automatic sending to a server controlled by the attacker himself.

Simply pasting the URL of the malicious site into the ChatGPT window and pressing Enter, the chatbot could then find and interpret the instructions contained in the page share data user’s personal data stored in the sandbox. The unauthorized sending of information occurs by connecting ChatGPT to a URL like this //[DATI_URL_ENCODED].

The attack could also start from legitimate web pages, exploiting “faulty” plugins or abusing the comment systems.


Please enter your comment!
Please enter your name here