Programming

Google researchers force ChatGPT to share training data

Google researchers force ChatGPT to share training data

The discovery reached by some Google researchers may, in our opinion, represent a bit of a thorn in the side of OpenAI. Do you remember the firm stance taken by the company which observed that the contents produced by ChatGPT and the underlying generative models cannot be the subject of copyright disputes? OpenAI explained that its models of artificial intelligence they learn from existing data, as a real person would, but they do not memorize the information. The “secret” consists in establishing probabilistic relationships, through a system of weights, between the terms encountered during the learning phase.

But try examining ChatGPT’s response to this prompt unusual: the generative model first follows the indication provided in input and then begins to produce texts that clearly display the indication “copyright“, together with the reference to the corresponding rights holder.

The Google team’s discovery certainly won’t help legal dispute that a group of publishers, authors and writers have brought against OpenAI and which will now begin in the courtrooms overseas. It is easy to hypothesize, as we explain below, that OpenAI will “declassify” the incident to a bug or to a system vulnerability, already resolved or in the process of being fixed.

ChatGPT returns training data: quite a gaffe

As Google researchers explain in detail in the document Extract Training Data from ChatGPTit was enough to ask the chatbot to always repeat the same word (for example the terms “company” or “poem“) to mislead the application and lead it to share confidential information, such as the training data used by OpenAI.

The discovery is significant because ChatGPT is based on a model used in production. Asking the artificial intelligence OpenAI to repeat a specific word endlessly, ChatGPT is prone to revealing personal information, such as email addresses and phone numbers. Furthermore, the scholars note, approximately 5% of ChatGPT responses in the most advanced configuration consist of a direct copy of 50 consecutive tokens taken from its set of training data.

Google researchers estimate that it is possible extract about a gigabyte of training data from ChatGPT with a low monetary investment. They also urge both OpenAI and other companies developing AI-based solutions to test the models in production to verify that systems based on them do not contain vulnerabilities.

The correction of individuals exploit (as in the specific case of attacks “company” e “poem“) absolutely does not equate to a decisive intervention on the underlying vulnerabilities. While a filtro in output can prevent a specific attack, resolving the vulnerability that leads to the storage of training data is significantly more complex.

In the document drawn up by the researchers there are also references to training data which correspond to information published on the Web. The aim is to demonstrate that this is real data and not sentences generated in a pseudorandom way.

Leave a Reply

Your email address will not be published. Required fields are marked *