Study: OpenAI's ChatGPT and GPT-4 'memorized' these books

from The Register, 11 months ago

Boffins at the University of California, Berkeley, have delved into the undisclosed depths of OpenAI's ChatGPT and the GPT-4 large language model at its heart, and found they're trained on text from copyrighted books.
Academics Kent Chang, Mackenzie Cramer, Sandeep Soni, and David Bamman describe their work in a paper titled, "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4."
"We find that OpenAI models have memorized a wide collection of copyrighted materials, and that the degree of memorization is tied to the frequency with which passages of those books appear on the web," the researchers explain in their paper.



