OpenAI says it is ‘impossible’ to train AI without using copyrighted works for free
ChatGPT company OpenAI has reportedly pleaded with the British parliament to allow it to use copyrighted works for free.
OpenAI told a committee that it was “impossible” to train its artificial intelligence model without using such data.
“Because copyright today covers virtually every sort of human expression – including blog posts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials,” OpenAI said, according to The Telegraph.
“Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens,” the company said in evidence submitted to the House of Lords communications and digital committee.
OpenAI’s ChatGPT AI tool has become popular since its launch in November 2022 as a language model capable of understanding and generating human-like responses to a wide range of user queries.
In a short span, the AI model has demonstrated major feats, such as the ability to summarise research studies, answer logical questions, and even crack business school and medical college entrance tests.
However, since ChatGPT’s launch, several companies such as The New York Times, as well as celebrities and authors like Sarah Silverman, Margaret Atwood, John Grisham and George RR Martin, have sued the AI firm for using their text without permission to train the AI system.
The Times alleged that “millions” of its news articles were used to train ChatGPT in a “massive copyright infringement, commercial exploitation and misappropriation” of the paper’s intellectual property, and that the AI tool now competes with the newspaper as an information source.
“If Microsoft and OpenAI want to use our work for commercial purposes, the law requires that they first obtain our permission. They have not done so,” The New York Times said.
Without the use of such copyrighted work, OpenAI “would have a vastly different commercial product,” Rachel Geman, an attorney in the class action suit filed against OpenAI by the Authors’ Guild and 17 authors, said.
“Defendants’ decision to copy authors’ works, done without offering any choices or providing any compensation, threatens the role and livelihood of writers as a whole,” Ms Geman said.
OpenAI, meanwhile, said it was seeking new partnerships with publishers, and has struck deals with the Associated Press and media giant Axel Springer to gain access to their content.
“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” an OpenAI spokesperson said last month.
In the new filing, OpenAI said it complied with copyright laws, adding that it believed “legally copyright law does not forbid training”.