Advertisement
Advertisement
Advertisement
Futurism

New Law Would Force AI Companies to Reveal Copyrighted Training Data

Sharon Adarlo
2 min read
Generate Key Takeaways

Training Day

The war over the use of copyrighted material in AI has opened up a new front, Billboard reports, after Representative Adam Schiff (D-CA.) introduced a bill this week that would force companies to reveal if they used copyrighted content in their training data sets for machine learning.

The proposed Generative AI Copyright Disclosure Act, if passed, would require businesses to disclose a list of all the copyrighted content they used in a training data set for any new models at least 30 days before they are made public. For current models already being used by consumers, businesses would still be required to disclose copyrighted content if they alter training data "in a significant manner."

Companies would have to file the disclosure with the Register of Copyrights or face a potential monetary penalty for non-compliance.

Advertisement
Advertisement

"We must balance the immense potential of AI with the crucial need for ethical guidelines and protections," said Schiff in a statement. "My Generative AI Copyright Disclosure Act is a pivotal step in this direction."

Fair Cut

In the same statement, the proposed legislation was applauded by an array of representatives from various creative sectors such as the Recording Industry Association of America and the Hollywood professionals in SAG-AFTRA.

The fight over the use of copyrighted material has already sparked lawsuits against companies like OpenAI and its partner Microsoft, such as by The New York Times, which filed a suit against the two in December.

AI companies and their proponents have claimed that using copyright material in training data sets to generate new content like AI images and songs is an example of fair use, meaning they shouldn't have to pay artists, writers, and other owners of the copyrighted material they use.

Advertisement
Advertisement

But that doesn't sit well with many people in the creative industry.

While the bill doesn't outright ban the use of copyrighted material in AI training data sets, a forced public disclosure may spur more lawsuits from copyright holders — as well as anger and anxiety from a public who are skeptical of AI's expanding use.

If we go by from the loud booing at a pro-AI sizzle reel at a recent SXSW event, things are truly heating up in the court of public opinion and that may chart the future course of the AI industry.

More on AI: When AI Is Trained on AI-Generated Data, Strange Things Start to Happen

Solve the daily Crossword

The daily Crossword was played 12,580 times last week. Can you solve it faster than others?
CrosswordCrossword
Crossword
Advertisement
Advertisement
Advertisement