On December 27, 2023, The New York Times filed a lawsuit against OpenAI and Microsoft for copyright infringement.
The lawsuit alleges that OpenAI and Microsoft’s “generative AI tools rely on large-language models that were built by copying and using millions of the Times’ copyrighted news articles” without permission or payment.
In addition, The New York Times alleges that ChatGPT will regurgitate its content “verbatim” and concludes that the two tech giants and their use of artificial intelligence (AI) are responsible for “billions of dollars in statutory and actual damages.”
This high-profile case comes just months after a group of popular U.S. authors, including John Grisham, George R.R. Martin, and Jodi Picoult, launched a class action lawsuit against OpenAI in September last year.
What Does This Mean for OpenAI?
The NYT vs. OpenAI case suggests that AI vendors using intellectual property and copyrighted material to train their models without requesting permission from or paying rights holders could be doing so on borrowed time.
“The lawsuit could set a precedent for legal boundaries around using copyrighted materials in AI development and likely lead to stricter regulations and guidelines,” Joseph Thacker, principal AI engineer and security engineer at SaaS security provider AppOmni, told Techopedia.
“It could also lead to a shift in the tech industry, with companies becoming more cautious in their use of copyrighted materials — maybe even being more cautious of using OpenAI’s services.”
Thacker also noted that tech firms may struggle to show that their use of copyrighted materials in AI training falls under “fair use,” and may be forced to prove that their solutions don’t replicate high volumes of original content, which could ultimately make LLMs less useful.
The Start of a New LLM Economy?
While there’s still a long way to go before this case reaches court, many commentators have been quick to highlight what happened to music-sharing service Napster, which a federal court ordered to shut down after it was found to have enabled the sharing of copyrighted music, as part of a $20 billion lawsuit led by the RIAA.
“Read some history of what happened last time when a company—Napster—briefly enabled infringement at mass scale,” wrote founder and CEO of Geometric Intelligence Gary Marcus in a post on X.
“The company went bankrupt, and the world respected copyright law and created a new business model in which artists and publishers got compensated in a new way in a new era,” Marcus said.
Although it would be a stretch to say that OpenAI is at risk of going the way of Napster, The New York Times case is already igniting a conversation not just around the rights of copyright holders but potentially how they’re compensated as part of a new era of LLM-publisher arrangements.
OpenAI and other AI vendors already recognize that partnerships with publishers are critical to avoiding legal exposure.
For example, OpenAI is reportedly offering publishers between $1 million and $5 million to use their news articles to train its LLMs. Apple is also said to be discussing licensing deals with publications like NBC News and IAC worth up to $50 million.
The Long-Term Implications
If the lawsuit isn’t settled out of court and OpenAI and/or Microsoft are found in violation of copyright law, then they could be ordered to destroy the offending data and/or fined per infringement. Either of these outcomes would be highly damaging, both financially and competitively.
At this stage, it’s impossible to tell whether this case will continue to its conclusion or reach a settlement. In any case, AI vendors, publishers, and copyright holders will remain in a state of uncertainty around their rights and responsibilities until a legal precedent is established.
When it comes to the tentative partnerships between AI vendors and publishers, The New York Times lawsuit will inevitably give the latter group more leverage so that they can negotiate better financial arrangements for themselves.
The Bottom Line
The NYT vs. OpenAI case has the potential to be critical in defining how AI developers can scrape and process publicly available data. While there’s still a long way to go before the case is resolved, it has already made clear that the status quo of animosity between publishers and vendors over copyright cannot go on.