In a secure, internet-isolated room somewhere in the United States, lawyers for The New York Times are examining the source code for ChatGPT under extraordinary security measures. This unprecedented access, ordered by a federal judge, requires government-issued IDs, prohibits personal devices, and mandates that all notes be taken on wiped computers. The examination is central to one of the most significant copyright lawsuits in AI history, pitting the 173-year-old media institution against OpenAI and Microsoft.
OpenAI, now valued at $157 billion, built ChatGPT by training its models on vast quantities of text it obtained without payment—including over 10 million New York Times articles, content from other publications, and countless copyrighted books. The Times argues this constitutes copyright infringement in two ways: the “input” case (illegally using articles for training) and the “output” case (ChatGPT reproducing Times content that readers would otherwise pay to access).
The Times has enlisted Susman Godfrey, the elite law firm that secured Dominion’s $787.5 million settlement from Fox News, to lead the charge. Other newsrooms, including the New York Daily News and Mother Jones, have joined the fight. Separately, prominent authors including George R.R. Martin, Jodi Picoult, and Jia Tolentino have filed their own copyright claims that, if granted class-action status, could affect virtually every author whose work was used to train AI models.
Publishers and artists have filed approximately two dozen major copyright lawsuits against generative AI companies, demanding compensation for the economic value that propelled OpenAI to dominance and pushed Microsoft’s valuation beyond $3 trillion. “Developers should pay for the valuable publisher content that is used to create and operate their products,” a Times spokesperson stated.
OpenAI and Microsoft defend themselves under the “fair use” doctrine, arguing that their models’ ingestion of articles is legally protected. They claim that ChatGPT outputs containing near-verbatim Times articles are “highly anomalous” and unrepresentative of typical usage. The companies also distinguish themselves from Napster, which was sued out of existence for illegally copying millions of songs.
With Congress taking a backseat on AI regulation, courts are expected to set the rules. Legal experts predict these cases, particularly the Times lawsuit and the authors’ class actions, could reach the Supreme Court and establish binding precedent for how large language models may be trained in the United States. The fundamental question: Does AI training create “copies” in a meaningful legal sense, and is the process sufficiently “transformative” to qualify as fair use?
Key Quotes
“Developers should pay for the valuable publisher content that is used to create and operate their products. The future success of this technology need not come at the expense of journalistic institutions.”
A New York Times spokesperson articulated the core argument of publishers worldwide: that AI companies have built billion-dollar valuations by freely using content that cost millions to produce, and that fair compensation is both legally required and economically necessary for journalism’s survival.
“Instead of kids, it was a sophisticated company. And instead of doing it for their own personal use, they were doing it for commercial gain.”
Justin Nelson, a Susman Godfrey attorney representing authors in parallel lawsuits, argued that OpenAI’s actions are worse than Napster’s illegal music sharing. While Napster was created by college students, OpenAI is a sophisticated, Microsoft-backed company that used copyrighted content to build a commercial product worth billions.
“Journalism is kind of the canary in the coal mine. In the same way that music was the canary back in the Napster days, because people could easily torrent an MP3. But you couldn’t, at that time, easily torrent a film.”
Georgetown University law professor Kristelia García explained why journalism faces AI disruption first: AI can already convincingly mimic journalistic writing, something it cannot yet do for film and other complex creative works, making news organizations the first battleground in what will likely become an industry-wide copyright conflict.
“I think that’s going to be the big question at the end of the day that’s going to go all the way up to the Supreme Court. That question of fair use around training data, ingesting and training.”
Cleveland State University intellectual property professor Christa Laser identified the central legal question that will likely reach the Supreme Court: whether AI training on copyrighted material constitutes “fair use” or illegal copying, a determination that will shape the entire AI industry’s future.
Our Take
This lawsuit represents a collision between technological innovation and intellectual property rights that will define AI’s trajectory. The extraordinary security measures surrounding ChatGPT’s source code examination underscore how much is at stake—OpenAI is protecting the crown jewels that justify its $157 billion valuation.
The Napster comparison is apt but incomplete. Napster ultimately catalyzed the music industry’s digital transformation into streaming services. Similarly, these lawsuits may force AI companies and content creators toward new licensing models and revenue-sharing arrangements rather than the current “take first, ask permission never” approach.
What’s particularly significant is the potential class-action status for authors’ lawsuits, which could affect millions of creators. A settlement involving equity stakes or ongoing royalties could establish entirely new paradigms for how AI companies compensate the human creativity that powers their models. The outcome will either validate AI companies’ current practices or force a fundamental reckoning with how they source training data.
Why This Matters
This lawsuit represents a defining moment for the AI industry’s future. The outcome will determine whether AI companies can continue training models on copyrighted content without compensation, or whether they must fundamentally restructure their business models to pay content creators. With OpenAI valued at $157 billion and Microsoft worth more than $3 trillion, thanks in no small part to AI capabilities built on content obtained without payment, the financial stakes are enormous.
The case could establish Supreme Court precedent affecting every generative AI company and content creator in America. If the Times prevails, AI companies may face billions in damages and be forced to negotiate licensing deals with publishers, authors, and artists—fundamentally changing the economics of AI development. Conversely, a victory for OpenAI could cement “fair use” protections that allow continued unrestricted training.
Beyond immediate parties, this battle will shape how AI interacts with journalism, creative industries, and intellectual property rights for decades. As legal expert Kristelia García notes, “journalism is kind of the canary in the coal mine”—the first industry where AI can convincingly replicate human work. The resolution here will set templates for music, film, art, and other creative fields facing similar AI disruption.
Related Stories
- OpenAI CEO Sam Altman Hints at Potential Restructuring in 2024
- OpenAI’s Valuation Soars as AI Race Heats Up
- Sam Altman’s Bold AI Predictions: AGI, Jobs, and the Future by 2025
- Elon Musk Drops Lawsuit Against ChatGPT Maker OpenAI, No Explanation
- Elon Musk Warns of Potential Apple Ban on OpenAI’s ChatGPT
Source: https://www.businessinsider.com/ai-future-copyright-lawsuits-new-york-times-times-authors-2024-10