Cointime

OpenAI Ordered to Hand Over 20M ChatGPT Logs in NYT Copyright Case

In brief

  • The ruling compels OpenAI to provide 20 million chat logs after months of disputes over privacy, preservation, and scope.
  • Judge Ona T. Wang ruled that the sample size is “proportional” to what the case needs to prove whether ChatGPT outputs reproduced Times content.
  • The case joins a growing wave of copyright challenges aimed at how AI labs source and use training data.

A federal magistrate judge has ordered OpenAI to turn over roughly 20 million de-identified ChatGPT logs to The New York Times and other plaintiffs, deepening the AI development company’s exposure to an array of copyright and data governance disputes.

Issued on Wednesday in New York, the order denies OpenAI’s bid to block the production of user-chat records and directs the company to hand over the logs under a protective framework.

The outcome could shape how tech firms such as OpenAI, Anthropic, and Perplexity source training data, license content, and build guardrails around what their systems can output.

While the court “recognizes that the privacy considerations of OpenAI’s users are sincere,” such considerations “are only one factor in the proportionality analysis, and cannot predominate where there is clear relevance and minimal burden,” U.S. Magistrate Judge Ona T. Wang wrote.

The order stems from the Times’ ongoing lawsuit, which alleges that OpenAI’s models were trained on copyrighted news content without permission. The suit was first filed in December 2023.

In January 2024, OpenAI challenged the NYT’s claims and filed a countersuit, claiming that the publication was not “telling the full story.”

The court later found that the 20 million chat log samples in question are “proportional to the needs of the case” to assess whether ChatGPT outputs copied the NYT’s material.

Over the past year, the dispute has intensified, with plaintiffs pressing for broad access to output data, and OpenAI warning that expansive production of these materials would raise privacy and operational burdens.

In June, OpenAI faced another setback when the court ordered the company to keep a wide range of ChatGPT user data for the lawsuit, including chats users may have already deleted.

Months later, in October, the dispute resurfaced when the court flagged OpenAI’s October 20 filing (ECF 679), which challenged the production of the 20 million log sample, and ordered both sides to submit clarifications on why they disagree.

At the time, the judge pressed the parties to explain how the fight related to earlier concerns over deleted logs and whether OpenAI had backed away from prior agreements on what it would turn over.

News · Law and Order · 3 min read · Vince Dioquino · Nov 13, 2025

Late last month, OpenAI filed a formal objection asking the district judge to overturn the magistrate judge’s discovery order.

The company argued that the ruling was “clearly erroneous” and “disproportionate,” in that it would force the company to disclose millions of private user conversations, according to a court document shared with Decrypt by an OpenAI representative.

The dispute arises as part of a broader offensive against AI labs, with authors, news organizations, music publishers, and code repositories seeking to test how far existing copyright law extends when models ingest and reproduce protected material.

Courts across the U.S. and Europe are now sorting through similar claims.
