Copyright Winter is Coming (to Wikipedia?)

This is a guest post by Matthew Sag, Jonas Robitscher Professor of Law in Artificial Intelligence, Machine Learning, and Data Science at Emory University Law School. It was originally posted here and addresses Judge Stein’s order denying OpenAI’s motion to dismiss in Authors Guild v. OpenAI, Inc., No. 25-MD-3143(SHS) (OTW) (S.D.N.Y. Oct. 27, 2025).

A white walker reading Wikipedia in a desolate field (an AI image by Gemini)

A new decision in Authors Guild v. OpenAI has major implications for copyright law, well beyond artificial intelligence. On October 27, 2025, Judge Sidney Stein of the Southern District of New York rejected OpenAI’s motion to dismiss claims that ChatGPT output infringed the rights of authors such as George R.R. Martin and David Baldacci. The opinion suggests that brief summaries of popular works of fiction are very likely infringing (unless fair use comes to the rescue).

This is a fundamental attack on the idea-expression distinction as applied to works of fiction. It places thousands of Wikipedia entries in the copyright crosshairs and suggests that any type of summary or analysis of a fictional work is presumptively infringing.

In Penguin Random House LLC v. Colting, the Southern District of New York found that the defendants’ “KinderGuides” series, which condensed classic works of literature into children’s books, infringed the copyrights in the original works despite being marketed as educational tools for parents to introduce young children to literature.

Every year, I ask students in my copyright class why the children’s versions of classic novels in Colting were found to be infringing, while Wikipedia summaries of the plots of those same books probably would not be. The recent ruling in the consolidated copyright cases against OpenAI means I may have to rethink that question.

On October 27, 2025, Judge Stein of the Southern District of New York rejected OpenAI’s motion to dismiss output-based copyright infringement claims brought by David Baldacci, George R.R. Martin, and a class of other authors.

OpenAI had argued mainly that the authors’ complaint failed to allege substantial similarity between any of their works and any of ChatGPT’s output. It is standard practice in a copyright lawsuit to attach a copy of the plaintiff’s work and the allegedly infringing work, but the court held that “the outputs submitted by plaintiffs in opposition to OpenAI’s motion were incorporated into the consolidated class action complaint by reference” and that it was sufficient that the complaint repeatedly made “clear, definite and substantial references” to the outputs. Losing that civil procedure skirmish was probably a bad sign for OpenAI: like the ominous prologue in A Game of Thrones, it suggests that copyright winter is coming.

Judge Stein then evaluated one of the more detailed ChatGPT-generated summaries relating to George R.R. Martin’s 694-page novel A Game of Thrones, which eventually became the famous HBO series of the same name. Although this was only a motion to dismiss, where the deck is stacked against the defendant, I was surprised at how easily the judge concluded:

“A more discerning observer might easily conclude that this detailed summary is substantially similar to Martin’s original work, as the summary conveys the overall tone and feel of the original work while recapitulating the plot, characters, and themes of the original.”

The judge described the ChatGPT summary as follows:

“There have certainly been attempts to abridge or condense some of the central copyrightable elements of the original works, such as the setting, plot, and characters”

He saw them as:

“Conceptually similar—though admittedly less detailed—to the plot summaries in Twin Peaks and Penguin Random House LLC v. Colting, where the district court found that works summarizing in detail the plot, characters, and themes of the original works were substantially similar to the original works.” (emphasis added).

To say that the GPT summary of A Game of Thrones, at fewer than 580 words, is “less detailed” than the 128-page Welcome to Twin Peaks guide at issue in the Twin Peaks case, or the various children’s books based on famous works of literature at issue in Colting, is something of an understatement.

To see why the latest OpenAI decision is so surprising, it helps to compare the ChatGPT summary of A Game of Thrones to its counterpart, the Wikipedia plot summary. I read them both so you don’t have to.

The ChatGPT summary of A Game of Thrones is approximately 580 words long and captures the essential narrative of the novel. It covers all three major storylines: the political intrigue in King’s Landing that culminates in Ned Stark’s execution (spoiler alert), Jon Snow’s journey with the Night’s Watch at the Wall, and Daenerys Targaryen’s transformation from frightened bride (more on this shortly) into the Mother of Dragons across the Narrow Sea. In this respect, it is a lot like the 800-word Wikipedia plot summary. Each summary presents the central conflict between the Starks and the Lannisters, the revelation of Cersei and Jaime’s incestuous relationship, and the major plot points that set the larger series in motion.

I could say more about their similarities, but I’m worried that if I explored the summaries in more detail, the Authors Guild might think I was infringing George R.R. Martin’s copyright as well, so I’ll move on to the differences.

The main difference between the Wikipedia summary and the GPT summary is structural. The Wikipedia summary takes a geographical approach, dividing the narrative into three distinct sections based on location: “In the Seven Kingdoms,” “On the Wall,” and “Across the Narrow Sea.” This structure mirrors the way the novel follows different characters in different locations, to the point that you begin to wonder whether these characters will ever meet. In contrast, the GPT summary follows a more analytical structure, starting with contextual information about the setting and the series as a whole, then moving through sections that follow a roughly chronological progression through the major plot points.

There are some minor differences. The Wikipedia summary provides more detailed plot descriptions and clear causal chains between events. For example, it tells how Tyrion’s arrest by Catelyn leads to Tywin’s retaliatory raid on the Riverlands, resulting in Robb’s need for a strategic alliance with House Frey to secure a vital bridge crossing. The Wikipedia summary also includes more secondary characters and subplots, such as Tyrion recruiting Bronn as his champion in a trial by battle, and Jon protecting Samwell Tarly.

The Wikipedia summary probably assumes greater familiarity with the fantasy genre, while the GPT summary may be more useful to the uninitiated. The GPT summary explains the significance of the long summer and impending winter and clearly sets out the major themes of the novel.

However, in broad strokes, there is very little daylight between these two summaries. They are remarkably similar in what they include and what they exclude. Most notably, both summaries sanitize Daenerys’s story by removing the sexual violence that is fundamental to her character arc. This is particularly striking because sexual violence is central to Martin’s narrative and, in several places, to the stories of many of the main characters.

I don’t see how the ChatGPT summary could infringe the copyright in George R.R. Martin’s novel if the Wikipedia summary does not. That is a scary prospect indeed, but I don’t think either summary is infringing.

It is absolutely true that you can infringe copyright in a novel by simply borrowing some of the main characters, plot points, and settings and creating a sequel or prequel. In copyright, we call this a derivative work. But just because sequels and children’s editions of novels are often infringing does not mean that a dry and concise analytical summary of a novel is infringing.

Why not? It’s really the act of taking those key structural elements, the skeleton of the novel if you like, and adding new elements to them to create a new fully realized work that makes an unauthorized sequel infringing.

Judge Stein’s order does not resolve the authors’ claims, not by a long way. And he was careful to point out that he was only considering the plausibility of the infringement claim, not any potential fair use defense. Still, I think it is a troubling decision that significantly lowers the bar for substantial similarity.

The fact that “[w]hen prompted, ChatGPT can produce accurate summaries of books written by plaintiff and outline possible sequels to plaintiff’s books” falls far short of demonstrating that the ordinary observer would consider such output to resemble a fully realized novel.


