Background
This multidistrict litigation consolidates copyright claims by named authors and rightsholder entities, including The Authors Guild, against OpenAI and Microsoft. The consolidated complaint alleges that defendants copied plaintiffs’ books and fed them into ‘large language models’ designed to generate human-like text responses, and that these systems sit at the centre of defendants’ commercial products. It pleads that the models were trained by reproducing a large corpus of copyrighted material, and it frames that training as copying expression so that the models can ‘memorize, mimic, and paraphrase’ it. (#183)
AI interaction
The pleaded AI pathway is the training and deployment of the GPT family of large language models that power ChatGPT and related services. The complaint alleges training on ‘hundreds of thousands of books’ and links that process to the models’ capacity to generate text calibrated to mimic human writing. It makes specific provenance allegations about the book corpora described as ‘Books1’ and ‘Books2’, stating that OpenAI has admitted those datasets were sourced and downloaded from ‘LibGen’ and that OpenAI admitted using books downloaded from LibGen to train its models. The complaint also pleads output behaviour as evidence: ChatGPT ‘generally responds’ to requests for quotations with ‘I can’t provide verbatim excerpts from copyrighted texts’, and the complaint attributes this behaviour to ChatGPT being ‘restrained’ by programmers rather than to a loss of underlying capability. An Opinion and Order in the MDL discovery record treats training data handling itself as an evidence object. It states that it is ‘undisputed’ that OpenAI deleted the Books1 and Books2 datasets in 2022 and that OpenAI asserted at the time that the datasets were deleted due to ‘non-use’; the order then addresses whether communications about the deletion must be produced and how privilege claims apply. (#183, #846)
Notes
On 27 January 2026, Judge Stein dismissed OpenAI’s Rule 72(a) objection ‘as moot’ in light of ‘the parties’ resolution of this dispute’. (#1184)