
A U.S. federal judge has ruled that a copyright infringement lawsuit filed by a group of authors against artificial intelligence startup Cohere can proceed. The decision from the Northern District of California marks another key development in the wave of litigation aiming to define the legal boundaries of training generative AI models.
In the ruling, U.S. District Judge William Orrick denied Toronto-based Cohere's motion to dismiss the proposed class-action lawsuit. The authors allege that the company illegally used their copyrighted books to train its large language models (LLMs), which are designed to generate human-like text. The complaint argues that Cohere used a controversial dataset known as "The Pile," which allegedly contains works from illegal "shadow libraries."
This decision is particularly noteworthy because it comes from the same judge overseeing similar cases against other major AI developers. In lawsuits brought by artists and authors against Stability AI, Midjourney, and OpenAI, the court granted motions to dismiss in part, throwing out several claims while allowing others to proceed and trimming the scope of those cases. The result is a complex legal environment in which the specific details of how each AI model is trained are under intense scrutiny.
Whereas those earlier rulings narrowed the plaintiffs' claims, Judge Orrick found the authors' allegations against Cohere plausible enough to move forward to the discovery phase. This step will allow the plaintiffs' legal team to seek more direct evidence about the specific data Cohere used to train its systems. The case is part of a broader legal challenge from creators across industries who claim their intellectual property was misappropriated to build powerful AI tools that could ultimately compete with them.
Cohere is one of OpenAI's primary competitors and is backed by major investors such as Oracle and Nvidia, so the outcome of this lawsuit could have significant financial and operational implications for the company. The ruling underscores the ongoing legal uncertainty facing the entire generative AI industry, as courts grapple with how traditional copyright law applies to cutting-edge technology. The collective results of these lawsuits are expected to set crucial precedents for the future development and deployment of AI.



