Merriam-Webster, a leading English-language dictionary publisher, and its parent company Encyclopedia Britannica have filed a lawsuit against OpenAI, the creator of the AI chatbot ChatGPT. The lawsuit accuses OpenAI of illegally using copyrighted material to train its AI models, effectively “free-riding” on the publishers’ intellectual property.
Core Allegations: Unauthorised Copying and Output Reproduction
The complaint centres on the claim that OpenAI scraped over 100,000 articles, encyclopedia entries, and dictionary definitions from online sources without permission. According to the plaintiffs, this data was then used to train ChatGPT, enabling it to generate responses that directly replicate or closely mimic the original copyrighted content.
According to the lawsuit, OpenAI violates copyright in three critical ways:
1. Massive-scale copying of protected materials.
2. Using this content for AI training.
3. Generating outputs that are too similar to the original text.
Traffic Diversion and AI Hallucinations
Merriam-Webster argues that ChatGPT’s ability to summarise dictionary definitions and other content diverts traffic from its own website, depriving the publisher of revenue. The lawsuit also claims that ChatGPT sometimes produces “AI hallucinations” (fabricated responses generated when the AI lacks sufficient information) that draw on the dictionary’s data in a misleading way.
The complaint further asserts that ChatGPT frequently presents incomplete or inaccurate explanations by selectively omitting portions of the dictionary’s content, misleading users in the process.
Legal Demands and Implications
The plaintiffs are seeking financial compensation for the alleged copyright infringement and a permanent injunction to prevent OpenAI from continuing these practices.
The case is significant because it tests the boundaries of fair use in AI training. If successful, the lawsuit could establish a precedent that forces AI developers to obtain explicit permission before using copyrighted materials in their models, potentially reshaping the future of AI development. OpenAI has yet to respond to the lawsuit.
This legal clash highlights the growing tension between intellectual property rights and the rapid advancement of AI technologies. The outcome could set an important standard for how copyrighted materials may be used to train large language models.