US-based AI firm Anthropic has reached a significant settlement with authors whose books were used without permission to train its AI models. Announced in late September, the agreement requires Anthropic to pay $1.5 billion to authors whose works were illegally downloaded from pirated databases such as LibGen and Pirate Library Mirror (PiLiMi).
Anthropic, developer of the Claude series of large language models (LLMs), is alleged to have used around 7 million books in training. However, payments will go only to the authors of roughly half a million of those works. To qualify, a work must meet certain criteria, including specific copyright registrations and legitimate identifiers such as ISBN or ASIN numbers.
Eligible authors can expect roughly $3,000 per qualifying title and can check their works against an approved list to file claims. Unclaimed funds will be redistributed to claimants pro rata or, if the amount is small, may be donated to a relevant non-profit organization. The Authors Guild, which played a key role in developing the claims process, is a plaintiff in a prominent copyright infringement case against OpenAI, and similar lawsuits against other AI firms are emerging.
Dylan Ruediger of Ithaka S+R suggests that other AI companies have likely also trained on pirated content. Although publishers have begun striking licensing deals with AI firms, the push to compensate authors when their works are used continues, given the uncertain legal landscape surrounding LLMs. Ruediger notes that the size of the settlement underscores a shared interest in formalizing usage agreements.
Under the terms of the settlement, Anthropic must destroy all copies of the infringing books and has committed to acquiring works legally in the future; any further violations could trigger comparable legal action. Chemistry World has contacted Anthropic for additional comment.