Date: 2023-07-30

The list of authors and artists asking for compensation from AI companies that scrape the internet for content is growing bigger by the day.

Prolific writer James Patterson, celebrated author Margaret Atwood, comedian Sarah Silverman, and the social media site Reddit have all pushed back against AI firms in recent weeks.

Of greatest concern to creatives is “generative AI,” which powers chatbots and AI image generators and uses online data to answer questions, write short essays and stories, and create images.

The large language models behind generative AI are trained on material gathered from the internet and mimic what they find. More and more creators now say that mimicry crosses the line into copyright infringement.

One of the first shots was fired in January, when a group of artists sued AI companies including Midjourney and DeviantArt, whose tools create still images, for using their work without permission to generate artistic images.

In mid-April, Twitter and Reddit announced they would restrict third-party access to site data.

Reddit co-founder and CEO Steve Huffman told the New York Times it was necessary because large language models were tapping user-generated content without returning any value to the social network.

“More than any other place on the internet, Reddit is a home for authentic conversation,” he said. “There’s a lot of stuff on the site that you’d only ever say in therapy, or AA, or never at all … But we don’t need to give all of that value to some of the largest companies in the world for free.”

In a lawsuit filed earlier this month, Silverman and authors Christopher Golden and Richard Kadrey said ChatGPT and AI created by Facebook parent company Meta were able to summarize their work, including the comedian’s memoir “The Bedwetter,” in such intimate detail that they must have accessed pirated copies on so-called “shadow libraries” filled with copyrighted material.

“The defendants, by and through the use of ChatGPT, benefit commercially and profit richly from the use of plaintiffs’ and Class members' copyrighted material,” the suit says.

The suit — filed in the United States District Court for the Northern District of California in San Francisco — says that the plaintiffs did not consent to the use of their work in the data AI companies use to train large language models.

The case is strikingly similar to a suit filed in early July in the same district court by two lesser-known authors who claim content generated by ChatGPT too closely resembles their work.

Two weeks ago, thousands of writers, including Patterson and Atwood, endorsed an open letter to OpenAI, the company behind ChatGPT, asking for compensation when their work is used in AI-generated content.

Legal action against AI companies faces an uphill battle. Earlier this month, a San Francisco judge said he would most likely dismiss the artists' suit against Midjourney and DeviantArt.

“I don't think the claim regarding output images is plausible at the moment because there's no substantial similarity” between the images generated by those programs and the artists’ work, Federal Judge William Orrick said, according to Reuters.

And a lawsuit accusing Google of violating copyright by using snippets of books from online databases at university libraries was dismissed by a federal judge in 2013; the Supreme Court declined to hear an appeal in 2016.

OpenAI and other tech firms operating generative AI programs have released little information on the data they use to train their language models, which Silverman’s team considers a de facto admission that the companies illegally used a copy of her book.

“As far as we know, the other side hasn’t denied it,” Joseph Saveri, one of Silverman’s lawyers, told the Associated Press. “They don’t have an alternative explanation for this.”

It remains to be seen whether the plaintiffs in the authors’ cases can prove large language models used their work and not the untold numbers of publicly accessible user reviews.

Midjourney and DeviantArt have asked that the artists' suit be dismissed.

OpenAI and Meta have not yet responded to the suits.