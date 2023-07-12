A sweeping class action lawsuit against Google was filed Tuesday alleging that the company scraped millions of people’s personal information and copyrighted material to train its AI tools, including its Bard chatbot. The scraped data includes information gathered on children of all ages, the suit alleges.

Clarkson Law, a public interest legal firm, filed the federal class action lawsuit in the Northern District of California against Google, its parent company Alphabet, and Google’s AI subsidiary DeepMind.

Clarkson filed a similar lawsuit against OpenAI last month over its chatbot, ChatGPT. The suits are a first volley of objections to the use of personal data from the internet to train lucrative AI tools. Vast quantities of data generated by people are used to train chatbots from Bard to ChatGPT, and that data is what gives the bots their human-like conversational abilities (mistakes included).

“We have only recently learned that Google has been taking everything ever created or shared online by millions of internet users, including all our personal information, creative works, and professional works, and using all of that data to train and build commercial AI Products,” said Ryan Clarkson, a managing partner at Clarkson, in a statement provided to The Messenger.

“Google harvested this data in secret for years, without providing notice to anyone, much less with anyone’s consent,” Clarkson added. “Google does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online.”

The suit centers on a recent change in Google’s privacy policy, which made explicit that the company uses public information to train its AI models and tools. “We’ve been clear for years that we use data from public sources, like information published to the open web and public datasets, to train the AI models behind services like Google Translate, responsibly and in line with our AI Principles,” Google General Counsel Halimah DeLaine Prado said in a statement to The Messenger. “American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims.”

The lawsuit alleges Google’s AI “uses stolen private information, including personally identifiable information, from hundreds of millions of internet users,” and alleges that Google continues to unlawfully “collect and feed additional personal data from millions of unsuspecting consumers worldwide, far in excess of any reasonably authorized use, in order to continue developing and training” its AI products.

The lawsuit goes on to allege that such actions violate state and federal privacy, property, and consumer protection laws.

“All of the stolen information belonged to real people who shared it online for specific purposes, not one of which was to train large language models to profit Google while putting the world at peril with untested and volatile AI products,” Timothy K. Giordano, a partner at Clarkson Law, said in a statement provided to The Messenger. “‘Publicly available’ has never meant free to use for any purpose.”

In the lawsuit, Clarkson seeks two remedies on behalf of its clients. The first is a temporary freeze on further commercial use of Google’s AI tech until guardrails can be enacted. The second is financial compensation, or “data dividends,” for each person whose data was allegedly “commercially misappropriated” to train Google’s AI tools.