The tech giant has pledged to ensure its AI training datasets do not include child sexual abuse material (CSAM).
In collaboration with the nonprofit organizations Thorn and All Tech Is Human, Amazon, Anthropic, Civitai, Google, Meta, Metaphysic, Microsoft, Mistral AI, OpenAI, and Stability AI have drawn up a set of "Safety by Design" principles and committed to following them in an effort to eradicate AI-generated CSAM.
"Just as the internet has accelerated sexual harm against children both offline and online, the misuse of generative AI poses serious threats to child safety, including victimization, victim identification, and the proliferation of abuse," Thorn said in a statement.
"This misuse, and its associated downstream harm, is already occurring within our own communities. But we still have the chance to set generative AI on the right path and ensure that children are protected. We are in a rare moment, a window of opportunity, in which we can shape how this technology is built."
Under the principles, the companies commit to sourcing their training datasets responsibly, avoiding or at least mitigating training data with a known risk of containing CSAM, removing any such material from use, and reporting confirmed CSAM to the relevant authorities.
They will also incorporate feedback loops and iterative stress-testing techniques into their development processes, and use content provenance solutions to determine whether a given piece of content was generated by AI (a simple sketch of one such approach follows). The companies are further committed to protecting their generative AI products and services from abusive content and conduct, and to hosting their models responsibly.
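The article does not say which provenance techniques the companies will use; production systems typically rely on signed metadata standards such as C2PA or on invisible watermarking. As a minimal illustrative sketch only, one simple provenance approach is for a provider to keep a registry of hashes of the content its own models generate and check new content against it (this only catches exact, unmodified copies):

```python
# Minimal sketch of a hash-based provenance registry (illustrative only,
# not any company's actual system). Assumption: the provider records a
# SHA-256 digest of every output at generation time.
import hashlib


class ProvenanceRegistry:
    """Records hashes of generated outputs so they can be recognized later."""

    def __init__(self):
        self._hashes = set()  # hex digests of recorded outputs

    def record(self, content: bytes) -> str:
        """Register one generated output and return its digest."""
        digest = hashlib.sha256(content).hexdigest()
        self._hashes.add(digest)
        return digest

    def was_generated_here(self, content: bytes) -> bool:
        """True only for byte-identical copies of recorded outputs."""
        return hashlib.sha256(content).hexdigest() in self._hashes


# Hypothetical usage: register each output when the model produces it,
# then answer "did our model produce this exact content?" on demand.
registry = ProvenanceRegistry()
registry.record(b"...model output bytes...")
print(registry.was_generated_here(b"...model output bytes..."))  # True
print(registry.was_generated_here(b"unrelated content"))         # False
```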
The companies also say they will invest in research and future technologies to prevent their services from expanding access to harmful tools, and to root out the use of generative AI for online child sexual abuse and exploitation.
"These mitigations complement our existing efforts to prevent AI from creating, disseminating, and facilitating child sexual abuse and exploitation," said Susan Jasper, vice-president of trust and safety solutions at Google.
"We are proud to join our peers in this voluntary effort to make it as difficult as possible for bad actors to abuse generative AI to create content that depicts or describes child sexual abuse."
She said the company uses a combination of hash matching and child safety classifiers to remove CSAM and other exploitative and illegal content from its training datasets.
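As a rough illustration of the hash-matching side of that approach (a minimal sketch, not Google's actual pipeline), a dataset can be screened against a blocklist of hashes of known abusive images supplied by a child-safety organization. The sketch below uses an ordinary SHA-256 digest for simplicity; real systems typically use perceptual hashes such as PhotoDNA so that re-encoded or resized copies still match, and pair hash matching with classifiers that catch previously unseen material.

```python
# Minimal sketch of hash-based filtering of a training dataset (illustrative
# only). Assumption: known-bad hashes arrive as one lowercase SHA-256 hex
# digest per line in a text file.
import hashlib
from pathlib import Path


def load_blocklist(path: str) -> set[str]:
    """Load known-bad hashes, one hex digest per line."""
    return {
        line.strip().lower()
        for line in Path(path).read_text().splitlines()
        if line.strip()
    }


def filter_dataset(image_dir: str, blocklist: set[str]) -> list[Path]:
    """Return only the files whose hashes are NOT on the blocklist."""
    kept = []
    for image_path in Path(image_dir).iterdir():
        if not image_path.is_file():
            continue
        digest = hashlib.sha256(image_path.read_bytes()).hexdigest()
        if digest in blocklist:
            # Matched known abusive content: exclude it from training and
            # hand it off to the reporting/removal process instead.
            continue
        kept.append(image_path)
    return kept


# Hypothetical usage:
# clean_files = filter_dataset("raw_images/", load_blocklist("known_bad_hashes.txt"))
```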
Last December, a study by the Stanford Internet Observatory identified hundreds of known CSAM images in LAION-5B, an open dataset of billions of images used to train popular text-to-image generative AI models. The image URLs were reported to the US National Center for Missing & Exploited Children and the Canadian Centre for Child Protection, and have since been removed.
However, the National Center for Missing & Exploited Children recently said it is struggling to keep up with the growing volume of CSAM reports, which rose by 12% in 2023 to a total of 36.2 million. Of these, around 4,700 involved CSAM or other sexually exploitative content connected to generative AI.
"We need laws and regulations to ensure that GAI technology is not trained on child sexual exploitation content, that it is taught not to create such content, that GAI platforms detect attempts to create it, and that those who use these tools to create such content are held responsible," the Center said.