How Did Amazon Detect So Much Child Sexual Abuse Material In Its AI Training Data?

Amazon detected hundreds of thousands of pieces of suspected child sexual abuse material in its AI training data, raising concerns over data sourcing and safety.

Amazon detected hundreds of thousands of pieces of suspected child sexual abuse material. (Photo: X)

Published By: Sofia Babu Chacko
Published: January 30, 2026 04:24:01 IST

Amazon.com Inc. revealed that last year it detected hundreds of thousands of pieces of content in its AI training data that appeared to contain child sexual abuse material (CSAM). 

While the company removed the material before using it to train AI models, child safety officials say Amazon has not shared enough information about where the content came from, making it harder for law enforcement to protect victims and track down offenders.

How Amazon Found the Material

Amazon uses an automated scanning tool that compares digital fingerprints of content against a database of fingerprints from known CSAM, a process called hashing.
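
For readers curious about how hash matching works in principle, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Amazon's tool: the article does not say which hashing scheme the company uses, and the function names, file names, and placeholder data below are hypothetical. The sketch uses exact SHA-256 matching, which only catches byte-identical copies. Production systems often rely on perceptual hashes that tolerate resizing or re-encoding, which this example does not attempt.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 fingerprint of a blob as a hex string."""
    return hashlib.sha256(data).hexdigest()


def filter_training_items(items, known_hashes):
    """Split (name, bytes) pairs into kept items and flagged names.

    An item is flagged when its fingerprint appears in known_hashes,
    a set of fingerprints supplied by a trusted source (hypothetical here).
    """
    kept, flagged = [], []
    for name, data in items:
        if sha256_hex(data) in known_hashes:
            flagged.append(name)  # withhold from training and report
        else:
            kept.append((name, data))
    return kept, flagged


# Harmless placeholder data standing in for a real hash database.
known_hashes = {sha256_hex(b"placeholder-known-item")}
items = [
    ("a.bin", b"ordinary content"),
    ("b.bin", b"placeholder-known-item"),
]

kept, flagged = filter_training_items(items, known_hashes)
print("kept:", [name for name, _ in kept])
print("flagged:", flagged)
```

Because the scanner compares fingerprints instead of the files themselves, the operator does not need to hold copies of the known material, only its hashes.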

According to the company, nearly all the reports came from non-proprietary training data obtained from external sources, like publicly available web content. 

The company also admitted it tends to over-report potential CSAM to avoid missing anything, which can lead to a high number of false positives.

A Dramatic Increase in Reports

The number of AI-related reports from Amazon jumped dramatically in 2025. The company accounted for most of the more than 1 million AI-related CSAM reports submitted to the National Center for Missing and Exploited Children (NCMEC) that year, compared with just 67,000 reports from the rest of the tech industry the year before.

Experts say this surge is an outlier, raising concerns about the source of the material and the safeguards in place during AI training.

Challenges for Law Enforcement

While Amazon is required to report suspected CSAM to NCMEC, the company has provided very little detail on where the content came from or who shared it, limiting the ability of authorities to remove the material or investigate offenders. NCMEC officials said that without these details, the reports are often “inactionable,” according to Bloomberg News.

AI Development and the Risks of Fast Data Collection

The spike in reports comes amid a fast-paced AI race, where companies are rapidly gathering large amounts of data to improve their models.

Experts warn that this speed increases the risk that exploitative material can enter AI training pipelines, and training AI on illegal content could unintentionally teach models to manipulate or sexualize images of children.

Amazon’s Response

Amazon said it is committed to preventing CSAM across all its businesses. A spokesperson emphasized that none of the flagged material was AI-generated, and the company’s AI models have not produced any CSAM. They also highlighted that Amazon’s tools scan training data carefully and remove known illegal content before it is used.

Industry Perspective

Other tech companies, including Google, OpenAI, Meta, and Anthropic, also scan AI training data for CSAM. But according to NCMEC, Amazon files far more reports than its peers while providing much less information about the source of the material. Experts say this underscores the need for greater transparency and stronger safeguards in AI development.

Calls for Greater Transparency

Experts like David Thiel, former technologist at the Stanford Internet Observatory, say companies should be more open about where their AI training data comes from and how it is cleaned. Without transparency, there is always a risk that illegal material slips through, and children remain at risk of exploitation.

The discovery of hundreds of thousands of CSAM instances in Amazon’s AI training data highlights the challenges of developing AI responsibly.

While Amazon has systems in place to scan and remove illegal content, experts say more transparency, oversight, and safety measures are urgently needed to protect children and prevent AI from being trained on exploitative material.


