In brief
- By mid-2025, 35% of newly published websites were AI-generated or AI-assisted, up from essentially zero before ChatGPT’s November 2022 launch.
- The confirmed effects are semantic contraction and synthetic positivity, not misinformation or stylistic homogeneity, despite what most people believe.
- At 35% AI prevalence, model collapse risk shifts from a theoretical concern to an empirical one for the next generation of foundation models.
A new study has a number for how much of the internet is now AI-generated: 35%. That is the share of newly published websites classified as AI-generated or AI-assisted by mid-2025, according to research from Stanford University, Imperial College London, and the Internet Archive. The figure was essentially zero before ChatGPT launched in November 2022.
“I find the sheer speed of the AI takeover of the web quite staggering,” Jonáš Doležal, researcher at Imperial College London and co-author of the paper, told 404 Media. “After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years.”
The study, titled “The Impact of AI-Generated Text on the Internet,” drew on 33 months of website snapshots from the Internet Archive’s Wayback Machine and used an AI text detector called Pangram v3 to classify each page.
The confirmed harms: vibes, not facts
Researchers tested six hypotheses about what AI content does to the web. Only two held up under data scrutiny.
The first: We’re turning into a horde of dumb NPCs acting in the same way… Or more scientifically put, the web is becoming less semantically diverse.
AI-generated sites showed pairwise semantic similarity scores 33% higher than human-written ones. The same ideas keep getting expressed in nearly the same ways.
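To make the metric concrete, here is a minimal sketch of how a pairwise semantic similarity score could be computed and compared between two sets of pages. It is an illustration only, not the paper's pipeline: the tiny two-dimensional vectors are made-up stand-ins for real text embeddings, cosine similarity is an assumed choice of measure, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over every unordered pair of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def relative_increase(value, baseline):
    """Percentage by which `value` exceeds `baseline`."""
    return (value - baseline) / baseline * 100.0


# Made-up toy embeddings (assumption): the "human" pages point in varied
# directions, while the "AI" pages cluster around one direction.
human_pages = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ai_pages = [[1.0, 0.1], [1.0, 0.2], [1.0, 0.0]]

human_score = mean_pairwise_similarity(human_pages)
ai_score = mean_pairwise_similarity(ai_pages)
print(f"human: {human_score:.3f}  AI: {ai_score:.3f}")
print(f"AI pages are {relative_increase(ai_score, human_score):.0f}% more similar to each other")
```

A higher average means the pages in a set sit closer together in meaning, which is what "less semantically diverse" refers to. The toy numbers exaggerate the gap and say nothing about the study's actual data.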

The paper suggests the online Overton window may be narrowing, not through censorship or coordinated campaigns, but because language models optimize for outputs close to their training distribution.
The second: The web is getting aggressively cheerful.
AI content showed positive sentiment scores more than 107% higher than human content. Researchers tie this to the well-documented sycophantic tendencies of LLMs: trained on human approval signals, they produce text that feels sanitized, friction-free, and relentlessly upbeat.
An internet flooded with cheerful, homogenized content may marginalize human dissent at scale without anyone pulling a lever.

Despite widespread public belief, the study found no statistically significant evidence that AI content is making the internet less factually accurate. Researchers found no meaningful correlation between AI prevalence and factual error rate.
The stylistic monoculture hypothesis, that AI flattens individual voices into a generic uniform register, was the belief survey respondents held most strongly (83% agreed). The data did not confirm it. Character-level analysis found no statistically significant increase in stylistic homogeneity tied to AI prevalence.
The model collapse problem just got real
The broader stakes go beyond discourse quality. At 35% AI prevalence, the theoretical risk of model collapse, where future models degrade after training on AI-generated data, shifts from academic concern to empirical reality. Future foundation models trained on contemporary web crawls will inevitably ingest data that’s substantially AI-generated and measurably less semantically diverse.
The team is now working with the Internet Archive to turn the study into a continuous, live monitoring tool, tracking AI’s share of the web in real time rather than as a one-off snapshot.
A U.S. survey conducted alongside the study found most Americans already believe all six negative hypotheses, including the ones the data doesn’t support. People who use AI occasionally were 12% more likely to believe in the harms than frequent users. Dead Internet Theory believers, meet the data: The internet isn’t dead, but 35% of what’s new might be zombie content in some way.
