SeriesFusion
Science, curated & edited by AI

Millions of websites are now just AI agents talking to other AI agents, and this machine-made content is already dominating search results.

Empirical evidence from Common Crawl and Bing indicates that the dead internet theory describes a measurable trend. LLM-dominant websites are spreading rapidly and often outrank human-written content in major search engines. This marks a shift in how information is created and consumed on the open web: the internet is moving from a platform for human expression toward a large feedback loop of synthetic data. Practitioners should plan for the likelihood that most training data for future models will have been generated by earlier models.

Original Paper

DeGenTWeb: A First Look at LLM-dominant Websites

Sichang Steven He, Calvin Ardi, Ramesh Govindan, Harsha V. Madhyastha

arXiv  ·  2605.00087

Many recent news reports have claimed that content generated by large language models (LLMs) is taking over the web. However, these claims are typically not based on a representative sample of the web and the methodology underlying them is often opaque. Moreover, when aiming to minimize the chances of falsely attributing human-authored content to LLMs, we find that detectors of LLM-generated text perform much worse than advertised. Consequently, we lack an understanding of the true prevalence an