Ranked: The Most Cited Websites by AI Models

This was originally posted on our Voronoi app. Download the app for free on iOS or Android and discover incredible data‑driven charts from a variety of trusted sources.

Key Takeaways

  • According to an analysis by Semrush, LLMs like ChatGPT reference Reddit and Wikipedia the most for facts
  • For geographical data, LLMs frequently cite Mapbox and OpenStreetMap

Where do large language models (LLMs) like ChatGPT go to source factual information?

In this infographic, we rank the websites most cited by AI models, based on a June 2025 analysis of more than 150,000 LLM citations. The ranking reveals how heavily chatbots rely on user-generated content, raising questions about the blind spots of today’s top AI tools.

Data & Discussion

The data for this visualization comes from Semrush. It shows how frequently AI models cite different domains when providing information, as of June 2025.

Rank  Domain              Citation frequency
1     reddit.com          40.1%
2     wikipedia.org       26.3%
3     youtube.com         23.5%
4     google.com          23.3%
5     yelp.com            21.0%
6     facebook.com        20.0%
7     amazon.com          18.7%
8     tripadvisor.com     12.5%
9     mapbox.com          11.3%
10    openstreetmap.com   11.3%
11    instagram.com       10.9%
12    mapquest.com        9.8%
13    walmart.com         9.3%
14    ebay.com            7.7%
15    linkedin.com        5.9%
16    quora.com           4.6%
17    homedepot.com       4.6%
18    yahoo.com           4.4%
19    target.com          4.3%
20    pinterest.com       4.2%
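
The percentages above add up to well over 100%, which suggests that a single AI answer can cite several domains at once. Semrush's exact methodology is not described here, so the following is only a rough sketch of one way such a figure could be computed, assuming "citation frequency" means the share of responses that cite a domain at least once. The function names and the sample URLs are hypothetical and are not taken from the Semrush analysis.

```python
from collections import Counter
from urllib.parse import urlparse


def site_of(url: str) -> str:
    """Reduce a URL to its last two host labels, e.g. en.wikipedia.org -> wikipedia.org.

    Simplification (assumption): this mishandles suffixes such as .co.uk.
    """
    host = urlparse(url).netloc.lower()
    return ".".join(host.split(".")[-2:])


def citation_frequency(responses: list[list[str]]) -> dict[str, float]:
    """Percentage of responses that cite each site at least once (assumed definition)."""
    if not responses:
        return {}
    counts = Counter()
    for cited_urls in responses:
        # A set makes each site count once per response, however often it is cited.
        counts.update({site_of(u) for u in cited_urls})
    total = len(responses)
    return {site: round(100 * n / total, 1) for site, n in counts.most_common()}


# Hypothetical sample: each inner list holds the URLs cited in one LLM response.
sample = [
    ["https://www.reddit.com/r/a", "https://en.wikipedia.org/wiki/X"],
    ["https://www.reddit.com/r/b", "https://www.reddit.com/r/c"],
    ["https://www.youtube.com/watch?v=1"],
    ["https://old.reddit.com/r/d", "https://en.wikipedia.org/wiki/Y"],
]

for site, pct in citation_frequency(sample).items():
    print(f"{site}: {pct}%")
```

In this toy sample, three of the four responses cite Reddit, two cite Wikipedia, and one cites YouTube, so the shares come to 75%, 50%, and 25%. They total 150% because responses that cite more than one site are counted once for each site.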

Risks of Relying on User‑Generated Content

Reddit leads the list with a citation frequency of 40.1%, followed by Wikipedia at 26.3%. This highlights how often LLMs lean on open-forum discussions and community-maintained content.

These domains offer a wealth of user-generated knowledge, but the fact that almost anyone can post or edit on them raises concerns about accuracy and bias. The heavy reliance on them suggests that AI may amplify whichever narratives are most visible or most widely discussed, even when they have not been verified.

For example, users have reported that ChatGPT suggested they purify their water with bleach, or even mix bleach with vinegar, a combination that produces poisonous chlorine gas.

We summarize three major risks of relying on user-generated content below:

  • Misinformation and rumor propagation: Since content isn’t always moderated by domain experts, AI can inadvertently repeat incorrect or biased statements.
  • Echo-chamber amplification: Popular yet unverified narratives may get repeated if they gain traction, masking less visible but more accurate sources.
  • Lack of authority: Especially for consequential topics (health, law, finance), user‑generated sites lack the editorial oversight required for reliable guidance.

Learn More on the Voronoi App

If you enjoyed today’s post, check out How 21 Countries View Artificial Intelligence on Voronoi, the new app from Visual Capitalist.