AI Search Engines Favour Less Popular Sources, Research Reveals

TL;DR: Research from Ruhr University Bochum and the Max Planck Institute for Software Systems finds that AI-powered search engines cite significantly less popular websites than traditional search results do. Over half (53%) of Google AI Overview sources don’t appear in the top 10 organic results for the same query, and 40% fall outside the top 100 entirely.

Generative AI search engines are fundamentally changing which websites receive visibility in search results, according to new research comparing traditional Google search links with AI-powered alternatives including Google AI Overviews, Gemini search, and GPT-4o’s web search modes.

Research Methodology and Scope

Researchers from Ruhr University in Bochum, Germany, and the Max Planck Institute for Software Systems analysed search results across multiple AI platforms and traditional Google search. The study, “Characterizing Web Search in The Age of Generative AI,” drew test queries from diverse sources, including real user prompts to ChatGPT collected in the WildChat dataset, political topics listed on AllSides, and Amazon’s 100 most-searched products.

The research compared source popularity using Tranco domain rankings, examining whether cited websites appeared in traditional search results and measuring their overall web popularity.
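As a rough illustration of this kind of popularity comparison, the sketch below looks up each cited URL in a Tranco-style ranking (rank 1 is the most popular domain) and reports the median rank and the share of sources outside the top k. It is not the study’s code: the `TRANCO_RANKS` excerpt, the example URLs, and the helper names are all hypothetical, and treating unlisted domains as ranked below every listed one is an assumption made here for simplicity.

```python
from statistics import median
from urllib.parse import urlparse

# Hypothetical excerpt of a Tranco-style list: domain -> rank (1 = most popular).
TRANCO_RANKS = {
    "wikipedia.org": 10,
    "amazon.com": 20,
    "example-news.com": 850,
    "niche-blog.example": 250_000,
}

# Assumption: a domain missing from the list is less popular than any listed one.
UNRANKED = float("inf")


def domain_of(url):
    """Naive host extraction; a real pipeline would map hosts to registrable domains."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def rank_of(url, ranks=TRANCO_RANKS):
    """Return the popularity rank of the URL's domain, or UNRANKED if unlisted."""
    return ranks.get(domain_of(url), UNRANKED)


def median_rank(urls, ranks=TRANCO_RANKS):
    """Median rank of the cited sources (a higher number means less popular)."""
    return median(rank_of(u, ranks) for u in urls)


def share_outside_top(urls, k, ranks=TRANCO_RANKS):
    """Fraction of cited sources whose domain ranks below the top k."""
    found = [rank_of(u, ranks) for u in urls]
    return sum(r > k for r in found) / len(found)


# Hypothetical citations from an AI engine and from a traditional results page.
ai_sources = [
    "https://www.wikipedia.org/wiki/X",
    "https://niche-blog.example/post",
    "https://unknown-site.example/a",
]
organic_sources = [
    "https://www.amazon.com/dp/1",
    "https://wikipedia.org/",
    "https://example-news.com/story",
]

print(median_rank(ai_sources), share_outside_top(ai_sources, 1000))
print(median_rank(organic_sources), share_outside_top(organic_sources, 1000))
```

With real data, the dictionary would be loaded from a downloaded Tranco list, and the two printed lines would be compared per query rather than for a single hand-picked example.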

Key Findings on Source Selection

Domain Popularity Patterns:

  • AI search engines cite sources that rank lower in Tranco (i.e. are less popular) than traditional top 10 results
  • Their sources are more likely to fall outside both the top 1,000 and the top 1,000,000 domains tracked by Tranco
  • Google Gemini search showed the strongest tendency toward unpopular domains, with its median source falling outside Tranco’s top 1,000

Overlap with Traditional Results:

  • 53% of Google AI Overview sources don’t appear in the top 10 Google organic results for the same query
  • 40% of AI Overview sources fall outside the top 100 traditional Google links entirely
  • Similar patterns observed across GPT-4o search implementations
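
An overlap figure of this kind can be sketched as the share of AI-cited sources that are missing from the top k organic results for the same query. The snippet below is a minimal illustration under stated assumptions: the function names and URLs are hypothetical, and matching on normalised host plus path is a choice made here, since the article does not say how the study matched sources.

```python
from urllib.parse import urlparse


def normalise(url):
    """Reduce a URL to host + path so trivial differences don't hide a match."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def share_missing_from_top(ai_sources, organic_results, k):
    """Fraction of AI-cited sources absent from the top-k organic results."""
    top_k = {normalise(u) for u in organic_results[:k]}
    cited = [normalise(u) for u in ai_sources]
    if not cited:
        return 0.0
    return sum(c not in top_k for c in cited) / len(cited)


# Hypothetical ranked organic results and AI citations for one query.
organic = [
    "https://a.example/1",
    "https://b.example/2",
    "https://c.example/3",
]
cited = [
    "https://www.a.example/1/",  # same page as the first organic result
    "https://d.example/x",       # not in the organic list at all
]

print(share_missing_from_top(cited, organic, k=3))
```

Averaging this value over many queries, once with k=10 and once with k=100, would give figures comparable in spirit to the 53% and 40% reported above.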

Source Type Characteristics

AI-powered search engines demonstrated distinct preferences in source selection:

Preferred Sources:

  • Corporate entities and official organisational websites
  • Encyclopaedic resources and knowledge bases
  • Authoritative domain-specific publications

Avoided Sources:

  • Social media platforms (almost never cited by GPT-based searches)
  • User-generated content sites
  • Forum discussions and community platforms

Content Coverage and Synthesis

An LLM-based analysis found that AI search results cover a similar number of identifiable “concepts” to the traditional top 10 links, suggesting comparable detail and diversity. However, the researchers identified significant differences in how the information is presented:

Compression Effects:

  • AI engines “compress information, sometimes omitting secondary or ambiguous aspects”
  • Traditional search “provides better coverage” for ambiguous terms
  • The gap was particularly noticeable for names shared by different people

Pre-trained Knowledge Integration:

  • AI engines combine cited web sources with internal training data
  • GPT-4o with Search Tool frequently provides direct responses without web citations
  • This approach is an advantage for established knowledge but a limitation for timely information

For queries taken from Google’s September 15 list of trending searches, GPT-4o with Search Tool frequently responded with “could you please provide more information” rather than searching for current data, highlighting how difficult it is for such systems to decide when a web search is necessary.

Looking Forward

The research team emphasised their findings don’t determine whether AI search is objectively “better” or “worse” than traditional methods. Instead, they call for “new evaluation methods that jointly consider source diversity, conceptual coverage, and synthesis behaviour in generative search systems.”

This shift in source selection has significant implications for website visibility, SEO strategy, and the broader information ecosystem. Businesses relying on traditional search rankings may need to reconsider their content strategies as AI-powered search gains adoption.

The research raises important questions about information authority, source credibility verification, and whether citing less popular sources improves or degrades search quality for users.
