The digital landscape is undergoing a fundamental transformation as generative artificial intelligence reshapes how information is discovered, processed, and cited. According to newly released data from Muck Rack, journalism and earned media have solidified their position as the primary foundation for citations within Large Language Models (LLMs) like ChatGPT, Claude, and Gemini. The report, titled "What Is AI Reading," provides a comprehensive look at the ecosystem of content that fuels modern AI search engines, revealing that a staggering 99% of links cited by these platforms originate from non-paid media sources. This shift marks a significant departure from traditional search engine optimization (SEO) and signals the rise of Generative Engine Optimization (GEO) as the new frontier for public relations and corporate communications.
The Hierarchy of AI Citations and Source Preferences
The Muck Rack data clarifies that not all AI chatbots are created equal; each platform demonstrates distinct "preferences" for specific types of data repositories. This divergence suggests that the training methodologies and real-time retrieval systems of these AI models are tuned toward different institutional pillars.
ChatGPT, developed by OpenAI, continues to lean heavily on Wikipedia for its foundational knowledge and citations. As a general-purpose tool, its reliance on a community-edited encyclopedia provides a layer of baseline factual consistency. In contrast, Claude, the AI assistant developed by Anthropic, shows a marked tendency to pull information from PubMed Central. This indicates a preference for high-authority scientific and academic literature, positioning Claude as a more research-oriented tool.
Meanwhile, Google’s Gemini has taken a different path, frequently utilizing Reddit as its go-to source for conversational data and real-time public opinion. This alignment follows high-profile licensing agreements between Google and the social media platform, highlighting how corporate partnerships are directly influencing the "personality" and information-sourcing habits of AI agents.
The Dominance of Journalism and Real-Time Information
Perhaps the most critical finding for the media industry is that journalism remains the single largest category contributing to AI information. Accounting for 27% of all content cited by AI platforms, journalistic output outpaces every other form of content creation. The report further notes that among journalism citations with known publication dates, 57% were published within the last 12 months. This underscores the reliance of LLMs on "fresh" data to provide relevant answers to user queries.
This reliance on recent journalism creates a symbiotic, yet tension-filled, relationship between tech giants and news organizations. While AI models depend on the rigorous fact-checking and reporting of journalists to remain accurate, the shift in traffic patterns threatens the traditional ad-supported business models of many publishers. The data suggests that for a brand to appear in an AI-generated answer, it must first secure coverage in a reputable news outlet, effectively making earned media more valuable than it has been in decades.
The Diminishing Returns of Paid Content and Press Releases
In a stark contrast to the success of earned media, the Muck Rack report highlights the near-irrelevance of paid and advertorial content in the AI search ecosystem. Paid content accounts for a mere 0.3% of all citations found on AI platforms. This suggests that the algorithms governing AI responses are increasingly sophisticated at filtering out "pay-to-play" content, prioritizing objective reporting and third-party validation over self-promotional materials.
Equally surprising is the performance of the traditional press release. Once the cornerstone of the PR industry, press releases represent just 1.1% of citations. This data point suggests that while press releases may still serve as a functional tool for alerting journalists to news, they no longer act as a direct-to-consumer or direct-to-AI information source. The AI models appear to prefer the synthesized analysis found in a news article over the raw, often biased data presented in a corporate announcement.
A Chronology of the Search Evolution
To understand the current state of AI search, one must look at the timeline of digital discovery. For twenty years, the industry was dominated by Google's PageRank algorithm, which prioritized backlinks, alongside on-page signals such as keyword density.
- 2010–2020: The "Content is King" era focused on high-volume blog production and SEO-driven keywords to capture organic search traffic.
- 2022 (November): The launch of ChatGPT, built on GPT-3.5, introduced the public to generative responses, shifting the focus from a list of links to a singular, synthesized answer.
- 2023–2024: Major AI developers began signing data-sharing agreements with publishers (e.g., Axel Springer, News Corp) and social platforms (Reddit), formalizing the pipeline of information.
- 2025–2026: The emergence of GEO (Generative Engine Optimization). As traditional search traffic began to dwindle, PR professionals shifted their focus to "mention-based" visibility within AI summaries.
The current May 2026 data reflects the culmination of these trends, where the quality of the source—rather than the quantity of the keywords—determines a brand’s visibility in the AI-driven marketplace.
Regulatory Pressure and the SEC’s New Role
The shift in AI sourcing is occurring alongside a tightening regulatory environment. The Securities and Exchange Commission (SEC) has recently unveiled new rules that may fundamentally change the PR game, particularly regarding how public companies disclose information. These rules require greater transparency and faster reporting of material events, which in turn feeds the AI "beast."
As AI models scrape SEC filings and corporate disclosures, the 24% of AI citations attributed to corporate blogs and official content becomes a critical battleground. The SEC’s focus on preventing "greenwashing" and misleading AI-generated financial summaries means that PR professionals must ensure their corporate content is not only AI-readable but also strictly compliant with new disclosure standards. The intersection of AI scraping and regulatory oversight means that a single discrepancy in a corporate blog could be amplified across multiple AI platforms instantaneously.
Industry Reactions and the Practitioner’s Dilemma
The reaction from the PR community has been a mix of optimism and anxiety. Many practitioners report a "new bounce in their step," as the value of earned media is objectively proven by the Muck Rack data. However, the stakes have never been higher.
"GEO is making earned media hot again, but the media landscape itself is in a state of flux," says one industry analyst. "Because everyone is reading this data, every PR firm is now pitching the same high-authority outlets like The New York Times, The Guardian, or niche leaders like PubMed. The competition for a dwindling number of journalist ‘slots’ is becoming unsustainable."
The Guardian’s editor-in-chief recently weighed in on this phenomenon, suggesting that newsrooms must revisit their past principles of high-integrity reporting to take on the present challenges of AI. The sentiment among editors is that while AI uses their content, the value lies in the human element—the investigative depth that an LLM cannot replicate but desperately needs to cite for credibility.
Strategic Implications for the Future of Communications
The data from Muck Rack serves as a wake-up call for brands that have relied on "spray and pray" distribution methods. To succeed in an AI-dominated search environment, the report suggests several strategic pivots:
- Prioritize High-Authority Earned Media: Since journalism accounts for the lion’s share of citations, securing placement in reputable publications is the most effective way to influence AI responses.
- Focus on Recency and Continuity: With 57% of citations coming from the last year, a "one and done" media strategy is no longer viable. Continuous engagement is required to stay relevant in the AI's active memory.
- Optimize Corporate Content for Utility: Since corporate blogs still account for 24% of citations, this content should be written as authoritative, factual resources rather than marketing fluff.
- Niche Targeting: Understanding that Claude prefers PubMed while Gemini prefers Reddit allows PR pros to tailor their strategies. A healthcare brand should focus on clinical whitepapers to influence Claude, while a consumer brand should engage in community-building to influence Gemini.
Conclusion: The New Partnership Between PR and Journalism
The evolution of AI search has effectively positioned PR professionals as partners to journalists and Large Language Models alike. The practitioners who will succeed in this new era are those who move away from transactional pitching and toward the creation of compelling, fact-based stories that serve a dual audience: the human reader and the machine learning algorithm.
As the SEC continues to monitor the accuracy of corporate communications and as AI models become more discerning about their sources, the premium on truth and third-party validation will only continue to rise. The "Scoop" is no longer just about getting the news out first; it is about ensuring that when the AI searches for the truth, your brand is part of the verified record. In the world of 2026, visibility is no longer bought through ads or forced through press releases—it is earned through the rigorous, enduring power of journalism.