AI search works on a fundamentally different model from traditional search, and optimizing for it differs from traditional SEO. Content is either cited in a generated answer or it is not. There are no "page 2" results. The discipline of optimizing content for AI citation is called Generative Engine Optimization (GEO), a term coined by researchers at Princeton University.
The Princeton GEO Study
The foundational research on GEO was published by Aggarwal et al. at Princeton University and presented at ACM SIGKDD 2024. The key finding: including citations, quotations from relevant sources, and statistics can boost source visibility by over 40% across queries.
The top-performing methods:
- Statistics Addition: Adding relevant data points and numbers to content. 30-40% improvement on Position-Adjusted Word Count metrics.
- Quotation Addition: Incorporating credible quotes from authorities. 28% improvement on Subjective Impression metrics.
- Cite Sources: Including inline citations to reliable sources. 30-40% improvement.
These changes are high-impact and low-effort: each one adds supporting material to existing content rather than rewriting it.
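To make the idea behind a position-adjusted word count concrete, here is a minimal Python sketch. It is an illustration only: the function name, the input shape, and the exact exponential weighting are assumptions made for this example and are not taken from the study's code. The idea it shows is that words in sentences citing a source count for more when those sentences appear earlier in the generated response.

```python
import math

def position_adjusted_word_share(sentences, source):
    """Share of a response's words attributed to `source`, with earlier
    sentences weighted more heavily.

    `sentences` is a list of (text, cited_sources) pairs in response order.
    The exp(-position / sentence_count) decay is an assumed weighting
    chosen for illustration.
    """
    sentence_count = len(sentences)
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return 0.0
    weighted_words = sum(
        len(text.split()) * math.exp(-position / sentence_count)
        for position, (text, cited) in enumerate(sentences)
        if source in cited
    )
    return weighted_words / total_words


# Hypothetical two-sentence response citing two hypothetical domains.
response = [
    ("GEO improves visibility a lot", {"a.example"}),
    ("Other sources disagree", {"b.example"}),
]
score_a = position_adjusted_word_share(response, "a.example")
score_b = position_adjusted_word_share(response, "b.example")
```

Under this weighting, a source cited in the opening sentence scores higher than one cited with the same number of words further down.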
How Each AI Platform Sources Content
Each major AI platform has distinct citation patterns:
ChatGPT (OpenAI)
- 76.4% of citations come from content updated in the last 30 days.
- Trusts what "the internet agrees on" — consensus-based authority.
- Cannot render JavaScript. Only sees content in the initial HTML response.
- Uses OAI-SearchBot for real-time search retrieval and GPTBot for training data.
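Because a crawler that cannot render JavaScript only sees the initial HTML response, one rough way to audit a page is to check whether a key phrase appears in the raw HTML outside of script and style blocks. The sketch below uses only the Python standard library. The function names and the user-agent string are hypothetical, and the text extraction is deliberately simple, so treat it as a starting point rather than a faithful model of any particular crawler.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

class _VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return whitespace-normalized text found outside script/style tags."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def phrase_in_initial_html(html, phrase):
    """Case-insensitive check that `phrase` is present without running JS."""
    return phrase.lower() in visible_text(html).lower()


def fetch_initial_html(url, user_agent="geo-audit-sketch/0.1"):
    """Fetch the raw HTML response; no JavaScript is executed.

    The user-agent value is a made-up placeholder for this example.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A typical use would be `phrase_in_initial_html(fetch_initial_html(url), "your key claim")`. A result of `False` for content that is visible in a browser suggests the content is injected client-side.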
Google Gemini
- Trusts what your brand says. Favors structured, factual content directly from brand domains.
- Heavily cites Reddit and Medium in addition to brand domains.
- Favors pages with schema markup, local landing pages, and consistent subdomains.
- Powered by Google's index, so traditional SEO signals carry significant weight.
Perplexity
- Averages 6.61 citations per response — the most citation-dense AI search engine.
- Trusts industry experts and customer reviews.
- Favors YouTube as a source format.
- Uses PerplexityBot for crawling.
Claude (Anthropic)
- Uses Claude-SearchBot for search indexing. Prioritizes domains with high informational value and frequent updates.
- Respects robots.txt directives including Crawl-delay.
Content Structure Best Practices
Based on GEO research and platform analysis:
- Use Q&A format, bullet points, how-to guides, and TL;DR summaries: These formats help LLMs extract and present information.
- Include statistics with sources: One of the most effective optimizations, with visibility gains of 30-40% in the Princeton study.
- Add direct quotes from authoritative sources: Quotation addition showed a 28% improvement.
- Update high-priority content every 2-3 days: Content under 3 months old is 3x more likely to be cited.
- Write long-form authoritative content: Articles over 2,900 words average 5.1 citations versus 3.2 for articles under 800 words.
- Use semantic HTML, ARIA, and schema markup: Machines should understand content without guessing.
- Ensure critical content is in the initial HTML: Content loaded via JavaScript is invisible to OpenAI's crawlers, which cannot render JS.
- Build third-party validation: Reviews on G2, Reddit discussions, LinkedIn presence, and YouTube content all feed AI citation patterns.
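As one way to combine the Q&A format with schema markup, the following Python sketch renders question and answer pairs as a schema.org FAQPage JSON-LD block that can be placed in a page's initial HTML. The function name and the sample question are hypothetical. The FAQPage, Question, and Answer types come from the schema.org vocabulary, but whether a given AI platform uses this markup is not something this sketch can confirm.

```python
import json

def faq_jsonld(pairs):
    """Build a <script type="application/ld+json"> block for a FAQ.

    `pairs` is an iterable of (question, answer) strings.
    """
    payload = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    body = json.dumps(payload, ensure_ascii=False, indent=2)
    # Escape "</" so answer text cannot terminate the script element early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">' + body + "</script>"


# Hypothetical content for illustration.
snippet = faq_jsonld([
    ("What is GEO?", "Optimizing content so AI search engines cite it."),
])
```

Emitting the block server-side keeps the structured data in the initial HTML response, which matters for crawlers that do not execute JavaScript.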
robots.txt Configuration
If you block AI bots, you do not exist to them. Enterprise sites should ensure their robots.txt allows:
- GPTBot and OAI-SearchBot (OpenAI / ChatGPT)
- Google-Extended (Gemini; a robots.txt control token rather than a separate crawler)
- PerplexityBot (Perplexity)
- ClaudeBot and Claude-SearchBot (Anthropic / Claude)
- CCBot (Common Crawl, used by many LLMs)
Currently, GPTBot is the most blocked AI bot at 5.89% of all websites. Among major news publishers, 79% block AI training bots and 71% block AI retrieval bots.
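A quick way to verify a robots.txt file against the list above is to parse it with Python's standard urllib.robotparser and ask whether each bot may fetch a representative URL. In the sketch below, the sample robots.txt, the example.com URL, the Crawl-delay value of 2, and the function names are all illustrative assumptions. The standard-library parser applies its own matching rules, which may differ in edge cases from how each vendor's crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the list above.
AI_BOTS = [
    "GPTBot",
    "OAI-SearchBot",
    "Google-Extended",
    "PerplexityBot",
    "ClaudeBot",
    "Claude-SearchBot",
    "CCBot",
]

# Illustrative robots.txt: AI bots may crawl everything; all other
# agents are kept out of /private/. The Crawl-delay value is arbitrary.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: Google-Extended
User-agent: PerplexityBot
User-agent: ClaudeBot
User-agent: Claude-SearchBot
User-agent: CCBot
Allow: /
Crawl-delay: 2

User-agent: *
Disallow: /private/
"""


def _parse(robots_txt):
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def audit_ai_access(robots_txt, url):
    """Map each AI bot token to whether it may fetch `url`."""
    parser = _parse(robots_txt)
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}


def crawl_delay_for(robots_txt, bot):
    """Return the Crawl-delay that applies to `bot`, or None if unset."""
    return _parse(robots_txt).crawl_delay(bot)


report = audit_ai_access(SAMPLE_ROBOTS_TXT, "https://example.com/docs/guide")
blocked_bots = [bot for bot, allowed in report.items() if not allowed]
```

An empty `blocked_bots` list means none of the listed tokens is disallowed for that URL. Running the same check against several representative paths gives a fuller picture than a single URL.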
Sources
- GEO: Generative Engine Optimization (Princeton / KDD 2024): https://arxiv.org/abs/2311.09735
- OpenAI Crawlers Overview: https://platform.openai.com/docs/bots
- Anthropic Crawler FAQ: https://support.claude.com/en/articles/8896518-does-anthropic-crawl-data-from-the-web-and-how-can-site-owners-block-the-crawler
- Perplexity Crawlers: https://docs.perplexity.ai/guides/bots
- Google — Succeeding in AI Search: https://developers.google.com/search/blog/2025/05/succeeding-in-ai-search