Point11

How Gemini Crawlers Work

Google runs two crawler families: Googlebot for search and Google-Extended for Gemini. Blocking the wrong one can silently remove you from AI answers.

Your robots.txt file may be hiding your site from Gemini, and you would never know it from your search rankings. Because the two families obey separate rules, a single misplaced directive can quietly cut you off from one without affecting the other.

Two Crawler Families, One Domain

Google operates a family of crawlers[1]:

  • Googlebot indexes pages for Google Search
  • Google-Extended handles AI training and Gemini real-time answers[2]
  • AdsBot evaluates landing page quality for Google Ads
  • APIs-Google serves Google APIs and internal products

Google-Extended was introduced in September 2023 as a dedicated token for AI training access, independent of search indexing.

How They Differ

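Googlebot crawls to build the Search index. Google-Extended governs whether your content can be used to train Gemini's models and ground its answers. Google documents Google-Extended as a robots.txt product token rather than a separate user agent string, so it will not appear under its own name in your server logs[2]. AI Overviews are a Search feature and are governed by Googlebot and Search preview controls, not by Google-Extended. The key distinction is that the two tokens respond to different robots.txt directives: a rule targeting Googlebot does not apply to Google-Extended, and vice versa.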

How to Control Access

Block only Google-Extended (keep search, opt out of AI):

    User-agent: Google-Extended
    Disallow: /

Block all (removes from both search and AI):

    User-agent: *
    Disallow: /
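To see what each configuration does before deploying it, the two rule sets can be run through a robots.txt parser. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `allowed_tokens` and the `example.com` URL are illustrative assumptions, and Python's matching only approximates how Google evaluates rules.

```python
from urllib.robotparser import RobotFileParser

# The two configurations described above.
BLOCK_AI_ONLY = """\
User-agent: Google-Extended
Disallow: /
"""

BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def allowed_tokens(robots_txt, tokens=("Googlebot", "Google-Extended"),
                   url="https://example.com/"):
    """Return {token: True/False} for whether each token may fetch `url`.

    `allowed_tokens` is a hypothetical helper, not part of any Google tool.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in tokens}


if __name__ == "__main__":
    print(allowed_tokens(BLOCK_AI_ONLY))
    # {'Googlebot': True, 'Google-Extended': False}
    print(allowed_tokens(BLOCK_ALL))
    # {'Googlebot': False, 'Google-Extended': False}
```

Running a check like this against a staging copy of robots.txt is a cheap way to confirm that a rule affects only the token you intended.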

Common Mistakes

  • Blocking Google-Extended unintentionally through a catch-all User-agent: * rule
  • Assuming that blocking Google-Extended only affects training, when it also affects Gemini's real-time answers
  • Using noindex thinking it only affects search (it affects all Google crawlers)
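The catch-all mistake in the list above can be detected mechanically. The sketch below is a simplified, hypothetical checker (the names `parse_groups` and `diagnose_google_extended` are assumptions, not an existing tool). It only looks for a site-wide `Disallow: /`, and it assumes that a group naming Google-Extended explicitly takes precedence over the `*` group, as the Robots Exclusion Protocol specifies.

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercase user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current_agents: list[str] = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def diagnose_google_extended(robots_txt: str) -> str:
    """Classify how (or whether) Google-Extended is blocked site-wide."""
    groups = parse_groups(robots_txt)

    def blocks_site(agent: str) -> bool:
        return ("disallow", "/") in groups.get(agent, [])

    if "google-extended" in groups:
        # An explicit group wins over the catch-all group.
        return "explicit block" if blocks_site("google-extended") else "allowed"
    if blocks_site("*"):
        return "catch-all block"
    return "allowed"


if __name__ == "__main__":
    print(diagnose_google_extended("User-agent: *\nDisallow: /"))
    # catch-all block
```

A "catch-all block" result is the case worth reviewing: nobody wrote a rule for Google-Extended, yet it is blocked anyway.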

How Scanner Helps

Scanner audits your robots.txt for unintentional Google-Extended blocks. See also How Agent Crawlers Work.

Sources

  1. Google: Overview of crawlers
  2. Google: Google-Extended

© 2026 Point11 · Patent Pending

Google Crawlers

Blocking Google-Extended stops AI training while keeping search indexing intact

Googlebot

    User-agent: Googlebot
    Allow: /

Crawls pages for Google Search indexing. Blocking this removes your site from Google search results entirely.

  • Search indexing: Allowed
  • Rich snippets: Allowed
  • AI training: Blocked

Google-Extended

    User-agent: Google-Extended
    Disallow: /

Fetches content for Gemini AI model training and Vertex AI. It is independent from search, so blocking it has no effect on your Google rankings.

  • Gemini training: Blocked
  • Vertex AI: Blocked
  • Search indexing: Blocked

Key insight: These crawlers are independently controllable. You can block Google-Extended to opt out of AI training while keeping Googlebot allowed for search.