What is Google-Extended?
Google-Extended is not a crawler. It is a robots.txt control token that tells Google whether your content may be used to train future Gemini models and for grounding in Gemini and Vertex AI. Google's crawler documentation states two facts that kill most of the folklore. First, the token has no separate user agent string of its own. Second, “Google-Extended does not impact a site's inclusion in Google Search nor is it used as a ranking signal in Google Search.” Blocking it does not remove you from Search, and it does not remove you from AI Overviews.
A control token, not a crawler
Most entries in a robots.txt file name a real program that visits your server. GPTBot visits. ClaudeBot visits. Google-Extended never does. Google's documentation is explicit: Google-Extended doesn't have a separate HTTP request user agent string, and crawling is done with existing Google user agent strings. You will never see "Google-Extended" fetch a page in your server logs, because there is no such fetcher.
Instead, it is a label that Google's existing crawlers check against your robots.txt after the fact. The same Googlebot visit happens either way. The token only changes what Google is allowed to do with the content it has already collected. Write User-agent: Google-Extended with a Disallow, and you are attaching a usage restriction, not closing a door.
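One way to confirm this on your own server is to count crawler names in your access logs. The sketch below is a minimal illustration, assuming logs in the common "combined" format where the user agent is the last quoted field. The sample lines, IP addresses, and the simplified GPTBot and ClaudeBot user agent strings are made up for the example; only the crawler names come from the text above.

```python
from collections import Counter

# Names to look for in the User-Agent field. Google-Extended is included
# only to show that it never matches, because no fetcher uses that name.
TOKENS = ("Googlebot", "GPTBot", "ClaudeBot", "Google-Extended")

# Hypothetical sample lines in combined log format (illustrative data only).
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Jan/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Jan/2025:10:00:05 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.4 - - [10/Jan/2025:10:00:09 +0000] "GET /blog HTTP/1.1" 200 4096 "-" '
    '"Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
]


def count_agent_tokens(log_lines):
    """Count how many log lines mention each token in the user agent field."""
    counts = Counter({token: 0 for token in TOKENS})
    for line in log_lines:
        # The user agent is the last quoted field, so split from the right.
        parts = line.rsplit('"', 2)
        if len(parts) < 3:
            continue  # Line does not look like combined log format.
        user_agent = parts[1].lower()
        for token in TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
    return counts


if __name__ == "__main__":
    for token, hits in count_agent_tokens(SAMPLE_LOG).items():
        print(f"{token}: {hits}")
```

Run against real logs, the Google-Extended count should stay at zero whether or not you have a rule for it, while Googlebot keeps showing up either way.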
What Google-Extended actually controls
Per Google's docs, the token lets web publishers manage whether content Google crawls from their sites may be used for training future generations of Gemini models, and for grounding in Gemini Apps and Grounding with Google Search on Vertex AI. Two distinct uses, one switch:
- Training. Whether your pages can feed the training of future Gemini models that power Gemini Apps and the Vertex AI API for Gemini.
- Grounding. Whether your content can be pulled in as source material when Gemini Apps and Vertex AI ground their answers.
That is the full scope. It is a Gemini policy switch. It says nothing about whether Googlebot crawls you, whether you are indexed, or where you rank.
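The scoping can be checked mechanically. The sketch below uses Python's standard-library urllib.robotparser as a stand-in for robots.txt group matching; it is not Google's own parser, and the example.com URL is a placeholder. It shows that a rule group addressed to Google-Extended matches only that token and leaves Googlebot's crawl permission alone.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt whose only rule group targets the Google-Extended token.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt, agent, url="https://example.com/article"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Google-Extended"))  # False: the token is opted out
    print(is_allowed(ROBOTS_TXT, "Googlebot"))        # True: crawling is untouched
```

Since no group names Googlebot and there is no wildcard group, nothing in this file restricts crawling or indexing.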
The myth: blocking it removes you from AI Overviews
This one deserves to be said plainly, because site owners make real decisions based on it. Blocking Google-Extended does not remove you from Google Search, and it does not remove you from AI Overviews. Google's documentation says it outright: Google-Extended does not impact a site's inclusion in Google Search, nor is it used as a ranking signal.
AI Overviews are a Google Search feature. They are built on the same crawling and indexing that regular results use, which means Googlebot, not Google-Extended, is the relevant agent. The only robots.txt lever that takes you out of AI Overviews is the one that takes you out of Search entirely: blocking Googlebot. That trade is almost never worth it, and anyone selling "block Google-Extended to escape AI Overviews" or "allow Google-Extended to rank in them" is selling folklore. If your goal is to show up better in AI answers, the levers are crawlability, real HTML, and quotable passages, which is the territory our AI crawler access guide covers.
How to use it, if you choose to
The decision is narrow: do you want your content training and grounding Gemini, or not? If not, the rule is two lines:
# Opt out of Gemini training and grounding.
# Search indexing and AI Overviews are unaffected.
User-agent: Google-Extended
Disallow: /
If you leave it alone, nothing changes: your content remains available for those Gemini uses. Either way, audit the rest of the file while you are in it. The common damage we find is not a wrong Google-Extended line but a blanket rule nearby that blocks the crawlers that actually pay you back in citations. The full walkthrough is in how to fix robots.txt blocking AI crawlers, and the equivalent decision for OpenAI's bots is in GPTBot and robots.txt.
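A quick audit of that kind can be sketched in a few lines. The example below again relies on Python's urllib.robotparser as an approximation of robots.txt matching, not on any vendor's parser. The sample file is hypothetical: it pairs a correct Google-Extended opt-out with a blanket wildcard rule of the sort described above, and the agent list simply reuses the names mentioned in this article.

```python
from urllib.robotparser import RobotFileParser

# Agents to check, taken from the names used in this article.
AGENTS = ("Googlebot", "Google-Extended", "GPTBot", "ClaudeBot", "Claude-SearchBot")

# Hypothetical robots.txt: a deliberate Gemini opt-out, an explicit allowance
# for Googlebot, and a blanket rule that catches every other crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def audit(robots_txt, agents=AGENTS, url="https://example.com/"):
    """Map each agent name to True (allowed) or False (blocked) for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    for agent, allowed in audit(SAMPLE_ROBOTS_TXT).items():
        print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

In this sample only Googlebot comes back allowed. The Google-Extended block is the intended opt-out, while GPTBot, ClaudeBot, and Claude-SearchBot are blocked by the wildcard group, which is the unintended damage worth looking for.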
See also
Google-Extended is the odd one out in the AI-bot family: a token where everyone else runs a crawler. For the crawlers that do visit and do decide citations, read What is GPTBot? and What is Claude-SearchBot?. The pattern to internalize is that every AI control has a scope, and the scope is in the vendor's docs, not in the folklore.