How to get cited by ChatGPT
To get cited by ChatGPT, your page has to be readable by the bot that fetches its sources: OpenAI's OAI-SearchBot. It is one of three separate OpenAI crawlers, and you can allow it even if you block GPTBot, the training crawler (see OpenAI's bot documentation). The rule is plain: “if the bot can't read the page, it can't quote the page.” Allow the search bot, serve your answer in real HTML, and lead with a quotable, specific passage.
ChatGPT doesn't read your site. A crawler does.
When ChatGPT answers a question with live information, it does not load your website the way a person does. It sends a crawler to fetch the page, then quotes from whatever that crawler can read. OpenAI runs three of them, and they do different jobs:
- OAI-SearchBot fetches pages to cite in ChatGPT search results. This is the one that earns you a citation.
- ChatGPT-User fetches a page when a user explicitly asks ChatGPT to open a link.
- GPTBot collects content to train future models. It has nothing to do with being cited, and you can block it without losing search visibility.
The important part: these are independent. You can welcome the bot that cites you and refuse the bot that trains on you. If your robots.txt blocks everything from OpenAI to "keep the AI out," you have also blocked the crawler that puts you in the answer.
Step one: let the right crawler in
Open your robots.txt and make the split explicit. Allow the search and user bots, and decide for yourself on the training bot:
    # Let ChatGPT cite you in search results
    User-agent: OAI-SearchBot
    Allow: /

    User-agent: ChatGPT-User
    Allow: /

    # Optional: keep your content out of model training
    User-agent: GPTBot
    Disallow: /
If you want to be in the answers, the one line that is not optional is allowing OAI-SearchBot. Blocking it is the single most common reason a site never gets cited, and most owners do it by accident.
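Before you ship a robots.txt change, you can check how it treats each of the three user agents. The sketch below uses Python's standard `urllib.robotparser`. The helper name `openai_access`, the `example.com` URL, and the two sample files are ours and purely illustrative. Python's parser may not resolve every edge case the way OpenAI's crawlers do, so treat the result as a sanity check, not proof.

```python
from urllib.robotparser import RobotFileParser

# The three OpenAI user agents named in this article.
OPENAI_AGENTS = ["OAI-SearchBot", "ChatGPT-User", "GPTBot"]


def openai_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {agent: allowed?} for `url` under the given robots.txt text.

    `url` is a placeholder; pass the page you actually want cited.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in OPENAI_AGENTS}


# The split from this section: cite yes, train no.
SPLIT = """\
# Let ChatGPT cite you in search results
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Optional: keep your content out of model training
User-agent: GPTBot
Disallow: /
"""

# The accidental "keep the AI out" rule that also blocks the search bot.
BLANKET = """\
User-agent: *
Disallow: /
"""

print(openai_access(SPLIT))
# {'OAI-SearchBot': True, 'ChatGPT-User': True, 'GPTBot': False}
print(openai_access(BLANKET))
# {'OAI-SearchBot': False, 'ChatGPT-User': False, 'GPTBot': False}
```

To test your live file, read the text of your own `/robots.txt` and pass it in as `robots_txt`.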
Step two: if it renders in JavaScript, it does not exist
Retrieval crawlers take the raw HTML your server sends. They generally do not run your JavaScript, wait for a framework to boot, or hydrate the page the way a browser does. So if your answer only appears after JavaScript runs, the crawler sees an empty shell and quotes a competitor whose answer is in the HTML.
This is the failure we catch most often. A site looks perfect in a browser and is nearly blank to a bot. Test it the honest way: load your page with JavaScript disabled, or view the raw source, and check that your actual answer is in there as text, not as an image and not injected later.
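The same test can be scripted. Here is a minimal sketch, assuming you already have the raw HTML as a string (for example from `urllib.request.urlopen(url).read().decode()`). It drops everything inside `<script>` and `<style>` and checks whether your answer appears in the remaining text. The class and function names and the two sample pages are hypothetical, and this approximates what a non-rendering crawler sees. It does not reproduce any specific bot.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes from raw HTML, ignoring <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def answer_in_raw_html(html: str, answer: str) -> bool:
    """True if `answer` appears as text in the HTML without running any JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace on both sides so line breaks in the markup don't matter.
    text = " ".join(" ".join(parser.parts).split()).lower()
    needle = " ".join(answer.split()).lower()
    return needle in text


# Hypothetical pages: one server-rendered, one that injects the answer with JS.
SERVER_RENDERED = (
    "<html><body><h1>How long does shipping take?</h1>"
    "<p>Orders ship in 2 business days.</p></body></html>"
)
CLIENT_RENDERED = (
    '<html><body><div id="root"></div><script>'
    'document.getElementById("root").textContent = "Orders ship in 2 business days.";'
    "</script></body></html>"
)

print(answer_in_raw_html(SERVER_RENDERED, "Orders ship in 2 business days"))  # True
print(answer_in_raw_html(CLIENT_RENDERED, "Orders ship in 2 business days"))  # False
```

The second page does contain the sentence, but only inside a script, which is exactly the case where a browser shows the answer and a crawler does not.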
Step three: lead with the answer, not the throat-clearing
AI engines quote self-contained passages that answer the question directly. They reward specifics. The peer-reviewed GEO study (Princeton, KDD 2024) found that adding statistics, direct quotations, and cited sources measurably increased how often a page was surfaced in generative answers. So near the top of the page:
- Answer the question in the first paragraph, in plain language, before any preamble.
- Include a specific number or a named entity. Vague pages get skipped.
- Quote a source where it helps, and link out to the source you cite.
- Use a clear heading that matches the question a person would actually ask.
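You can turn part of that checklist into a rough self-check on your opening paragraph. The sketch below is our own heuristic, not anything taken from the GEO study or from any engine: it only looks for a digit, a quoted phrase of at least 10 characters, and a link. The function name, the key names, and the 10-character threshold are all assumptions. A pass does not mean you will be cited. A fail just tells you where the paragraph is vague.

```python
import re


def opening_checks(first_paragraph: str) -> dict:
    """Crude signals of a specific, quotable opening (illustrative heuristic only)."""
    text = first_paragraph.strip()
    return {
        # Any digit counts as "a specific number".
        "has_number": bool(re.search(r"\d", text)),
        # A quoted phrase of 10+ characters, in straight or curly double quotes.
        "has_quotation": bool(re.search(r'["“][^"”]{10,}["”]', text)),
        # A visible link out to a source.
        "has_link": bool(re.search(r"https?://\S+", text)),
    }


# Hypothetical opening paragraphs.
specific = (
    "Orders ship in 2 business days. The carrier says "
    '"most parcels arrive within a week" (https://example.com/shipping).'
)
vague = "We care deeply about getting your order to you quickly."

print(opening_checks(specific))
# {'has_number': True, 'has_quotation': True, 'has_link': True}
print(opening_checks(vague))
# {'has_number': False, 'has_quotation': False, 'has_link': False}
```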
What you do not need to do
There is a lot of folklore here, so be careful what you spend time on. Google published an official guide to optimizing for AI features and its position is blunt: there is no separate framework for AI search, the same fundamentals apply, and your page has to be crawlable and indexable before any of it matters. Google has also said on the record that it does not use llms.txt, so do not treat that file as a requirement. The work that pays off is the same work that has always paid off: be reachable, be readable without JavaScript, and be the most specific answer on the page.
Check your own page
You can do all of this by hand, or you can paste your link into Brimm and see in about 30 seconds whether OAI-SearchBot can read you, whether your answer survives without JavaScript, and how quotable your top passage is. We read your site the way the engines do and print the failures in fix order.