AEO guide

How to get cited by Perplexity

The Brimm team · grounded in the docs

To get cited by Perplexity, you have to be readable by the bot that fetches its sources. Perplexity runs two main bots, and PerplexityBot is the one that indexes pages so they can be cited, so you must allow it in your robots.txt. The rule is plain: “a source you can't crawl is a source you can't cite.” Allow PerplexityBot, serve your answer in real HTML, and lead with a specific, current passage. The peer-reviewed numbers back this up: see the GEO study (Princeton, KDD 2024).

Perplexity doesn't read your site. A crawler does.

When Perplexity answers a question, it does not load your website the way a person does. It sends a crawler to fetch the page, then quotes from whatever that crawler can read. Perplexity runs two main bots, and they do different jobs:

PerplexityBot fetches and indexes pages so they can be surfaced and cited in answers. This is the one that earns you a citation.
Perplexity-User fetches a page when a user acts on it directly, for example by following or sharing a link inside Perplexity.

Perplexity shows its sources prominently, as numbered citations next to the answer. That changes the whole game. You are not fighting for a blue link below the fold. You are trying to be one of the handful of numbered sources the answer is built from. Being readable, specific, and current is the entire job. If your robots.txt blocks PerplexityBot to "keep the AI out," you have also blocked the crawler that puts you in the answer.

Step one: let the right crawler in

Open your robots.txt and allow Perplexity's bots explicitly. This is the line most owners get wrong, usually by accident:

# Let Perplexity index and cite you
User-agent: PerplexityBot
Allow: /

# Let Perplexity fetch links users open or share
User-agent: Perplexity-User
Allow: /

If you want to be eligible for a citation, the one line that is not optional is allowing PerplexityBot. Blocking it is the single most common reason a site never gets cited. It often happens because someone pasted a "block all the AI bots" snippet from a forum without reading what each bot actually does.
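One way to confirm the rules do what you expect is to run them through a robots.txt parser before you deploy. The sketch below uses Python's standard-library urllib.robotparser. The helper name is_allowed, the example.com URL, and the "blocked" snippet are illustrative assumptions, and the standard-library parser only approximates how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rules recommended in this guide.
ROBOTS_OPEN = """\
User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /
"""

# A hypothetical "block the AI bots" snippet of the kind pasted from forums.
ROBOTS_BLOCKED = """\
User-agent: PerplexityBot
Disallow: /
"""

PAGE = "https://example.com/guide"  # placeholder URL, not a real page

print("open:", is_allowed(ROBOTS_OPEN, "PerplexityBot", PAGE))
print("blocked:", is_allowed(ROBOTS_BLOCKED, "PerplexityBot", PAGE))
```

With the rules from this guide the first check should report True, and with the blocking snippet the second should report False. To test your live file, paste its contents into a string in place of ROBOTS_OPEN.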

Step two: if it renders in JavaScript, it does not exist

Retrieval crawlers take the raw HTML your server sends. They generally do not run your JavaScript, wait for a framework to boot, or hydrate the page the way a browser does. So if your answer only appears after JavaScript runs, the crawler sees an empty shell and cites a competitor whose answer is already in the HTML.

This is the failure we catch most often. A site looks perfect in a browser and is nearly blank to a bot. Test it the honest way: load your page with JavaScript disabled, or view the raw source, and confirm that your actual answer is in there as text. Not as an image, not injected after the page loads. If you cannot find your own sentence in the source, neither can PerplexityBot.
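The "find your own sentence in the source" test can be scripted. The following is a minimal sketch using only Python's standard library: it strips tags from raw HTML, ignores anything inside script and style elements, and checks whether a given sentence is present as text. The names VisibleTextExtractor and raw_html_contains and both sample pages are hypothetical, and the check assumes you have already saved the raw server response as a string (it does not fetch anything or execute JavaScript).

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that sits outside script/style elements.
        if self._skip_depth == 0:
            self.chunks.append(data)


def raw_html_contains(html: str, sentence: str) -> bool:
    """Return True if sentence appears as plain text in the raw HTML."""
    extractor = VisibleTextExtractor()
    extractor.feed(html)
    extractor.close()
    # Normalize whitespace and case so line breaks in the markup do not matter.
    text = " ".join("".join(extractor.chunks).split()).lower()
    needle = " ".join(sentence.split()).lower()
    return needle in text


# Hypothetical pages: one server-rendered, one filled in by JavaScript.
SERVER_RENDERED = "<html><body><h1>Shipping</h1><p>Orders ship in 2 days.</p></body></html>"
CLIENT_RENDERED = (
    '<html><body><div id="root"></div>'
    '<script>document.getElementById("root").textContent = "Orders ship in 2 days.";</script>'
    "</body></html>"
)

print(raw_html_contains(SERVER_RENDERED, "Orders ship in 2 days."))
print(raw_html_contains(CLIENT_RENDERED, "Orders ship in 2 days."))
```

The server-rendered sample should pass and the JavaScript-injected one should fail, because the sentence there exists only inside a script. A sentence that appears only in an image would fail the same way.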

Step three: lead with the answer, not the throat-clearing

Perplexity favors fresh, well-structured, factual content, and it shows the sources it leans on. So write the page to be quoted. The peer-reviewed GEO study (Princeton, KDD 2024) found that adding statistics, direct quotations, and cited sources increased a page's visibility in generative answers, by up to about 40% in the paper's experiments. Near the top of the page:

Answer the question in the first paragraph, in plain language, before any preamble.
Include a specific number or a named entity. Vague pages get skipped.
Quote a source where it helps, and link out to the source you cite.
Use clear, question-style headings that match what a person would actually ask.
Keep it current. Stale pages lose to fresher, more specific ones.
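A few items on the checklist above can be spot-checked mechanically. The sketch below is a rough heuristic only, not how Perplexity scores pages: the function name lead_passage_report, the three checks, and the 60-word limit are all assumptions chosen for illustration.

```python
import re


def lead_passage_report(first_paragraph: str, max_words: int = 60) -> dict:
    """Run simple, assumed heuristics against a page's opening paragraph."""
    words = first_paragraph.split()
    return {
        # A digit is a crude proxy for "includes a specific number".
        "has_number": bool(re.search(r"\d", first_paragraph)),
        # Text wrapped in straight or curly double quotes counts as a quotation.
        "has_quotation": bool(re.search(r'["“][^"”]+["”]', first_paragraph)),
        # The 60-word default is an arbitrary cutoff for "no preamble".
        "is_concise": 0 < len(words) <= max_words,
    }


specific = "PerplexityBot is 1 of 2 bots Perplexity runs, and it is the one that indexes pages."
vague = "We are passionate about helping you succeed online."

print(lead_passage_report(specific))
print(lead_passage_report(vague))
```

Treat a failed check as a prompt to reread the paragraph, not as a verdict. A named entity, for example, would satisfy the guidance above without tripping the digit check.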

A source you can't crawl is a source you can't cite.

What you do not need to do

There is folklore here too, so be careful where you spend time. Perplexity indicated support for llms.txt in 2025, and you can publish one if you want. But llms.txt is not required to be cited and it is not a ranking factor, so do not treat it as the thing that gets you into the answer. The work that pays off is the same work that has always paid off: be reachable, be readable without JavaScript, and be the most specific, current answer on the page. A clean robots.txt and a server-rendered answer will do more for your citations than any optional file ever will.

Check your own page

You can do all of this by hand, or you can paste your link into Brimm and see in about 30 seconds whether PerplexityBot can read you, whether your answer survives without JavaScript, and how quotable your top passage is. We read your site the way the engines do and print the failures in fix order.