PerplexityBot: how Perplexity crawls the web
2026-06-19
PerplexityBot
PerplexityBot is Perplexity’s web crawler. Unlike many AI bots, its main purpose is to surface content in live answers, not to train models. It is the most “search-engine-like” of the AI crawlers.
PerplexityBot is among the most important AI crawlers to allow. Perplexity is heavily used by knowledge workers and developers, and a citation there drives highly qualified traffic.
How to control PerplexityBot
- Allow. This is the default: with no robots.txt rule against it, Perplexity can fetch your content for live answers
- Block. Add to robots.txt:
    User-agent: PerplexityBot
    Disallow: /
- Block only training. The rule above targets PerplexityBot only. If Perplexity lists a separate user agent for training data collection, block that one instead. Check their docs for the current user agent names
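Before deploying a blocking rule, it can help to confirm that it affects only the crawler you intend. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `is_allowed` helper and the `example.com` URL are illustrative assumptions, and the parser reflects Python's reading of robots.txt, which may differ in edge cases from how Perplexity interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The blocking rule from the list above, one directive per line.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; not part of any Perplexity tooling.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/pricing"  # placeholder URL
    for agent in ("PerplexityBot", "Googlebot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

With this rule, PerplexityBot is blocked from every path while crawlers that are not named in the file stay allowed.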
Why allow it
- PerplexityBot is what powers the citations in Perplexity answers
- Citations in Perplexity drive AI referral traffic from a high-value audience
- It is the cleanest signal of “the model read your page for this answer”
Why block it
- You do not want your content surfaced in Perplexity’s answers
- You have a paywall or content licensing restrictions
- You are in a regulated industry
How to verify
- Check your server logs for requests whose User-Agent header contains PerplexityBot
- Use Perplexity’s published IP ranges to confirm the requests are genuine, since a User-Agent header can be spoofed
- Search Perplexity for your top buyer questions and see which URLs get cited
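The first two checks can be combined in a short script. This is a rough sketch, assuming access logs in the common "combined" format, where the client IP is the first field and the User-Agent is the last quoted field. The CIDR ranges below are documentation placeholders, not Perplexity's real ranges, and the `classify` function and its labels are hypothetical names chosen for illustration.

```python
import ipaddress
import re

# Placeholder ranges (RFC 5737 documentation blocks), NOT Perplexity's real
# ranges. Replace them with the ranges Perplexity publishes.
PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]
NETWORKS = [ipaddress.ip_network(cidr) for cidr in PUBLISHED_RANGES]

# Assumes combined log format: client IP first, User-Agent in the last
# double-quoted field of the line.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .* "(?P<ua>[^"]*)"$')


def classify(line: str) -> str:
    """Label one access-log line as 'verified', 'unverified', or 'other'."""
    match = LOG_LINE.match(line.strip())
    if match is None or "perplexitybot" not in match["ua"].lower():
        return "other"  # not a request claiming to be PerplexityBot
    try:
        ip = ipaddress.ip_address(match["ip"])
    except ValueError:
        return "unverified"  # first field is not an IP (e.g. a hostname)
    in_range = any(ip in network for network in NETWORKS)
    return "verified" if in_range else "unverified"


if __name__ == "__main__":
    # Made-up sample line with a simplified User-Agent string.
    sample = (
        '192.0.2.10 - - [19/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" '
        '200 512 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"'
    )
    print(classify(sample))
```

A request labelled "unverified" claims to be PerplexityBot but comes from outside the configured ranges, so treat it as a possible impersonator until the ranges have been double-checked.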