GPTBot
GPTBot is OpenAI's web crawler. It fetches publicly accessible pages that may be used to train future GPT models.
GPTBot is also the user-agent token the crawler identifies itself with, and the name you target in robots.txt. It is one of several AI crawlers you may want to allow, or block, in your robots.txt.
User agent
User-agent: GPTBot
The full UA string is Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.x; +https://openai.com/gptbot), where 1.x stands for the crawler version in use.
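To spot GPTBot in server logs or request handlers, you can look for the token in the User-Agent header. The following is a minimal sketch, assuming the header follows the format quoted above; the function name gptbot_version and the regular expression are illustrative, not part of any OpenAI tooling. A header match alone does not prove a request came from OpenAI, since any client can send this string.

```python
import re
from typing import Optional

# Matches the "GPTBot/<version>" token anywhere in a User-Agent header.
_GPTBOT = re.compile(r"GPTBot/([0-9A-Za-z.]+)")


def gptbot_version(user_agent: str) -> Optional[str]:
    """Return the version after 'GPTBot/', or None when the token is absent."""
    match = _GPTBOT.search(user_agent)
    return match.group(1) if match else None


ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "GPTBot/1.x; +https://openai.com/gptbot)"
)
print(gptbot_version(ua))                                   # 1.x
print(gptbot_version("Mozilla/5.0 (X11; Linux x86_64)"))    # None
```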
Related OpenAI crawlers
OpenAI runs several distinct crawlers, each for a different purpose:
- GPTBot: broad crawl that collects content for model training
- OAI-SearchBot: indexes pages so they can appear in ChatGPT Search results
- ChatGPT-User: fetches a URL on demand when a ChatGPT user's request requires visiting a page
Each can be controlled independently in your robots.txt. Blocking GPTBot does not stop OAI-SearchBot from indexing your site for live ChatGPT answers.
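The independence of these rules can be checked locally before you deploy a robots.txt. The sketch below uses Python's standard urllib.robotparser as a stand-in parser; the robots.txt content, the can_fetch helper, and the example.com URL are assumptions for illustration, and OpenAI's crawlers may not interpret edge cases exactly as this parser does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow the search crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""


def can_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Report whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(can_fetch("GPTBot", "https://example.com/pricing"))         # False
print(can_fetch("OAI-SearchBot", "https://example.com/pricing"))  # True
```

An agent with no matching group and no wildcard group, such as ChatGPT-User here, is treated as allowed by this parser.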
Should you allow GPTBot?
Allowing GPTBot makes your content eligible for future GPT training corpora, which may improve how often and how accurately your brand appears in AI responses. Some publishers block it to protect proprietary content or to keep leverage in licensing negotiations. The right answer depends on your business model. Marketing sites that want visibility in AI answers generally have little to lose by allowing it.
Example directives
# Allow training, allow search, allow user fetch
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
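If you prefer to block training while keeping search and user fetches, the same groups apply with Disallow in place of Allow. As a rough sketch, the directives can be generated from a small policy mapping; render_robots and the policy dictionary are hypothetical names used only for this example.

```python
def render_robots(policy: dict) -> str:
    """Build robots.txt text from a mapping of user-agent token to allowed flag."""
    groups = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    # Separate groups with a blank line and end the file with a newline.
    return "\n\n".join(groups) + "\n"


# Example policy: block training, keep search and on-demand fetches.
policy = {"GPTBot": False, "OAI-SearchBot": True, "ChatGPT-User": True}
print(render_robots(policy))
```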
