GPTBot

GPTBot is OpenAI's web crawler used to fetch publicly accessible content for training and grounding future versions of GPT models.

GPTBot is also the user-agent token OpenAI's crawler identifies itself with when it crawls the web for GPT model training. It is one of several AI crawlers you may want to allow or block in your robots.txt.

User agent

User-agent: GPTBot

The full UA string is Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.x; +https://openai.com/gptbot).
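
If you want to spot GPTBot requests in your server logs, a small check against the User-Agent header is usually enough. The snippet below is a minimal sketch: the helper name gptbot_version and the regular expression are illustrative assumptions, not anything OpenAI publishes. It only tests what the header claims, and headers can be spoofed, so treat a match as a hint rather than proof.

```python
import re
from typing import Optional

# Assumed pattern: the "GPTBot/<version>" product token seen in the UA string above.
_GPTBOT_TOKEN = re.compile(r"\bGPTBot/(\d+(?:\.\w+)*)", re.IGNORECASE)


def gptbot_version(user_agent: str) -> Optional[str]:
    """Return the version after 'GPTBot/' if the header claims to be GPTBot, else None."""
    match = _GPTBOT_TOKEN.search(user_agent or "")
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
        "GPTBot/1.x; +https://openai.com/gptbot)"
    )
    # Prints the claimed version for the sample header.
    print(gptbot_version(sample))
```

A header that does not contain the GPTBot token returns None, so the same helper can be used as a simple filter over log lines.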

Related OpenAI crawlers

OpenAI runs several distinct crawlers, each for a different purpose:

  • GPTBot: broad training crawl
  • OAI-SearchBot: powers ChatGPT Search results
  • ChatGPT-User: fetches a URL on demand when a user's request in ChatGPT requires visiting a page

Each can be controlled independently in your robots.txt. Blocking GPTBot does not stop OAI-SearchBot from indexing your site for live ChatGPT answers.
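
To see that independence in practice, you can test a draft robots.txt locally before publishing it. The sketch below uses Python's standard urllib.robotparser with a hypothetical policy that opts out of the training crawl while keeping search and user fetches. The policy text, the helper name allowed_agents, and the example.com URL are assumptions for illustration, and Python's parser may not interpret every rule exactly as OpenAI's crawlers do, so use it as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow search and user fetches.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Map each user-agent token to whether the given rules let it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "OAI-SearchBot", "ChatGPT-User"],
    "https://example.com/pricing",  # placeholder URL
)

if __name__ == "__main__":
    for agent, ok in result.items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

Under this policy, Python's parser reports GPTBot as blocked and the other two as allowed, which matches the idea that each crawler is controlled by its own User-agent group.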

Should you allow GPTBot?

Allowing GPTBot makes your content eligible for future ChatGPT training corpora, which may improve long-term brand recall in AI responses, although no placement is guaranteed. Some publishers block it to protect proprietary content or to keep leverage in licensing negotiations. The right answer depends on your business model; marketing sites that want visibility in AI answers generally benefit from allowing it.

Example directives

# Allow training, allow search, allow user fetch
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /