GPTBot on Shopify: what it indexes, how to allow it

What GPTBot is and what it does on Shopify

GPTBot is OpenAI's training crawler. It collects data that may be used to train future versions of OpenAI's models. It is not the bot that powers ChatGPT search citations today — that's OAI-SearchBot. It is not the bot that fetches a URL when a user references it in ChatGPT — that's ChatGPT-User. Allowing GPTBot affects future model training; blocking it does not stop ChatGPT from quoting your store via the other two bots.

OpenAI's bot documentation [1] names all three bots and clarifies the role of each. Shopify's crawling-your-store doc [4] confirms that the store is reachable by LLMs without signatures.

Training vs retrieval — the distinction

Training bots like GPTBot, ClaudeBot, and Google-Extended influence the next-generation model. The data they collect feeds the training set. Retrieval bots — OAI-SearchBot, ChatGPT-User, Claude-User, Claude-SearchBot, PerplexityBot — influence today's AI answers. They fetch live pages during a conversation or build a live search index. For a Shopify merchant, retrieval bot access matters for current AI visibility; training bot access matters for next-year visibility.

The implication: blocking GPTBot in 2026 has a slow, hard-to-measure effect. It may show up in 2027-2028 model behavior. Blocking OAI-SearchBot in 2026 has an immediate, measurable effect — ChatGPT search stops citing your store. Treat the two as different decisions.
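The split described above can be written down as a small lookup, for example to tag crawler hits in an access log. This is a minimal sketch: the bot names come from the lists in this section, while the function name and the sample user-agent string are illustrative assumptions, not copied from any vendor's documentation.

```python
# Bot names taken from the training/retrieval lists in this section.
# Google-Extended is a robots.txt control token and may not appear in
# request headers at all; it is listed here for completeness.
TRAINING_BOTS = ("GPTBot", "ClaudeBot", "Google-Extended")
RETRIEVAL_BOTS = (
    "OAI-SearchBot",
    "ChatGPT-User",
    "Claude-User",
    "Claude-SearchBot",
    "PerplexityBot",
)


def classify_bot(user_agent: str) -> str:
    """Return 'training', 'retrieval', or 'other' for a user-agent string.

    Matching is a case-insensitive substring test on the bot name.
    """
    ua = user_agent.lower()
    if any(name.lower() in ua for name in RETRIEVAL_BOTS):
        return "retrieval"
    if any(name.lower() in ua for name in TRAINING_BOTS):
        return "training"
    return "other"


# Illustrative user-agent string (hypothetical, for demonstration only).
print(classify_bot("Mozilla/5.0 (compatible; GPTBot/1.1)"))
```

A hit classified as retrieval relates to today's AI answers; a hit classified as training relates to a future model.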

Allowing or blocking GPTBot on Shopify

The control is robots.txt.liquid. To allow GPTBot, the file should not contain a Disallow rule under GPTBot's user-agent block. To block GPTBot, add a User-agent: GPTBot block with a Disallow rule. To allow GPTBot only on specific paths, add Disallow rules for just the paths you want to keep out. Shopify's editing-robots help page documents the syntax [2].

The most common 2024-era mistake was a blanket block of 'all AI bots' that disabled GPTBot, ClaudeBot, AND retrieval bots like OAI-SearchBot, erasing the store from AI search. The fix is a targeted Disallow rule per bot, not a blanket one.
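To see the difference between a targeted block and a blanket one, the sketch below feeds two robots.txt bodies to Python's standard urllib.robotparser. The store URL and both rule sets are made up for illustration. robotparser only approximates how each crawler reads the file, so treat the result as a sanity check, not a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical URL, used only as something to test the rules against.
PRODUCT_URL = "https://example-store.com/products/widget"

# Targeted: only the training crawler is disallowed.
TARGETED = """\
User-agent: GPTBot
Disallow: /
"""

# Blanket: retrieval bots are swept up together with the training bots.
BLANKET = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
Disallow: /
"""


def allowed(robots_txt: str, bot: str) -> bool:
    """Return True if `bot` may fetch PRODUCT_URL under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, PRODUCT_URL)


for label, rules in (("targeted", TARGETED), ("blanket", BLANKET)):
    for bot in ("GPTBot", "OAI-SearchBot", "ChatGPT-User"):
        print(f"{label:9} {bot:14} allowed={allowed(rules, bot)}")
```

Under the targeted rules only GPTBot is refused. Under the blanket rules all three are refused, which is the failure mode described above.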

GPTBot audit checklist

Five checks. The decision is more philosophical than technical: do you want OpenAI's next-generation models to know about your store, or not? Most stores do; some prestige brands intentionally opt out of the training corpus.

  1. Open the theme code editor. Look for templates/robots.txt.liquid. If it is absent, defaults apply (GPTBot allowed).
  2. If the file is present, search for 'GPTBot' (case-insensitively, since crawlers match user-agent names regardless of case). Note any Disallow rules.
  3. Confirm OAI-SearchBot and ChatGPT-User are NOT blocked (these are the retrieval bots).
  4. Decide: do you want OpenAI's training crawler to see your store? Yes = leave GPTBot allowed. No = add a targeted Disallow.
  5. Verify by fetching the live robots.txt (curl https://yourstore.com/robots.txt) and confirming that the rules match your intent.
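Steps 2 through 5 can be roughly automated. The following is a minimal sketch, assuming a store that opts out of training only. The INTENT mapping, the helper names, and the SAMPLE text (a stand-in for whatever the curl command returns) are all hypothetical. urllib.robotparser is Python's standard parser and may not match every crawler's interpretation exactly.

```python
import re
from urllib.robotparser import RobotFileParser

# Intent from step 4: True = should be allowed, False = should be blocked.
INTENT = {"GPTBot": False, "OAI-SearchBot": True, "ChatGPT-User": True}


def mentions(robots_txt: str, bot: str) -> list[str]:
    """Step 2: User-agent lines naming the bot, matched case-insensitively."""
    pattern = re.compile(
        rf"^\s*user-agent\s*:\s*{re.escape(bot)}\s*$", re.IGNORECASE
    )
    return [ln.strip() for ln in robots_txt.splitlines() if pattern.match(ln)]


def audit(robots_txt: str, intent: dict[str, bool],
          url: str = "https://yourstore.com/") -> dict[str, bool]:
    """Steps 3-5: return the bots whose actual access differs from intent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    actual = {bot: parser.can_fetch(bot, url) for bot in intent}
    return {bot: ok for bot, ok in actual.items() if ok != intent[bot]}


# Stand-in for the live robots.txt body; note the lowercase 'gptbot'.
SAMPLE = """\
User-agent: gptbot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /
"""

print(mentions(SAMPLE, "GPTBot"))
print(audit(SAMPLE, INTENT))
```

An empty result from audit means the live rules match the stated intent. Any bot it returns is either blocked when it should be allowed, or the reverse.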