robots.txt.liquid · AI bots

GPTBot, ClaudeBot, PerplexityBot, Google-Extended on Shopify

AI bots crawl Shopify stores by default. Shopify's Crawling-your-store doc says so verbatim: "Your store can be indexed by search engines and large language models (LLMs) without signatures" [2]. The only Shopify-native lever for changing this is robots.txt.liquid [1]. Four user-agent strings cover most of the 2026 traffic: GPTBot (OpenAI training), ClaudeBot (Anthropic training), PerplexityBot (Perplexity citation), and Google-Extended (Google AI opt-out without breaking Search). Blocking each has a real AI shopping cost, and this article names it.

Published Verified 2026-05-22

The default: AI bots crawl unless you block them

Shopify's default robots.txt does not block AI training bots. GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and similar user-agents are allowed by default the same way bingbot and Googlebot are. The platform's stance is opt-out, not opt-in. To block any AI bot, the merchant must create robots.txt.liquid and add an explicit per-user-agent Disallow rule. This is consistent with Shopify's broader 'platform handles the safe defaults, merchant handles the strategic decisions' philosophy.

The implication for stores that have not touched robots.txt.liquid: every AI bot that respects robots.txt is currently crawling your storefront. Every product page, collection page, blog post, and policy page is being read by AI training crawlers and citation crawlers. For most Shopify stores in 2026 this is the desired outcome — AI shopping channels are the new traffic surface, and visibility there depends partly on the AI vendors having crawled the storefront.

The four AI crawler user-agents that matter in 2026

Four user-agents cover ~90% of AI crawler traffic on Shopify stores. (1) GPTBot: OpenAI's training crawler. Distinct from OAI-SearchBot (which powers ChatGPT search) and ChatGPT-User (the on-demand fetcher for user-initiated requests in ChatGPT). Blocking GPTBot does not block ChatGPT's ability to cite your store via OAI-SearchBot. (2) ClaudeBot: Anthropic's web crawler used for training. Honours robots.txt. (3) PerplexityBot: Perplexity's citation crawler, used to discover and index pages cited in Perplexity's answers. (4) Google-Extended: Google's robots.txt control token for opting out of AI training and grounding (Gemini, formerly Bard, and Vertex AI) without affecting Search indexing. It is a product token rather than a separate crawler: Google fetches pages with its existing user agents and consults Google-Extended rules only to decide AI use. This is the cleanest 'have it both ways' option in the matrix.

The OpenAI distinction is worth understanding in particular: OpenAI runs at least three crawlers. GPTBot is the training crawler (the one most "block AI training" guides target). OAI-SearchBot is the search-indexing crawler that powers ChatGPT's web search and shopping results. ChatGPT-User is the on-demand fetcher that retrieves a page when a ChatGPT user's request triggers it. Each is a separate user-agent string and each has different SEO/GEO implications. Blocking GPTBot but allowing OAI-SearchBot is the common compromise: opt out of training but stay visible in ChatGPT shopping [3].
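The GPTBot-only compromise can be sketched as a single extra group in robots.txt.liquid. This is a minimal sketch, assuming the lines sit below Shopify's default-groups loop. It relies on the standard robots.txt behaviour that a crawler with no group of its own falls back to the "User-agent: *" rules, so OAI-SearchBot and ChatGPT-User need no entry.

```liquid
{% comment %} Sketch: opt out of OpenAI training only. Place below the default-groups loop. {% endcomment %}

# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# No groups for OAI-SearchBot or ChatGPT-User: they keep following
# the default "User-agent: *" rules that Shopify generates.
```

An explicit "Allow: /" group for OAI-SearchBot is deliberately left out. A crawler that finds a group addressed to it typically obeys only that group, so such an entry would also lift Shopify's default Disallow rules for that bot.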

The decision matrix — block, allow, or selective

Three positions cover most stores. (1) Allow all — the default. Best for stores chasing AI shopping visibility (most Shopify stores in 2026). Catalog eligibility plus AI bot access maximizes the chance products surface in ChatGPT, Perplexity, Gemini, and Copilot. (2) Block training, allow shopping — block GPTBot, ClaudeBot, Google-Extended (training-focused user-agents) while leaving OAI-SearchBot, Googlebot, and PerplexityBot (citation-focused) allowed. Useful for brands sensitive to AI training but still wanting AI shopping visibility. (3) Block all — block every named AI user-agent. Almost never the right call on an ecommerce Shopify store; sometimes justified on editorial brands with proprietary content.

The honest assessment: position (1) is correct for ~95% of Shopify stores. The AI shopping channels are an emerging traffic stream the merchant cannot afford to opt out of, and the marginal damage of being in OpenAI's training corpus is small for ecommerce content (product descriptions, specifications, policies). Position (2) is a middle ground for brands that publicly advocate for AI opt-out as a brand stance. Position (3) is almost always the wrong call — it forfeits AI shopping visibility without meaningful upside.
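Position (2) could look like the following in robots.txt.liquid. This is a sketch only, assuming it is placed below the default-groups loop. training_bots is a hypothetical variable name, and the list should be edited to match the user-agents you actually want to opt out of.

```liquid
{% comment %} Hypothetical helper list: training-focused user-agents to block {% endcomment %}
{% assign training_bots = 'GPTBot,ClaudeBot,Google-Extended' | split: ',' %}
{% for bot in training_bots %}
User-agent: {{ bot }}
Disallow: /
{% endfor %}
```

The citation-focused bots (OAI-SearchBot, PerplexityBot) get no group here, so they continue under the default rules. Moving to position (3) would mean adding them to the same list.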

How to add per-bot rules in robots.txt.liquid

Add per-bot Disallow rules below Shopify's default-groups loop in robots.txt.liquid. Never replace the default loop, because that disconnects the store from Shopify's automatic default-rule updates. The pattern: keep the default loop intact, then append User-agent / Disallow blocks for each bot you want to block. Validate the rendered /robots.txt in a private browser window before publishing the change to the live theme.

liquid robots.txt.liquid with custom AI-bot blocking rules
{%- comment -%} Layer custom AI-bot rules on top of the default-groups loop {%- endcomment -%}
{% for group in robots.default_groups %}
  {{- group.user_agent }}

  {%- for rule in group.rules -%}
    {{ rule }}
  {%- endfor -%}

  {%- if group.sitemap != blank -%}
    {{ group.sitemap }}
  {%- endif -%}
{% endfor %}

{% comment %} Custom rules below the default loop {% endcomment %}

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Google-Extended
Disallow: /

Per Shopify Dev Docs, the default-groups loop is the recommended starting point because "the default rules are updated regularly to ensure that SEO best practices are always applied." Layering custom rules below the loop preserves the update path; replacing the loop forfeits it.

The AI shopping cost of blocking

Blocking AI bots has a real, asymmetric cost in 2026. Shopify's Catalog feed reaches AI channels via two paths: (1) direct API integration (ChatGPT shopping, Perplexity shopping, Gemini, Copilot, Shop) and (2) the AI bots' independent crawl of your storefront for context enrichment. Blocking the bots breaks path (2) but not path (1) — your products still appear in AI shopping recommendations if eligible, but the AI engines have less context (no policy data, no editorial copy, no FAQ pages) to reason about them. The result: fewer citations, weaker product summaries, and reduced placement in conversational shopping flows.

The strategic frame: AI shopping visibility is not just about Catalog eligibility. The AI engines need to understand your store to recommend it well, and understanding comes from crawling the storefront content. Blocking the crawlers limits understanding without saving the products from the recommendation surface entirely. It's an asymmetric loss.

For the GEO-side complement to this article — what each AI bot does after crawling and how to influence the resulting citations — see /shopify-ai-search/ai-crawlers/. For the broader robots.txt.liquid cluster, see the hub.