ClaudeBot is Anthropic's web crawler. If it can't read your site, Claude can't learn about your brand.
One Crawler, Two Jobs
ClaudeBot crawls the public web for both training data and real-time retrieval[1]. Unlike OpenAI, which splits crawling across three separate bots, Anthropic uses a single crawler for everything. Block it and you lose both: representation in future training data and visibility in live answers.
What It Can Read
ClaudeBot is a plain HTTP fetcher. It reads headings, body text, JSON-LD structured data, image alt text, and anchor text from static HTML. It does not execute JavaScript, so content rendered client-side with frameworks like React, Vue, or Angular is invisible to it.
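A minimal sketch of what a non-JavaScript crawler "sees": it parses the raw HTML the server returns and never executes `<script>` blocks, so text a browser would inject at runtime simply isn't there. The page below is hypothetical, and this is an illustration of the general behavior, not ClaudeBot's actual parser.

```python
from html.parser import HTMLParser

# Hypothetical page: static heading and paragraph, plus pricing that
# only appears after a browser runs the script.
PAGE = """
<html><body>
  <h1>Acme Widgets</h1>
  <p>Hand-made widgets since 1999.</p>
  <div id="pricing"></div>
  <script>
    // A browser would inject pricing here; a plain fetcher will not.
    document.getElementById('pricing').textContent = 'From $19/mo';
  </script>
</body></html>
"""

class TextExtractor(HTMLParser):
    """Collects visible text, skipping script contents."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []
    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
    def handle_data(self, data):
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())

extractor = TextExtractor()
extractor.feed(PAGE)
print(extractor.chunks)  # static text only; the '$19/mo' string never appears
```

The heading and paragraph survive; the client-rendered price does not, which is exactly the gap described above.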
robots.txt Control
Block ClaudeBot entirely:

```
User-agent: ClaudeBot
Disallow: /
```
Or allow everything (the default when no rule exists):

```
User-agent: ClaudeBot
Allow: /
```
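You can sanity-check a rule set before deploying it with Python's standard-library robots.txt parser. The rules and URLs below are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ClaudeBot may crawl everything except /private/.
ROBOTS = """\
User-agent: ClaudeBot
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(ROBOTS.splitlines())

print(rp.can_fetch("ClaudeBot", "https://example.com/pricing"))    # True
print(rp.can_fetch("ClaudeBot", "https://example.com/private/x"))  # False
```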
Common Mistakes
- Accidentally blocking ClaudeBot with a `User-agent: *` catch-all rule. Most robots.txt files have one, and if it carries a `Disallow` with no ClaudeBot-specific section to override it, it silently keeps Claude from ever seeing your content.
- Relying on JavaScript to render key content. If your product descriptions, pricing, or FAQs only exist inside a client-side component, ClaudeBot will never see them.
- Writing `User-agent: claude` instead of `User-agent: ClaudeBot`. Rules are matched against the crawler's declared token, so a misspelled or truncated name may be silently ignored.
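The catch-all pitfall is easy to demonstrate with the standard-library parser: a blanket `User-agent: *` disallow shuts ClaudeBot out, and adding an explicit ClaudeBot section restores access. Both robots.txt contents are hypothetical:

```python
from urllib.robotparser import RobotFileParser

BLANKET = "User-agent: *\nDisallow: /\n"
WITH_OVERRIDE = BLANKET + "\nUser-agent: ClaudeBot\nAllow: /\n"

for label, robots in [("blanket", BLANKET), ("override", WITH_OVERRIDE)]:
    rp = RobotFileParser()
    rp.parse(robots.splitlines())
    # The blanket rule blocks ClaudeBot; the explicit section wins over *.
    print(label, rp.can_fetch("ClaudeBot", "https://example.com/"))
```

A crawler-specific group takes precedence over the `*` group, which is why the override works without touching the catch-all rule.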
Making Your Site Readable
ClaudeBot only sees what the server sends in the initial HTML response. Your most important content, including product names, pricing, and company description, needs to exist as static HTML rather than hydrated client-side.
JSON-LD structured data gives ClaudeBot explicit entity relationships it can parse without guessing. An llms.txt file goes further by providing the crawler a machine-readable map of what your site is and what matters most.
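For illustration, a minimal JSON-LD block for a hypothetical company might look like this (the organization, URL, and description are placeholders, not a prescribed schema):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Widgets",
  "url": "https://example.com",
  "description": "Hand-made widgets since 1999."
}
</script>
```

And a sketch of an llms.txt file, served at the site root, might be as simple as:

```
# Acme Widgets
> Hand-made widgets since 1999.

## Key pages
- [Pricing](https://example.com/pricing): plans and costs
- [FAQ](https://example.com/faq): common questions
```

Both live in the static response, so a plain fetcher can read them without executing anything.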
How Site Scanner Helps
Site Scanner evaluates your site against the signals ClaudeBot uses: static HTML content, structured data, robots.txt configuration, and page speed. See also How Agent Crawlers Work.