Every page on the web contains information. Structured data is what turns that information into facts an agent can act on. Without it, agents guess from unstructured HTML. With it, facts are labeled, typed, and unambiguous.
The Format That Won
Structured data uses the Schema.org vocabulary[1], embedded as JSON-LD in a script block, usually in the page head. Google explicitly recommends JSON-LD over Microdata and RDFa[2]. JSON-LD won because it lives separately from your visible HTML, which means you can add, update, or generate it without reworking the markup around your content.
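As a minimal sketch of what "a script block" means in practice, the snippet below serializes a plain dictionary into a JSON-LD tag. The function name `render_jsonld_script` and the organization values are placeholders invented for illustration, not part of any library.

```python
import json


def render_jsonld_script(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> tag (illustrative helper)."""
    payload = json.dumps(data, indent=2)
    # Escape "<" so a string value containing "</script>" cannot close the tag early.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values; swap in your own organization details.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
}

print(render_jsonld_script(organization))
```

Because the output is a single self-contained string, it can be dropped into a page by whatever templating layer you already use.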
Why Agents Need It
Without structured data, an agent reading your product page has to guess what "$149" means. Is it the current price? A sale price? A shipping fee? With Product schema, the agent knows the exact price, currency, availability, rating, and brand with certainty. That certainty is the difference between your product being recommended accurately and being skipped entirely.
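To make the "$149" example concrete, here is a rough sketch of Product markup in which the price, currency, and availability are each labeled explicitly. The helper `build_product_jsonld` and the sample product are hypothetical; the property names (`offers`, `price`, `priceCurrency`, `availability`) come from the Schema.org vocabulary.

```python
from decimal import Decimal


def build_product_jsonld(name: str, brand: str, price: Decimal,
                         currency: str, in_stock: bool) -> dict:
    """Build a Product dict with a single Offer (illustrative helper)."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            # Price is a plain number string; the currency is stated separately.
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }


# Sample values invented for this example.
product = build_product_jsonld("Trail Backpack", "Example Co",
                               Decimal("149"), "USD", True)
```

With this in place, "149.00" is unambiguously the offer price in USD rather than a number the agent has to interpret from surrounding text.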
Google reports that pages with valid structured data see up to 30% higher click-through rates in rich results[2]. For agents, the impact is even more binary: either the agent has the facts or it doesn't.
The Schema Types That Matter Most
- Organization is table stakes. Every site should declare its name, logo, URL, and contact info[3]. Without it, agents may confuse your company with similarly named entities.
- Product is where structured data earns its keep: name, image, brand, SKU, offers (price and availability), and aggregate ratings[4]. If you sell anything online and don't have Product schema, agents are guessing about your catalog.
- Article markup (headline, author, datePublished, publisher) tells agents whether your content is current and authoritative. A 2024 article without datePublished looks the same as a 2018 article to an agent.
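- FAQPage gives agents structured Q&A pairs they can cite directly[5]. Google now shows FAQ rich results only for a limited set of government and health sites, so for most sites the value lies in the structured pairs rather than in the search listing.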
- BreadcrumbList communicates your site hierarchy, helping agents understand where a page sits in your overall structure.
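Of the types above, BreadcrumbList has the least obvious shape, so a short sketch may help. It assumes you already have the trail as an ordered list of (name, URL) pairs; the function name and the sample URLs are made up for illustration.

```python
def build_breadcrumb_jsonld(trail: list[tuple[str, str]]) -> dict:
    """Turn an ordered list of (name, url) pairs into a BreadcrumbList dict."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            # Positions are 1-based and follow the order of the trail.
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }


# Placeholder hierarchy: Home > Bags > Trail Backpack
breadcrumbs = build_breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("Bags", "https://example.com/bags"),
    ("Trail Backpack", "https://example.com/bags/trail-backpack"),
])
```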
How to Validate
Two tools cover the bases. The Google Rich Results Test[6] confirms what Google can consume, while the Schema.org Validator[7] checks full spec compliance. Run both. Google's test only validates the subset it supports; the Schema.org validator catches issues Google ignores but other agents may not.
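Before running either tool, it can be worth confirming that the JSON-LD on a page parses at all. The sketch below is one way to do that with Python's standard library; it is a local pre-check under simple assumptions, not a substitute for the validators, and the names `JsonLdExtractor` and `extract_jsonld` are invented here.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of every application/ld+json script block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list = []        # successfully parsed JSON-LD values
        self.errors: list[str] = []   # parse error messages, one per bad block

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


def extract_jsonld(html: str):
    """Return (parsed_blocks, error_messages) for an HTML document."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks, parser.errors
```

A block that lands in `errors` is malformed JSON and will not be read by any consumer, so it is worth fixing before checking property-level issues.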
Common Mistakes
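The most common failure is missing required properties. Schema.org itself marks no property as mandatory, so a Product without an offers block still parses, but it fails Google's requirements for merchant listings and gives agents no price or availability to work with. Many consumers will ignore it. Google's documentation lists required vs recommended properties for every type it supports.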
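A check for missing properties can be sketched in a few lines. The `REQUIRED_PROPERTIES` map below is a simplified placeholder chosen for illustration; it is not Google's actual requirement list, which should be taken from the documentation for each type.

```python
# Hypothetical, simplified map of type -> properties to insist on.
# Replace with the requirements documented for the consumers you target.
REQUIRED_PROPERTIES = {
    "Product": ["name", "offers"],
    "Article": ["headline", "datePublished"],
}


def missing_properties(data: dict) -> list[str]:
    """Return the required properties that are absent or empty in `data`."""
    schema_type = data.get("@type")
    if not isinstance(schema_type, str):
        return []  # multi-typed or untyped nodes are out of scope for this sketch
    required = REQUIRED_PROPERTIES.get(schema_type, [])
    return [prop for prop in required if not data.get(prop)]


# A Product with no offers block is reported as incomplete.
print(missing_properties({"@type": "Product", "name": "Trail Backpack"}))
```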
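Mismatched content between JSON-LD and visible HTML is another common problem. If your JSON-LD says "$149" but the page shows "$129" during a sale, agents and search engines can detect the inconsistency. Google's guidelines require markup to reflect what is visible on the page, and a mismatch can cost you eligibility for rich results.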
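One way to catch this kind of drift is to compare the declared price with the prices that appear in the page's visible text. The sketch below assumes a single Offer, dollar-formatted prices without thousands separators, and that you have already extracted the visible text; the function names are made up for this example.

```python
import re
from decimal import Decimal

# Assumption: prices look like "$149" or "$149.00" (no thousands separators).
PRICE_PATTERN = re.compile(r"\$\s?(\d+(?:\.\d{1,2})?)")


def visible_prices(text: str) -> set[Decimal]:
    """Collect every dollar amount found in the visible page text."""
    return {Decimal(match) for match in PRICE_PATTERN.findall(text)}


def price_matches_page(jsonld: dict, visible_text: str) -> bool:
    """True if the JSON-LD offer price also appears in the visible text."""
    offer = jsonld.get("offers") or {}
    if "price" not in offer:
        return False
    declared = Decimal(str(offer["price"]))
    return declared in visible_prices(visible_text)


markup = {"@type": "Product", "offers": {"@type": "Offer", "price": "149.00"}}
print(price_matches_page(markup, "Sale! Now only $129"))
```

Comparing as `Decimal` values rather than strings means "149" and "149.00" are treated as the same price.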
Stale data is equally dangerous. If your structured data is hardcoded and not regenerated when prices, availability, or dates change, agents serve outdated facts confidently, which is worse than no data at all.
How Site Scanner Helps
Site Scanner checks for structured data presence and validity as part of its Discoverability audit, flagging missing schemas, invalid properties, and mismatches with visible content.