The Most Important Thing to Understand About Schema in 2026

I want to start with a sentence that will save you from half the bad advice circulating right now: AI engines don't have a special JSON-LD parser. They read your schema as text on the page.

This was confirmed in late 2025 when independent researchers tested how ChatGPT, Claude, and Perplexity handled pages with structured data. The engines don't extract JSON-LD into a separate structured field. They concatenate it with the visible HTML, hand the whole blob to the model, and let the model figure out what's relevant. Search Engine Roundtable covered this in a widely read piece, and Google's own John Mueller confirmed a similar treatment at the SEO meetups in early 2026.

Why does this matter? Because the entire "add schema for AI" playbook changes. You're not feeding a typed database. You're writing prose — in a structured format — that gets reunited with your visible content. The implications:

Schema that contradicts the visible page is worse than no schema at all. The model sees two conflicting accounts and loses trust in both.
Schema that duplicates visible content helps, because it reinforces the key facts the model is trying to extract.
Schema that adds context the page lacks (e.g., marking up pricing that appears nowhere else on the page) gets heavily discounted, because the model weighs agreement across sources.
The 2019 practice of cramming every possible schema type onto every page is actively harmful in 2026. Irrelevant schema is noise, and AI engines can detect when the schema doesn't match the page's actual purpose.
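
The agreement rule above can be checked mechanically before you ship. What follows is a minimal sketch, not how any engine actually works: it pulls the JSON-LD and the visible text out of one HTML string using only the Python standard library, then lists schema values that never appear in the visible text. The names PageReader, walk, and unsupported_values are my own, and the match is a naive substring test (so "19.00" in schema will not match "$19" on the page unless you normalize numbers first).

```python
import json
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Collects parsed JSON-LD blocks and visible text from one HTML document."""

    def __init__(self):
        super().__init__()
        self.jsonld = []         # parsed JSON-LD objects
        self.visible = []        # visible text fragments
        self._in_jsonld = False
        self._skip = False       # inside a non-JSON-LD <script> or a <style>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False
        elif tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)
        elif not self._skip and data.strip():
            self.visible.append(data.strip())


def walk(node, key=None):
    """Yield (key, value) for every string leaf in a JSON-LD structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            yield from walk(v, k)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item, key)
    elif isinstance(node, str):
        yield key, node


def unsupported_values(html, keys=("name", "price")):
    """Return (key, value) pairs from the schema that the visible text never mentions."""
    reader = PageReader()
    reader.feed(html)
    reader.close()
    page_text = " ".join(reader.visible).lower()
    return [
        (k, v)
        for block in reader.jsonld
        for k, v in walk(block)
        if k in keys and v.lower() not in page_text
    ]
```

Anything this returns is a claim the schema makes that the page itself does not back up, which is exactly the case the rules above warn about.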

Now that we've cleared the ground: here are the 12 schema types that actually move citation rates, ranked by how universally useful they are, with copy-paste JSON-LD for each. Implement the 3–5 that match your page. Skip the rest.

The citation math you're playing for

71% of ChatGPT-cited pages include structured data (2026 AI citation audits)
65% of pages cited in Google's AI Mode include structured data
~2.5x higher probability of appearing in an AI-generated answer for pages with correct, consistent schema vs. pages without
0x impact from schema that contradicts visible content, per internal tests across multiple engines

1. Organization Schema (Every Page, Root Domain)

If you implement only one schema type, make it this one. Organization schema tells AI engines who you are as a company — your name, logo, social profiles, and the key relationship signals they use when deciding whether to cite you.

Place this on your homepage at minimum, ideally sitewide via a layout template:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Roast.page",
  "alternateName": "roast.page",
  "url": "https://roast.page",
  "logo": "https://roast.page/icon.svg",
  "description": "Landing page analysis tool. Paste a URL, get an 8-dimension expert review in 30 seconds.",
  "foundingDate": "2026-01",
  "sameAs": [
    "https://x.com/roastpage",
    "https://www.linkedin.com/company/roastpage"
  ]
}
</script>

The sameAs array is the single most underrated field. It gives AI engines a disambiguation chain: "roast.page" might be confused with other things named roast, but if sameAs points to verified X (formerly Twitter) and LinkedIn accounts, the entity is unambiguous. Add every official social profile you own.

Common mistakes

Using the full legal company name in name ("Roast.page, Inc.") when your public brand is the shorter form. Match the brand users would search.
Missing logo (AI engines use this to render your brand in shopping and comparison panels).
Multiple conflicting Organization markups on different pages. Pick one canonical definition.
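
One way to avoid conflicting definitions is to keep a single Organization object and render it from the shared layout. This is a rough sketch under my own assumptions: render_jsonld and ORGANIZATION are hypothetical names, and how you call the helper depends on your template engine. The only non-obvious step is escaping "</" so that no string value can close the script element early.

```python
import json


def render_jsonld(data):
    """Serialize a dict as a JSON-LD <script> tag for a shared layout template."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape, so this keeps the payload parseable
    # while preventing a value like "</script>" from ending the tag.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical single source of truth, imported by every page template.
ORGANIZATION = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Roast.page",
    "url": "https://roast.page",
    "logo": "https://roast.page/icon.svg",
    "sameAs": [
        "https://x.com/roastpage",
        "https://www.linkedin.com/company/roastpage",
    ],
}

print(render_jsonld(ORGANIZATION))
```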

2. WebSite Schema (Homepage Only, with Sitelinks Search)

A lightweight markup that does two jobs: declares the site's preferred name, and (optionally) exposes a search URL template so AI engines can deep-link users into your search. Only put this on the homepage.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Roast.page",
  "url": "https://roast.page",
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://roast.page/blog?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}
</script>

If your site doesn't have internal search, drop the potentialAction block entirely. Don't fake it by pointing to Google's site-search (that's a common anti-pattern that AI engines recognize as a gaming attempt).

3. Product Schema (Every Product or SaaS Pricing Page)

This is the single highest-leverage schema for anyone selling something. Product schema is the entry ticket for being included in AI-generated shopping and tool comparisons. Without it, you're effectively opting out of the "best X for Y" queries that Perplexity and ChatGPT now answer directly.

The required fields for being cited in comparisons are more specific than you might think. Google's Merchant Listing requirements are a good floor; AI engines are more forgiving about missing images but stricter about offer consistency.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Roast.page Pro",
  "description": "Unlimited landing page analyses across all domains. 8-dimension expert review, AI-generated priority fixes, shareable reports.",
  "brand": {
    "@type": "Brand",
    "name": "Roast.page"
  },
  "image": "https://roast.page/og-default.png",
  "offers": {
    "@type": "Offer",
    "url": "https://roast.page/pricing",
    "priceCurrency": "USD",
    "price": "19.00",
    "priceSpecification": {
      "@type": "UnitPriceSpecification",
      "price": "19.00",
      "priceCurrency": "USD",
      "billingIncrement": 1,
      "unitText": "MONTH"
    },
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "247"
  }
}
</script>

The rules nobody tells you

The price in the schema must match the price shown on the page. If your hero says "$19/mo" and your schema says "$29/mo", Google flags it as misleading and AI engines drop you from comparisons.
aggregateRating must be based on real, visible reviews. If your page doesn't display reviews, don't mark up ratings — this is the most common spam pattern and it's penalized.
SaaS is a Product. A lot of B2B companies think Product schema is only for physical goods. It isn't. SoftwareApplication (below) is more specific to software, but Product works for anything you sell.
One Product per page. If you have a pricing page with three tiers, use a single Product with multiple Offers, not three separate Product blocks. Multiple Product blocks confuse AI comparisons.
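
The "one Product, multiple Offers" rule is easy to get wrong by hand, so here is a small sketch that builds the markup from a tier list. The function name build_product and the "Team" tier with its price are invented for illustration; the only tier the article itself mentions is the $19 Pro plan. Prices are formatted once, here, so that the same strings can be reused in the visible pricing table.

```python
import json


def build_product(name, description, pricing_url, tiers):
    """Build one Product node with one Offer per pricing tier.

    `tiers` is a list of (tier_name, monthly_price_usd) tuples.
    """
    offers = [
        {
            "@type": "Offer",
            "name": tier_name,
            "url": pricing_url,
            "price": f"{price:.2f}",
            "priceCurrency": "USD",
            "availability": "https://schema.org/InStock",
        }
        for tier_name, price in tiers
    ]
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "offers": offers,
    }


product = build_product(
    "Roast.page",
    "Landing page analysis tool.",
    "https://roast.page/pricing",
    [("Free", 0), ("Pro", 19), ("Team", 49)],  # "Team" is a made-up tier
)
print(json.dumps(product, indent=2))
```

The result is a single Product block whose offers array has one entry per tier, rather than three competing Product blocks.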

4. SoftwareApplication Schema (SaaS Homepages and Product Pages)

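A dedicated type for software. In the schema.org hierarchy it sits under CreativeWork rather than Product, but it accepts offers and aggregateRating in the same way. Worth using in addition to or instead of generic Product for SaaS/apps. Adds fields that AI engines pull for tool comparisons (operating system, application category, feature list).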

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Roast.page",
  "applicationCategory": "BusinessApplication",
  "applicationSubCategory": "Marketing Analytics",
  "operatingSystem": "Web",
  "description": "AI-powered landing page analysis. Paste a URL, get an 8-dimension expert review in 30 seconds.",
  "offers": {
    "@type": "Offer",
    "price": "0",
    "priceCurrency": "USD"
  },
  "featureList": [
    "8-dimension landing page scoring",
    "AI-generated priority fixes",
    "PageSpeed and Core Web Vitals integration",
    "Industry-aware analysis",
    "Shareable reports"
  ]
}
</script>

The featureList is one of the only places in schema where free-text matters for AI citations. These feature strings get pulled verbatim into comparison tables on Perplexity and the ChatGPT shopping interface. Write them the way you want to be quoted.

5. FAQPage Schema (The Single Highest-ROI Type for AI Answers)

In 2023 Google restricted FAQ rich results to well-known government and health sites, and a lot of SEO blogs took that as a reason to stop using FAQPage schema. That's the wrong lesson. AI engines still parse it, and they parse it eagerly, because FAQPage is already in the exact question-answer format that fine-tuned answer models were trained on.

If your landing page has a recurring-questions section, mark it up. If it doesn't, add one. FAQs are the highest-ROI schema you can implement for AI citations.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How does roast.page analyze a landing page?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Roast.page takes a full-page screenshot, scrapes the HTML, runs Google PageSpeed Insights, then feeds everything to Claude. The model scores 8 dimensions (hero, copy, CTA, trust, design, flow, SEO, differentiation) and produces specific, prioritized fixes."
      }
    }
  ]
}
</script>

Rules for FAQ schema that actually gets cited

Real questions real users ask. Mine your support emails, Reddit threads about your space, and Google "People Also Ask" boxes. Fabricated questions rarely get cited.
First sentence of the answer is the whole answer. AI engines pull the first chunk. Front-load the complete answer, then add context.
80–250 words per answer. Shorter and the model can't extract useful context. Longer and it gets chunked mid-sentence.
Questions must be on the page, not just in schema. The schema has to mirror visible FAQ content. Ghost-FAQs are spam.
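
Two of these rules (answer length, and questions being visible on the page) are mechanical enough to lint. A rough sketch, assuming you already have the FAQPage node as a Python dict and the page's visible text as a string; check_faq is a hypothetical helper, and the 80 and 250 defaults are simply the numbers from the list above.

```python
def check_faq(faq_schema, visible_text, min_words=80, max_words=250):
    """Return a list of human-readable problems found in a FAQPage node."""
    problems = []
    # Collapse whitespace and case so line breaks in the HTML don't cause misses.
    page = " ".join(visible_text.lower().split())
    for item in faq_schema.get("mainEntity", []):
        question = item.get("name", "")
        answer = item.get("acceptedAnswer", {}).get("text", "")
        words = len(answer.split())
        if " ".join(question.lower().split()) not in page:
            problems.append(f"not visible on page: {question}")
        if words < min_words:
            problems.append(f"answer too short ({words} words): {question}")
        elif words > max_words:
            problems.append(f"answer too long ({words} words): {question}")
    return problems
```

An empty list means the markup passes both checks. It says nothing about whether the questions are ones real users ask, which still needs a human.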

6. Article Schema (Every Blog Post and Long-Form Page)

For any piece of editorial content — blog posts, guides, research reports — Article schema is how you tell AI engines "this is a specific, dated, authored piece of content you can cite with confidence." The critical field for AI is datePublished: engines discount undated content when picking between sources.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Schema Markup for AI Search: The 12 Types That Actually Move Citations in 2026",
  "description": "The 12 schema types that influence citations in ChatGPT, Claude, and Perplexity, with copy-paste JSON-LD.",
  "datePublished": "2026-04-21",
  "dateModified": "2026-04-21",
  "author": {
    "@type": "Organization",
    "name": "Roast.page",
    "url": "https://roast.page"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Roast.page",
    "logo": {
      "@type": "ImageObject",
      "url": "https://roast.page/icon.svg"
    }
  },
  "wordCount": 3600,
  "articleSection": "SEO",
  "keywords": "schema markup, AI search, AEO, GEO"
}
</script>

Use BlogPosting for blog posts and NewsArticle for timely news. Use Article as the generic fallback. ScholarlyArticle exists but AI engines don't weight it specially — don't use it unless you're actually publishing academic work.

The dateModified trap

If you update an article, update dateModified honestly. Do not mass-update dateModified across your archive to "refresh" content — AI engines cross-check the modification date against actual content change, and freshness-gamed pages get discounted. It's one of the few schema games that's clearly losing in 2026.
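
If your CMS stamps dateModified on every save, one way to stay honest is to bump it only when the body text actually changes. This is a sketch of that idea, not a description of how any engine detects gaming: content_fingerprint and next_date_modified are hypothetical helpers, and they assume you store the previous fingerprint next to the article.

```python
import hashlib


def content_fingerprint(body_text):
    """Hash the article body with whitespace normalized, so reflowed text is not a 'change'."""
    normalized = " ".join(body_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_date_modified(stored_fingerprint, stored_date, body_text, today):
    """Return (fingerprint, dateModified), bumping the date only on a real edit."""
    fingerprint = content_fingerprint(body_text)
    if fingerprint == stored_fingerprint:
        return fingerprint, stored_date
    return fingerprint, today
```

Re-saving an unchanged post keeps its old date; editing a sentence moves it to today.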

7. HowTo Schema (Step-by-Step Tutorials)

Deprecated for rich results by Google in 2023, but still one of the most extractable formats for AI engines. If your content is genuinely a step-by-step tutorial — installation, recipe, configuration — HowTo is high-leverage. If your content is a listicle pretending to be a tutorial, skip this.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to add AI-aware channel groups in GA4",
  "totalTime": "PT15M",
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Open GA4 Admin → Data Display → Channel Groups",
      "text": "In your GA4 property, click the admin gear in the bottom left, then navigate to Data Display, then Channel Groups."
    }
  ]
}
</script>

Each step should have a name (short, scannable) and a text (full explanation). AI engines often pull the names to build a numbered list, then pull one or two of the texts for detail.
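
Hand-numbering positions is where HowTo markup tends to drift out of sync with the page. A small sketch that derives the positions from the order of the steps; build_howto is a hypothetical helper, and the second step in the usage example is invented purely to show the numbering.

```python
def build_howto(name, total_time, steps):
    """Build a HowTo node; `steps` is an ordered list of (name, text) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "totalTime": total_time,
        "step": [
            {"@type": "HowToStep", "position": i, "name": step_name, "text": text}
            for i, (step_name, text) in enumerate(steps, start=1)
        ],
    }


howto = build_howto(
    "How to add AI-aware channel groups in GA4",
    "PT15M",
    [
        (
            "Open GA4 Admin → Data Display → Channel Groups",
            "In your GA4 property, click the admin gear in the bottom left, "
            "then navigate to Data Display, then Channel Groups.",
        ),
        # Invented second step, for illustration only.
        ("Create a new channel group", "Click the create button and name the group."),
    ],
)
```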

8. BreadcrumbList (Every Non-Homepage)

Small, boring, high-ROI. BreadcrumbList tells AI engines where a page sits in your site's hierarchy — which helps them contextualize the page during citation and sometimes adds a "part of the [parent category] series" note in the answer.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://roast.page" },
    { "@type": "ListItem", "position": 2, "name": "Blog", "item": "https://roast.page/blog" },
    { "@type": "ListItem", "position": 3, "name": "Schema Markup for AI Search", "item": "https://roast.page/blog/schema-markup-ai-search-2026" }
  ]
}
</script>
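
Since breadcrumbs follow the URL structure, they are a natural thing to generate in a template. A minimal sketch, assuming each page knows its trail as (name, path) pairs from the root down; build_breadcrumbs is a hypothetical helper, not part of any framework.

```python
def build_breadcrumbs(base_url, trail):
    """Build a BreadcrumbList from (name, path) pairs ordered root to current page."""
    base = base_url.rstrip("/")
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": i,
                "name": name,
                # "/" collapses to the bare domain, other paths lose any trailing slash
                "item": base + path.rstrip("/"),
            }
            for i, (name, path) in enumerate(trail, start=1)
        ],
    }


crumbs = build_breadcrumbs(
    "https://roast.page",
    [
        ("Home", "/"),
        ("Blog", "/blog"),
        ("Schema Markup for AI Search", "/blog/schema-markup-ai-search-2026"),
    ],
)
```

This trail reproduces the three ListItem entries shown in the example above.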

9. Review and AggregateRating

Marked up within a Product, LocalBusiness, or SoftwareApplication. AI engines weight aggregate ratings heavily in comparison queries ("best X for Y"). The threshold for being included in an AI-generated top-5 is usually ≥4.0 stars with ≥10 reviews. Below that, you're typically omitted even if your content is strong.

"aggregateRating": {
  "@type": "AggregateRating",
  "ratingValue": "4.8",
  "reviewCount": "247",
  "bestRating": "5",
  "worstRating": "1"
}

Individual reviews can also be marked up with full Review schema, but aggregateRating is the one that moves citations. Do not mark up ratings you don't actually display on the page. AI engines cross-check.
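
The safest way to keep the rating honest is to compute it from the same reviews you render on the page, and to emit nothing when there are none. A short sketch; build_aggregate_rating is a hypothetical helper and the 1 to 5 scale is an assumption you should match to your own review widget.

```python
def build_aggregate_rating(scores, best=5, worst=1):
    """Return an AggregateRating node, or None when there are no reviews to back it."""
    if not scores:
        return None
    average = sum(scores) / len(scores)
    return {
        "@type": "AggregateRating",
        "ratingValue": f"{average:.1f}",
        "reviewCount": str(len(scores)),
        "bestRating": str(best),
        "worstRating": str(worst),
    }
```

When the function returns None, leave the aggregateRating key out of the parent Product or SoftwareApplication entirely rather than writing an empty or placeholder rating.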

10. LocalBusiness (Brick-and-Mortar or Service-Area Businesses)

If you operate locally, this is non-negotiable. AI engines pull LocalBusiness data directly into location-aware queries ("best dentist near me", "coffee roaster in Louisville"). Without it, you might as well not exist for these queries — Google Maps data fills the gap, but you lose the "cited source" credit that drives organic follow-up.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Your Business Name",
  "image": "https://yoursite.com/storefront.jpg",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main St",
    "addressLocality": "Louisville",
    "addressRegion": "KY",
    "postalCode": "40202",
    "addressCountry": "US"
  },
  "telephone": "+1-502-555-0100",
  "url": "https://yoursite.com",
  "priceRange": "$$",
  "openingHoursSpecification": [
    {
      "@type": "OpeningHoursSpecification",
      "dayOfWeek": ["Monday","Tuesday","Wednesday","Thursday","Friday"],
      "opens": "09:00",
      "closes": "18:00"
    }
  ]
}
</script>

Use a more specific type if one applies: Restaurant, Dentist, LawFirm, AutoRepair, etc. The specific subtypes trigger category-specific answer modes in some engines.

11. Person Schema (Author Pages, Founder Pages)

Underused, especially by founders. AI engines are increasingly building entity graphs of people, and having a Person schema on your about page or author bio page connects you as an entity that can be cited on behalf of your company. This matters when AI answers quote you by name.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Varun Khanna",
  "jobTitle": "Founder",
  "worksFor": {
    "@type": "Organization",
    "name": "Roast.page"
  },
  "url": "https://roast.page/about",
  "sameAs": [
    "https://x.com/varunkhanna",
    "https://www.linkedin.com/in/varunkhanna"
  ]
}
</script>

As with Organization, the sameAs array is the most important field — it's the disambiguation signal. Without it, "Varun Khanna" is ambiguous. With verified social links, the entity is clear.

12. VideoObject (Any Page With Embedded Video)

Worth doing for any landing page that includes a demo video, product walkthrough, or explainer. AI engines increasingly pull video transcripts and titles into their answers, and VideoObject schema tells them what the video is about before they run heavier transcription.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "Roast.page 90-Second Product Demo",
  "description": "Quick walkthrough of the roast.page analysis flow: paste URL, wait 30 seconds, see the 8-dimension report.",
  "thumbnailUrl": "https://roast.page/demo-thumb.jpg",
  "uploadDate": "2026-03-10",
  "duration": "PT1M30S",
  "contentUrl": "https://roast.page/demo.mp4"
}
</script>

If the video is hosted on YouTube, you can usually skip your own VideoObject schema, because YouTube provides it. But if it's self-hosted, add it. Mux-hosted video, Loom embeds, Wistia embeds: add it.
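
The duration field (and totalTime in HowTo) uses ISO 8601 durations such as PT1M30S, which are easy to mistype. A small helper that formats whole seconds into that shape; iso8601_duration is my own name for it, and it deliberately ignores days and fractional seconds.

```python
def iso8601_duration(total_seconds):
    """Format a non-negative number of whole seconds as an ISO 8601 duration."""
    if total_seconds < 0:
        raise ValueError("duration cannot be negative")
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = "PT"
    if hours:
        parts += f"{hours}H"
    if minutes:
        parts += f"{minutes}M"
    # Always emit at least one component, so zero becomes "PT0S".
    if seconds or parts == "PT":
        parts += f"{seconds}S"
    return parts


print(iso8601_duration(90))   # the 90-second demo above: PT1M30S
print(iso8601_duration(900))  # a 15-minute tutorial: PT15M
```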

The Three Schemas That Probably Don't Help You

To save you time, a brief negative list. These types show up in a lot of guides but rarely move AI citation rates in 2026:

Event schema unless you actually run events. If you do, it's essential. If you're using it for "webinars" or "product launches" that aren't real scheduled events, you're adding noise.
JobPosting schema unless you're a job board or actively hiring. AI hiring assistants pull from specialized job indexes, not general schema.
Course schema unless you sell courses. SaaS "course" pages are usually better marked up as Articles or HowTos.

Implementing these when they don't match your actual content is the exact pattern that AI engines flag as schema stuffing. Resist.

Testing: Three Steps Before Shipping

Never deploy schema without testing. The three checks I always run:

Google's Rich Results Test: search.google.com/test/rich-results. Paste your URL and it renders a preview. If Google's parser errors out, assume the AI engines will choke on the markup too.
Schema.org Validator: validator.schema.org. Validates against the full schema.org vocabulary rather than only Google's rich-result types, so it catches typos Google's tool ignores.
The "read it as prose" test: Copy your JSON-LD into a plain-text editor, strip the braces, and read it. Does it accurately describe your page? If it sounds like it's describing a different page, or if it includes claims not visible in your content, rewrite it. This is the single best test for the 2026 "schema must agree with visible page" rule.
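
If stripping braces by hand gets tedious, the "read it as prose" test can be approximated with a few lines of code. This is just one way to do it: jsonld_to_prose is a hypothetical helper that flattens the structure into plain "path: value" lines you can read next to the page.

```python
def jsonld_to_prose(node, prefix=""):
    """Flatten a JSON-LD structure into plain 'path: value' lines."""
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                continue  # boilerplate, says nothing about the page
            lines.extend(jsonld_to_prose(value, f"{prefix}{key} "))
    elif isinstance(node, list):
        for item in node:
            lines.extend(jsonld_to_prose(item, prefix))
    else:
        lines.append(f"{prefix.strip()}: {node}")
    return lines


schema = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Roast.page Pro",
    "offers": {"@type": "Offer", "price": "19.00"},
}
print("\n".join(jsonld_to_prose(schema)))
```

Read the output line by line against the rendered page. Any line you cannot point to in the visible content is a candidate for removal.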

What schema does your page have right now?

Run your page through roast.page. The Technical & SEO dimension checks your existing schema markup, flags missing types for your page category, and scores how well the structured data agrees with your visible content. Takes about 30 seconds, free.

The Implementation Order That Works

For a new site, here's the sequence I recommend based on what moves citation rates fastest:

Week 1: Organization + WebSite schema sitewide. This is table stakes and costs nothing.
Week 1: BreadcrumbList on every non-homepage page. Usually a one-line template change.
Week 2: The one schema type most specific to your revenue page — Product or SoftwareApplication for SaaS, LocalBusiness for local, Article for content sites.
Week 2: FAQPage on your most-trafficked pages if they have an FAQ section. Add a FAQ section to your homepage and pricing page if they don't — the citation ROI is high enough to justify the content work.
Month 2: Article/BlogPosting on all content pages. Usually a template change.
Month 2: aggregateRating on product pages, assuming you have real reviews to back it.
Month 3: Person schema for founder and key author pages. Cheap and compounds.
As needed: VideoObject, HowTo, LocalBusiness based on what your pages actually contain.

Notice what's not on this list: a schema audit. You don't need a consultant or a tool to tell you what to do. Ninety percent of the citation lift comes from getting the first four types right on your highest-trafficked pages. Everything else is refinement.

One Last Thing Nobody Talks About

Schema is not a substitute for content. It's a multiplier on good content. A page with perfect schema and a vague value proposition will still get ignored by AI engines, because the underlying text doesn't give the model anything useful to extract. A page with excellent content and zero schema will still get cited sometimes, because the model can extract from prose just fine.

The 2.5x citation lift from schema is real, but it's 2.5x on top of whatever base rate your content already earns. If your content isn't good enough to be cited without schema, schema alone won't save it. Fix the content first. Then add the markup. Then watch the citations compound.

Every page you ship should answer three questions in its visible text before you even think about JSON-LD: what it is, who it's for, and why it matters. Once those are clear in the prose, schema is the amplifier. Not before.
