Schema Markup and AI Citations: What the Data Shows

There is a stat doing the rounds right now. Pages cited by AI were almost three times more likely to have JSON-LD than non-cited pages. It sounds definitive. It gets screenshotted, shared, and turned into a LinkedIn carousel within 48 hours. The implication is obvious: add schema, get cited. Except that is not what the data actually shows.

New research tracked 1,885 pages that added schema markup and measured what happened to their AI citation rates. The answer, bluntly, is not much. The correlation between having JSON-LD and being cited by AI systems is real. The causal relationship between adding it and getting cited is far weaker. That distinction matters enormously if you are making decisions about where to invest your GEO effort.

Why the Correlation Looks So Convincing

When you compare the population of AI-cited pages with non-cited pages, the structured data gap is substantial. Cited pages being nearly three times more likely to have JSON-LD is a striking difference. The instinct is to treat this as a signal you can act on directly: if cited pages have schema, put schema on your pages.

But this reasoning confuses the profile of a successful page with the cause of its success. Pages that are authoritative, well-structured, frequently linked to, and published by credible sources are also the pages most likely to have been built by teams who implement schema as a matter of course. The schema is part of the same technical discipline that produces strong content - it is not the independent variable driving citation.
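To make the confounding argument concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and none of it comes from the research. It assumes a toy world in which domain strength alone decides citation rates and schema has no effect at all, then shows that cited pages still come out far more likely to carry schema.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only (not from the study).
# Key: (strong_domain, has_schema) -> (cited_pages, total_pages)
PAGES = {
    (True, True): (180, 450),     # strong domains mostly ship schema
    (True, False): (20, 50),
    (False, True): (6, 300),      # weak domains mostly do not
    (False, False): (54, 2700),
}


def citation_rate(strong_domain: bool, has_schema: bool) -> Fraction:
    """Citation rate inside one (domain strength, schema) group."""
    cited, total = PAGES[(strong_domain, has_schema)]
    return Fraction(cited, total)


def schema_share(cited: bool) -> Fraction:
    """Share of cited (or non-cited) pages that carry schema."""
    with_schema = 0
    everyone = 0
    for (_, has_schema), (n_cited, n_total) in PAGES.items():
        n = n_cited if cited else n_total - n_cited
        everyone += n
        if has_schema:
            with_schema += n
    return Fraction(with_schema, everyone)


# Cross-sectional gap: how much more likely cited pages are to have schema.
gap = schema_share(cited=True) / schema_share(cited=False)

print(f"Schema share, cited pages:     {float(schema_share(True)):.2f}")
print(f"Schema share, non-cited pages: {float(schema_share(False)):.2f}")
print(f"Gap: {float(gap):.1f}x")

# Within each domain tier, schema makes no difference to the citation rate.
for strong in (True, False):
    tier = "strong" if strong else "weak"
    with_s = float(citation_rate(strong, True))
    without_s = float(citation_rate(strong, False))
    print(f"{tier} domains: {with_s:.2f} with schema, {without_s:.2f} without")
```

In this toy data, the citation rate inside each tier is identical with or without schema, yet the headline comparison still shows a large gap. The gap comes entirely from strong domains being both more likely to use schema and more likely to be cited.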

This is not a new problem in digital marketing. We have been untangling correlation from causation in SEO for over a decade. Pages that rank tend to be longer, have more images, and load faster. That does not mean adding words, images, and speed to a weak page will push it to position one. The same logic applies here, and it is worth being honest about that with clients before they start a schema implementation project they believe will transform their AI visibility.

What Actually Happens When You Add Schema to Existing Pages

The 1,885-page tracking study is valuable precisely because it is longitudinal. It does not just compare cited to non-cited pages - it watches what happens after schema is added. And the finding is that AI citation rates barely moved. That should recalibrate how practitioners talk about structured data in the context of GEO and AEO work.
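For teams that want to run the same kind of before-and-after check on their own pages, the comparison can be sketched roughly as follows. This is not the study's method, which is not described here. The `PageObservation` record, the `before_after` helper, and the sample URLs are all hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class PageObservation:
    """One page, observed before and after schema was added (hypothetical record)."""
    url: str
    cited_before: bool
    cited_after: bool


def citation_rate(flags: Iterable[bool]) -> float:
    """Fraction of pages flagged as cited; 0.0 for an empty set."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def before_after(pages: List[PageObservation]) -> Tuple[float, float, float]:
    """Return (rate before, rate after, change) for the same set of pages."""
    before = citation_rate(p.cited_before for p in pages)
    after = citation_rate(p.cited_after for p in pages)
    return before, after, after - before


# Made-up observations for illustration only.
SAMPLE = [
    PageObservation("https://example.com/a", cited_before=True, cited_after=True),
    PageObservation("https://example.com/b", cited_before=True, cited_after=False),
    PageObservation("https://example.com/c", cited_before=False, cited_after=True),
    PageObservation("https://example.com/d", cited_before=False, cited_after=False),
    PageObservation("https://example.com/e", cited_before=False, cited_after=False),
]

before, after, change = before_after(SAMPLE)
print(f"before: {before:.0%}  after: {after:.0%}  change: {change:+.0%}")
```

Tracking the same pages over time is what separates this from the cross-sectional comparison: individual pages can gain or lose citations, but the question is whether the overall rate shifts after the markup goes live.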

Adding schema to a page that AI systems would not otherwise cite does not appear to change their assessment of that page. This makes a certain amount of sense when you consider how systems like Google AI Overviews, ChatGPT, Perplexity, and Gemini actually select sources. They are evaluating trustworthiness, depth of information, clarity of language, topical authority, and the broader reputation of the publishing domain. Structured data is not a shortcut past those fundamentals.

Schema does still matter for other reasons - rich results in traditional search, better entity understanding, clearer communication of page type and content. Those are legitimate technical SEO benefits. The mistake is treating them as equivalent to AI visibility gains, or suggesting that a schema audit is a GEO strategy in itself.
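For readers less familiar with what the markup itself looks like, here is a minimal sketch that builds a JSON-LD block for an article page in Python. The `@context`, `@type`, `headline`, `author`, and `datePublished` keys are standard schema.org vocabulary. The function names and the sample values are assumptions made for illustration, and a real implementation would carry more properties.

```python
import json


def article_jsonld(headline: str, author_name: str, date_published: str) -> dict:
    """Build a minimal schema.org Article description (illustrative subset)."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
    }


def to_script_tag(data: dict) -> str:
    """Wrap the JSON-LD in the script tag used to embed it in a page."""
    body = json.dumps(data, indent=2)
    # Escape "<" so a value can never close the script element early.
    body = body.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical sample values.
markup = article_jsonld("Example headline", "Jane Doe", "2026-01-15")
print(to_script_tag(markup))
```

The output tells a crawler what kind of page this is and who published it. It says nothing about whether the content is trustworthy or complete, which is why it sits in the hygiene category rather than the strategy category.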

The AI Citation Signals That Actually Hold Up

If schema is not moving the needle on AI citations, what is? The honest answer is that the research base is still developing, but patterns are emerging. AI systems consistently favour pages that answer specific questions directly and completely, are written with a clear authorial voice and evident expertise, come from domains that are cited by other trusted sources, and contain information that can be extracted and presented cleanly without additional inference.

That last point is subtle but important. AI citation is partly a function of how easily your content can be used. Dense, jargon-heavy prose that buries the answer three paragraphs in is harder for a generative system to work with than a page that states the core answer upfront and then supports it with context. This is a content structure and editorial question, not a technical markup question.

Brand authority signals also appear consistently in cited content. Pages from established, well-linked domains with clear publishing standards get picked up more reliably. For UK brands building a GEO programme, this points toward a longer-term investment in domain authority, consistent publishing cadence, and third-party mentions - rather than a one-time technical intervention.

How This Changes the GEO Work We Recommend

None of this means schema should be removed from a GEO checklist. It should still be implemented correctly, kept up to date, and used appropriately for the content type. What changes is its priority relative to other activities, and the claims made about what it will achieve.

At Digiconomy, we treat structured data as baseline hygiene - necessary but not differentiating. The differentiating work is further up the stack: content depth, topical clustering, building genuine third-party citation through PR and partnerships, and ensuring that owned content actually answers the questions that AI systems are being asked. That is where time and budget produce measurable change in citation frequency.

Clients understandably want quick wins when they first engage with AI visibility work. Schema can feel like one - a defined task with a clear output. The job of a good strategist is to be direct about the limits of that task while building the case for the slower, more substantial work that actually moves performance. The research on 1,885 pages gives us the evidence to have that conversation clearly.

A Note on How GEO Advice Gets Distorted

The schema correlation statistic will continue to circulate. It is too clean and too shareable to disappear. The problem is that every time it gets shared without the caveat about the longitudinal data, it sets up another round of misallocated effort - teams spending weeks on schema audits expecting AI citation improvements that do not materialise.

This pattern is familiar from early SEO. Keyword density. Meta keywords. Exact-match anchor text. Each had a correlation with high-ranking pages at some point. Each became a cottage industry of optimisation that eventually proved far less effective than its proponents claimed. GEO is still early enough that these distortions can be corrected before they calcify into received wisdom.

The right response to research like this is not to dismiss structured data, but to be precise about what it does and does not do. For brands making decisions about AI search visibility in 2026, that precision is the difference between a programme that builds real advantage and one that produces a lot of tidy technical output with little commercial impact.
