TL;DR

19 measurements using the Quolity Chrome extension, 7 prompts. On branded queries, ChatGPT cites my site every time. On generic queries, never. What ChatGPT says about my site when it does cite me is both useful and sobering.

The Results of My First AI Citation Measurement

Expert Take — Hans Schepers
I expected a gradual picture: some visibility on branded queries, some on generic ones, and growth over time. What I found is a sharp contrast. Every query containing my brand name: cited. Every query without it: nothing. That is not a gradual difference — it is zero versus a hundred percent.

How I measure

Quolity is a free Chrome extension that exports ChatGPT’s query fan-out — the internal search queries the model generates before producing an answer — along with all retrieved sources and the final citations. I ran three independent runs per prompt, at different times, to separate coincidence from pattern.

My seven prompts fall into two groups. Branded prompts include my name: “Is Hands on GEO a reliable source on GEO?” and “What does Hands on GEO write about AI visibility?” Generic prompts do not: questions about the best GEO resources, GEO experts in the Netherlands, and a comparison of GEO sources for marketers.
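
To keep the runs comparable, I treat each run as one record: the prompt, the fan-out queries, the retrieved sources and the final citations. Below is a minimal sketch of how such a record could be represented and sorted into the branded or generic group; the field names and the example domain are placeholders of my own, not Quolity's actual export schema.

```python
# Minimal sketch of one measurement record, assuming the fields are copied
# out of the Quolity export by hand. Field names and the example domain are
# placeholders, not the extension's actual schema.
from dataclasses import dataclass

BRAND = "hands on geo"

@dataclass
class Run:
    prompt: str            # the prompt typed into ChatGPT
    fan_out: list[str]     # internal search queries ChatGPT generated
    sources: list[str]     # all URLs retrieved before answering
    citations: list[str]   # URLs cited in the final answer
    run_index: int = 1     # 1..3, because every prompt is repeated three times

def is_branded(run: Run) -> bool:
    """A prompt counts as branded when it contains the brand name."""
    return BRAND in run.prompt.lower()

example = Run(
    prompt="Is Hands on GEO a reliable source on GEO?",
    fan_out=["hands on geo review", "hands on geo ai visibility"],
    sources=["https://handsongeo.example/about"],
    citations=["https://handsongeo.example/about"],
)
print(is_branded(example))   # True -> this run goes in the branded group
```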

What the data shows

Branded prompts: cited 6 out of 6 times. Both branded prompts, 3 runs each, produce the same result: my site appears in the retrieved sources and in the final citations every time. ChatGPT needs few sub-queries for these questions, 4 to 8 on average, because it already has a target: it is searching for my site.

Generic prompts: cited 0 out of 12 times. Across all generic prompts, 12 runs in total, my site does not appear as a URL citation once. Source pools are much larger here (per-prompt averages range from 45 to 88 sources per run) and fan-outs are nearly twice as large. My brand name does appear occasionally as a text mention without a clickable source. That is a different signal: the model knows my brand exists, but does not yet trust my pages as a primary source.
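
The headline numbers are simple arithmetic over those records. A minimal aggregation sketch, reusing the Run and is_branded() helpers from the earlier snippet, with a placeholder domain:

```python
# Sketch of the aggregation behind the numbers above: citation rate, average
# fan-out size and average source-pool size per prompt group.
# Continues the Run / is_branded() sketch; MY_DOMAIN is a placeholder.
from statistics import mean

MY_DOMAIN = "handsongeo.example"

def summarise(runs: list[Run]) -> None:
    groups = {
        "branded": [r for r in runs if is_branded(r)],
        "generic": [r for r in runs if not is_branded(r)],
    }
    for label, group in groups.items():
        cited = sum(any(MY_DOMAIN in url for url in r.citations) for r in group)
        print(f"{label}: cited in {cited}/{len(group)} runs, "
              f"avg fan-out {mean(len(r.fan_out) for r in group):.1f}, "
              f"avg sources {mean(len(r.sources) for r in group):.1f}")

# On my measurement rows this prints the split reported above:
# 6/6 for branded prompts, 0/12 for generic prompts.
```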

What ChatGPT says when it does cite me. This was the most useful part of the data. In the export of “Is Hands on GEO a reliable source?”, the citation context reads: “a serious and credible practitioner source” and “useful, often sharp” — but also “not necessarily a neutral or academic authority” and “not suitable as the sole source for hard claims.” That is an honest picture. The missing layer is external validation: what others say about me, not what I publish myself.

Who appears for “experts in the Netherlands”? Sebastiaan Laan appears in five of the six expert runs. Chantal Smink, Julian Smits, Ryan IP, Stijn Bergmans and Jafar Kaboli also appear, with varying frequency. My name is not there. That does not surprise me: I have not published any external articles, given interviews, or spoken at conferences. All those other names are visible through exactly those channels.

What the “compare” prompt reveals. One run, 8 citations, including the original GEO research paper (arxiv.org), Ahrefs, Search Engine Land, Semrush, Google Search Central and McKinsey. That is the company I need to be in if I ever want to appear for this type of query. These are established names, not new blogs.

What I think — cautiously

19 measurements is not enough for definitive conclusions, but it is enough to see a pattern.

The contrast between branded and generic is sharper than I expected. Instead of a gradual difference, I see a hard line. The question is therefore not “how do I become more visible”, but “when does an AI model consider a source authoritative on a topic, regardless of whether the name is already known.” That is a different question.

The bottleneck is not content. ChatGPT describes my site as useful but insufficient for authoritative claims. Writing more articles will not fix that. The missing layer is what others say about me — in trade media, via LinkedIn, in event announcements.

The GEO expert landscape in the Netherlands is fragmented. No single name appears in all runs. That means the field is open, but also that there is no established order to push against.

I draw no conclusions about causality or other platforms. All my measurements are via ChatGPT. Perplexity and Gemini work differently. And the variation between runs is sometimes significant — the same prompt produces eight sub-queries in run 1 and thirteen in run 3. That is a property of the system, not noise.

FAQ

Are three runs per prompt enough?
The minimum. Three runs show whether a result is stable or incidental. For definitive conclusions you need more.

Why Quolity and not a paid tool?
Profound offers fan-out data at enterprise level only. Quolity provides the same raw data via a free extension. For a first measurement, that is sufficient.

What does it mean if my name appears in the text but not as a citation?
The model recognises your brand but does not yet trust your pages as a primary source. A text mention and a clickable citation are different signals, and only the citation generates traffic.
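
To make that distinction concrete, here is a small per-run check, assuming the final answer text is available next to the citation URLs; the brand string and domain are placeholders.

```python
# Sketch of the mention-versus-citation check. Brand string, domain and the
# example answer text are placeholders; only the citation case is a clickable
# link that can send traffic.
def classify_visibility(answer_text: str, citations: list[str],
                        brand: str = "hands on geo",
                        domain: str = "handsongeo.example") -> str:
    if any(domain in url for url in citations):
        return "citation"      # clickable source in the answer
    if brand in answer_text.lower():
        return "text mention"  # the model knows the brand, but links elsewhere
    return "absent"

print(classify_visibility("Hands on GEO writes about query fan-out ...", []))
# -> "text mention"
```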

What are the next steps?
Measure more generic prompts, run the same prompts through Perplexity, and automate the export step so manual clicking does not become the bottleneck.
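
The export step is the obvious candidate for automation. A rough sketch of the merge, assuming each run is saved as a CSV under an exports/ folder; the filenames and column layout are assumptions, not Quolity's actual output format.

```python
# Sketch: merge per-run export files into one master spreadsheet.
# Requires pandas and openpyxl; assumes at least one exports/run-*.csv exists.
import glob
import pandas as pd

frames = []
for path in sorted(glob.glob("exports/run-*.csv")):
    df = pd.read_csv(path)
    df["source_file"] = path              # remember which run each row came from
    frames.append(df)

master = pd.concat(frames, ignore_index=True)
master.to_excel("quolity-master.xlsx", index=False)
print(f"merged {len(frames)} run exports, {len(master)} rows total")
```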

Sources

  • Own master spreadsheet quolity-master-20260417.xlsx — 19 measurements, 7 prompts, April 2026
  • Quolity Chrome extension, April 2026 version
  • Citation context exports per prompt (included in master spreadsheet)
