Getting clean AI traffic out of Google Analytics

GA4 already records the visits AI sends you. Its own labeling is the problem.

If you run Google Analytics, the visits that AI tools send you are already in there. You do not need anything extra to capture them, because a person who clicks through from an AI answer is a normal browser visit that GA4 records like any other. The catch is that GA4's own labeling makes those visits hard to find and easy to count wrong. With a bit of setup you can pull an honest number out of it.

Why GA4's built-in view falls short

We pulled a year of data from our own sites and from client properties to see how GA4 files AI traffic. The same three problems showed up on every property.

The first is that GA4 scatters a single tool across several different mediums. On one site, visits from chatgpt.com were filed under referral, organic, (not set), local, and a newer ai-assistant medium, all in the same property and the same span of months. If you look at any one of those buckets, you see a fraction of the real number.
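To make the scattering concrete, here is a minimal sketch that sums sessions per source across every medium. The rows and their counts are invented for illustration and are not our real data; only the medium names come from the case above.

```python
from collections import defaultdict

# Illustrative rows only: (sessionSource, sessionMedium, sessions).
# The counts are made up; the mediums mirror the ones named above.
rows = [
    ("chatgpt.com", "referral", 40),
    ("chatgpt.com", "organic", 6),
    ("chatgpt.com", "(not set)", 11),
    ("chatgpt.com", "local", 2),
    ("chatgpt.com", "ai-assistant", 15),
    ("google", "organic", 900),
]


def sessions_by_source(rows):
    """Sum sessions per source, ignoring which medium GA4 filed them under."""
    totals = defaultdict(int)
    for source, _medium, sessions in rows:
        totals[source] += sessions
    return dict(totals)


totals = sessions_by_source(rows)

# What you would see if you looked at the referral bucket alone.
referral_only = sum(
    sessions
    for source, medium, sessions in rows
    if source == "chatgpt.com" and medium == "referral"
)

print("all mediums:", totals["chatgpt.com"], "referral only:", referral_only)
```

Grouping by source first and ignoring the medium is the whole trick: the per-medium buckets each hold a slice, and only the sum across them is the tool's real total.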

The second is that a simple filter catches the wrong things. The obvious move is to grab every source that ends in .ai, but that sweeps in a long tail of unrelated software companies whose domains happen to end in .ai. Those are referrals, but they are not AI search, and counting them inflates the number.
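A short sketch of why the suffix filter misleads. The two software domains below are hypothetical placeholders standing in for that long tail, not referrers we observed.

```python
# Sources we actually want to count as AI search (a partial list for the demo).
AI_SEARCH_SOURCES = {"chatgpt.com", "claude.ai", "perplexity.ai"}

# Sources seen in a report. "acme-crm.ai" and "invoicebot.ai" are made-up
# placeholder names for unrelated software companies on a .ai domain.
seen_sources = [
    "chatgpt.com",
    "claude.ai",
    "perplexity.ai",
    "acme-crm.ai",
    "invoicebot.ai",
]

# The naive filter: anything whose source ends in .ai.
naive_matches = [s for s in seen_sources if s.endswith(".ai")]

# Unrelated referrers the naive filter wrongly counts.
false_positives = [s for s in naive_matches if s not in AI_SEARCH_SOURCES]

# Real AI sources the naive filter never sees, because they are not on .ai.
missed = [s for s in seen_sources if s in AI_SEARCH_SOURCES and s not in naive_matches]

print("wrongly counted:", false_positives)
print("missed:", missed)
```

The same filter fails in both directions: it counts the unrelated .ai companies and it skips chatgpt.com, which does not end in .ai at all.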

The third is that GA4's own AI channel is still partial. Google has started filing some of this traffic under a built-in "AI Assistant" channel, which is a good sign and worth watching, but it does not yet catch everything and it lags behind new tools as they appear.

Put together, this means you cannot read AI traffic straight off GA4's channel grouping and trust it. You have to do a little classification yourself.

The approach that works

Pull the raw data and classify it against your own list of AI tools. It is the same method the Visibility Kit plugin uses in the browser, applied to GA4's data instead.

Pull three dimensions (sessionSource, sessionMedium, and date) along with the metrics you care about, usually sessions, totalUsers, and keyEvents for conversions. Then match each row's source against a list of known AI domains: chatgpt.com, chat.openai.com, and openai.com for ChatGPT; claude.ai for Claude; perplexity.ai for Perplexity; gemini.google.com for Gemini; and on through Copilot, Brave, You.com, and the rest. The AI referrers page has the full list. Match on the source, which is the domain the visit came from, rather than on the medium, because the source is the part GA4 gets right.
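As a rough sketch of that step: the dictionary below is how we would expect the report request to look as a GA4 Data API runReport body, and the function maps a source to a tool. The names REPORT_REQUEST, AI_DOMAINS, and tool_for_source are ours, the one-year date range is an assumption, and the domain list covers only the tools spelled out above, not the full list. Matching subdomains as well as exact hosts is also our assumption.

```python
# Assumed shape of a GA4 Data API runReport request body, using the
# dimensions and metrics named above. The date range is an assumption.
REPORT_REQUEST = {
    "dimensions": [
        {"name": "sessionSource"},
        {"name": "sessionMedium"},
        {"name": "date"},
    ],
    "metrics": [
        {"name": "sessions"},
        {"name": "totalUsers"},
        {"name": "keyEvents"},
    ],
    "dateRanges": [{"startDate": "365daysAgo", "endDate": "today"}],
}

# Partial list: only the domains named in the text. Extend it from your
# own full list of AI referrers.
AI_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "openai.com": "ChatGPT",
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}


def tool_for_source(source):
    """Return the AI tool for a sessionSource value, or None if unlisted.

    Matches the exact domain or any subdomain of it (an assumption, so
    that "www.perplexity.ai" counts but "notopenai.com" does not).
    """
    host = source.strip().lower()
    for domain, tool in AI_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return tool
    return None


for example in ["chatgpt.com", "www.perplexity.ai", "google", "notopenai.com"]:
    print(example, "->", tool_for_source(example))
```

The medium is deliberately never consulted here, so a chatgpt.com row counts the same whether GA4 filed it under referral, organic, or (not set).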

Two rules keep the result honest over time. Keep GA4's ai-assistant medium as a catch-all, so that anything Google has already tagged as AI still counts even when the source is a tool you have not listed yet. And exclude everything else that ends in .ai, so the random software referrers stay out. With those in place, the count lines up with what the tools actually sent you.
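The two rules can be sketched as an ordered check. This is one possible reading rather than a definitive implementation: we assume a listed source wins first, the ai-assistant medium comes second as the catch-all, and everything else is dropped. The label "Other AI (GA4-tagged)", the function name, and the placeholder domains in the usage lines are our own.

```python
# Partial list of known AI sources, limited to the domains named earlier.
KNOWN_AI_SOURCES = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "openai.com": "ChatGPT",
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}

CATCH_ALL_LABEL = "Other AI (GA4-tagged)"  # hypothetical label for rule 1


def classify(source, medium):
    """Return a tool label for an AI visit, or None for everything else."""
    src = source.strip().lower()
    med = medium.strip().lower()

    # A listed source counts regardless of the medium GA4 chose.
    if src in KNOWN_AI_SOURCES:
        return KNOWN_AI_SOURCES[src]

    # Rule 1: keep anything GA4 itself tagged as AI, even if unlisted.
    if med == "ai-assistant":
        return CATCH_ALL_LABEL

    # Rule 2: unlisted .ai domains fall through to None. Matching an
    # explicit list, not the .ai suffix, is what keeps them out.
    return None


# "acme-crm.ai" and "newtool.example" are made-up placeholder sources.
print(classify("claude.ai", "referral"))
print(classify("newtool.example", "ai-assistant"))
print(classify("acme-crm.ai", "referral"))
```

Because the list is an allowlist, the second rule needs no extra code: a .ai domain that is neither listed nor tagged by GA4 simply never matches.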

What you end up with

You end up with a monthly count of AI referral sessions, broken out by tool, with conversions attached. On our own marketing site that came to a couple of hundred sessions over a year, led by ChatGPT, with Claude, Perplexity, and Gemini behind it. On a busier site it was over a thousand. For most sites these numbers are still small next to total traffic, which is normal at this stage. The value is in watching the trend and in seeing which tool sends people who actually convert. That way a small but real number is not mistaken for either nothing or everything.
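A minimal sketch of that roll-up, assuming the rows are already classified by tool and that the date dimension arrives as a YYYYMMDD string. The numbers are invented for illustration and are not the figures quoted above.

```python
from collections import defaultdict

# Illustrative, made-up rows: (date as YYYYMMDD, tool, sessions, keyEvents).
rows = [
    ("20250103", "ChatGPT", 12, 1),
    ("20250117", "ChatGPT", 9, 0),
    ("20250120", "Claude", 4, 1),
    ("20250205", "ChatGPT", 15, 2),
    ("20250211", "Perplexity", 3, 0),
]


def monthly_by_tool(rows):
    """Sum sessions and key events per (month, tool) pair."""
    table = defaultdict(lambda: {"sessions": 0, "keyEvents": 0})
    for date, tool, sessions, key_events in rows:
        month = f"{date[:4]}-{date[4:6]}"  # "20250103" -> "2025-01"
        cell = table[(month, tool)]
        cell["sessions"] += sessions
        cell["keyEvents"] += key_events
    return dict(table)


report = monthly_by_tool(rows)

for (month, tool), cell in sorted(report.items()):
    # Guard against division by zero for months with no sessions.
    rate = cell["keyEvents"] / cell["sessions"] if cell["sessions"] else 0.0
    print(month, tool, cell["sessions"], cell["keyEvents"], f"{rate:.1%}")
```

Keeping key events next to sessions in the same cell is what lets you compare tools on who converts, not only on who sends the most visits.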

What GA4 cannot recover

Some AI visits will never be attributable, no matter how the setup is built. When a tool sends a visitor without a referrer and without a tag on the link, GA4 has nothing to attribute the visit to, so it lands in Direct alongside bookmarks and typed-in addresses. No tool can recover a source the browser never sent. The AI referrers page covers how the referrer signal works and where it breaks down, which is the same limit whether you read the data from GA4 or capture it yourself.