
Sampling in Google Analytics 4 Properties

One of the first questions many users have when getting started with Google Analytics 4 properties is whether sampling still happens the way it does in the current version of Google Analytics. In addition to covering sampling in Google Analytics 4, I will also touch on cardinality and thresholding.

Sampling in Google Analytics 4 vs Universal Analytics

Sampling in Universal Analytics

In Universal Analytics, the version of GA everyone knows, almost everyone experiences sampling. Explaining when and how sampling happens is not simple, because it can occur in a variety of circumstances.

Almost all of the standard reports in your left-hand navigation menu are available unsampled. However, any time you modify a report, for example by applying a secondary dimension or an advanced segment (audience), it will be sampled if your Universal Analytics property has more than 500,000 sessions in the selected date range. Sampling and cardinality limits are more complicated than this, because a variety of other scenarios can trigger them. To illustrate, the flow reports have different sampling rules, longer date ranges can cause more sampling, Multi-Channel Funnels works differently, and even Google Analytics 360 has its own unique circumstances.

To summarize, sampling in Universal Analytics can occur in many places and, as you can see, is not easy to summarize.

Sampling in Google Analytics 4

Google Analytics 4 has two main components for doing analysis and reporting. The image below shows the breakout between the standard reports and the new Analysis reports. The standard reports in Google Analytics 4 are always unsampled. You can apply comparisons, add secondary dimensions, and filter your reports, and everything will continue to be unsampled.

Analysis works differently, however, and can return sampled data. Analysis, which previously was only available in Google Analytics 360 (Enterprise), is now available to everyone in Google Analytics 4. This is the new free-form, advanced analysis tool where most users will spend their time doing analysis and reporting. This is where you can create custom reports, funnels, and much more.

In general, anything you do in Analysis that replicates a standard report will be unsampled. I was able to observe sampling in a few different situations. The main one is any time the query covers more than 10 million events and the report you created is not a pre-existing standard report. For example, the cohort analysis loaded unsampled for a property I was using with 100 million events, but as soon as I added the date dimension, it returned data based on a 10% sample (10 million of the 100 million events). I also observed sampling being triggered in certain situations when my date range went beyond the last 60 days.
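The arithmetic behind that 10% figure can be sketched in a few lines of Python. This is only an illustration of my observation, not how Google computes sampling: the function name and the `sample_limit` default of 10 million events are assumptions taken from the testing described above.

```python
def estimated_sampling_rate(total_events: int, sample_limit: int = 10_000_000) -> float:
    """Estimate the share of events a sampled Analysis report is based on.

    sample_limit is an assumption taken from the behavior observed in this
    post (roughly 10 million events), not a documented constant.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    # Below the limit the report uses every event, so cap the rate at 100%.
    return min(1.0, sample_limit / total_events)


# The property from the example above: 100 million events in the date range.
print(f"{estimated_sampling_rate(100_000_000):.0%}")  # 10%
```

A property with fewer events than the limit comes back as 1.0, meaning unsampled.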

You can see when sampling occurs by looking at the shield in the top-right corner of Analysis. It turns red any time sampling happens, as shown in the image below.

Hit Limits in Google Analytics 4

Google Analytics for Firebase is marketed as a free and unlimited analytics solution. I haven’t been able to find anything for Google Analytics 4 that uses this wording, but from both testing the platform and asking other large-volume users, it appears that Google Analytics 4 is also free and unlimited and has no hit/event limits. Universal Analytics has a limit in its terms of service that you shouldn’t exceed 10 million hits per month per account, so this is a huge upgrade! This makes sense, given that Google Analytics 4 is really just taking Google Analytics for Firebase (AKA Firebase for Apps) and adding a Firebase for Web component.

Cardinality (Other) in Google Analytics 4

Cardinality limits apply in Google Analytics 4, just as they do in Universal Analytics. I could not find any public documentation on the specific limits, but from testing on high-volume sites, it appears that the limit is around 25,000 unique values per day for a dimension. If you exceed this, like in the image below where we have over a million unique URLs per day, you will see (other) in those reports. I’ve experienced this happening in both the standard reports and Analysis. Hopefully, more documentation will be available on this soon!

Remember, Google Analytics 4 includes a free BigQuery integration that has no known limits, so this is the primary way to access all of your raw data if you ever run into sampling or cardinality limits.
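To make the (other) behavior concrete, here is a minimal Python sketch of how a cardinality limit collapses a dimension. It is an illustration only: the function name is hypothetical, the default limit is the rough figure from my testing, and keeping the most frequent values is my assumption, since which rows Google keeps is not documented.

```python
from collections import Counter


def apply_cardinality_limit(values, limit=25_000):
    """Count dimension values, rolling everything past `limit` into "(other)".

    Assumption: the most frequent values are the ones kept. The limit
    default is an observed estimate, not a documented number.
    """
    counts = Counter(values)
    if len(counts) <= limit:
        return dict(counts)
    kept = dict(counts.most_common(limit))
    # Whatever was not kept is reported under a single "(other)" row.
    kept["(other)"] = sum(counts.values()) - sum(kept.values())
    return kept


# Tiny example with a limit of 2 unique page paths instead of 25,000.
page_views = ["/a", "/a", "/a", "/b", "/b", "/c", "/d"]
print(apply_cardinality_limit(page_views, limit=2))
# {'/a': 3, '/b': 2, '(other)': 2}
```

The totals still add up, but the detail for the rolled-up values is gone from the report.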

Thresholding in Google Analytics 4

Thresholding also occurs in Google Analytics 4 for certain dimensions to protect the privacy of individual users. You will primarily see this if you use any of the demographic and affinity dimensions in your reports. As stated in the help article, you cannot change these thresholds, as they are defined by Google.
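Conceptually, thresholding withholds rows that are based on too few users. The sketch below shows the idea in Python under loud assumptions: the function name, the row structure, and the `min_users` value of 10 are all made up for illustration, because Google does not publish the actual thresholds.

```python
def apply_threshold(rows, min_users=10):
    """Drop report rows backed by fewer than `min_users` users.

    min_users is a hypothetical value for illustration; the real
    thresholds are defined by Google and are not public.
    """
    return [row for row in rows if row["users"] >= min_users]


# Hypothetical demographic report rows.
report = [
    {"age": "18-24", "users": 420},
    {"age": "25-34", "users": 7},
    {"age": "35-44", "users": 96},
]
print(apply_threshold(report))
# [{'age': '18-24', 'users': 420}, {'age': '35-44', 'users': 96}]
```

In this example the 25-34 row is withheld, so the visible rows no longer sum to the report total.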

Let me know in the comments or on Twitter @charlesfarina / @adswerve if you find any new information or if you are experiencing anything differently!
