Why you can’t trust website analytics

Data is a huge part of any website. It’s important to know exactly who is reading your content and which topics are getting read the most. These numbers help you plan your content.

However, how much can you really trust your website stats, e.g. Google Analytics?

The answer: not that much.

Data from three separate UK membership organizations shows up to 78% of website visits are not being recorded in Google Analytics.

Initial data discrepancies

Year Out Group were concerned that their website visitor numbers in Google Analytics had been dropping year-on-year since 2013, despite strong search engine rankings and a new website launched in March 2018.

The new Year Out Group website is hosted with WP Engine. WP Engine provides detailed analytics on the site traffic based on their log files. These log file stats are processed to filter out non-human requests (e.g. search engine bots) from real visitors – just like Google Analytics tries to do.

I compared the traffic reported by WP Engine’s log files with that in Google Analytics.

Web traffic analysis is complex. Each available data set will include some form of filtering or processing. I fully expected some variation between the two data sets in absolute terms, but thought the general trends would be broadly consistent.

I was wrong.

And this is where it gets interesting.

The WP Engine logs show a strong upward trend in visit numbers, while Google Analytics shows a continued gradual decline:

Year Out Group stats. Google Analytics shows a decline while WP Engine logs show a growing audience.

Some variation in the actual numbers is one thing.

But when the two sets of data are reporting fundamentally different trends, there are huge implications.

Further examples

The next step was to see if other sites with the same WP Engine logs available showed similar characteristics.

And they did.

I ran the same analysis for Scottish Association of Landlords and Professional Speaking Association. These two sites had the added advantage of being able to go back a little further with the hosting logs to provide a larger data set.

All three datasets show large differences between visitor numbers. Google Analytics is only capturing 20-80% of the visit numbers from the log analyses.

More crucially, the proportion of log file visits reported in Google Analytics is decreasing for all three sites. The sites with the highest percentage of log file visits in Google Analytics show the sharpest declines:

 

Proportion of log file visits recorded as Google Analytics sessions.

Note the downwards trend across three separate websites.

(The data points with ratios over 100% are due to Cloudflare caching. See notes for details).

Over the same time period, all three sites show strong growth in visits from the log file data while their corresponding Google Analytics data report visits as either static or in decline.
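The ratio plotted above is simple to reproduce for your own site. The following is a minimal sketch only: the monthly figures are hypothetical placeholders rather than data from the sites in this article, and `capture_ratio` is a name chosen purely for illustration.

```python
def capture_ratio(ga_sessions: int, log_visits: int) -> float:
    """Share of log file visits that also appear as Google Analytics sessions."""
    if log_visits <= 0:
        raise ValueError("log_visits must be positive")
    return ga_sessions / log_visits


# Hypothetical monthly figures (month, GA sessions, log file visits).
# These are illustrative numbers, not the data behind this article.
monthly = [
    ("2018-04", 4200, 6000),
    ("2018-05", 4000, 6400),
    ("2018-06", 3800, 7000),
]

ratios = [(month, capture_ratio(ga, logs)) for month, ga, logs in monthly]
for month, ratio in ratios:
    print(f"{month}: {ratio:.0%} of log file visits recorded in Google Analytics")
```

A ratio that falls month after month, as in this made-up example, is the pattern to watch for: the log file visits are rising while the share of them reaching Google Analytics shrinks.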

Possible causes

Incorrect data

We have to trust that data available from Google and WP Engine are accurate. There will always be an element of filtering and processing of these data.

I am working on the assumption that all such processing is applied consistently.

Ad blockers, privacy settings, and network filters

Google Analytics relies on the user’s browser to send data to it:

How Google Analytics works under normal circumstances

Clearly, if this information is not sent, the visit will not be tracked in Google Analytics.

Many ad blockers – whose purpose is to hide adverts on the websites you visit – also block tracking and analytics services, including Google Analytics.

Firefox even includes a tracking protection option without the need for any extensions. When enabled, this explicitly stops data being sent to a huge list of services, including Google Analytics.

Visit data in server log files are not affected by the use of an ad blocker:

How ad blockers stop Google Analytics collecting data. Note how the log file data are not affected.

(Fun fact: the URL of this image initially included google-analytics, but was then blocked by my ad blocker!)

The use of ad blocking technology is difficult to quantify. Various reports from 2017 (the latest available) suggest ad blockers are used by between 11% and 58% of users.

What does this mean for your website?

Don’t blindly assume that just because some numbers are on a fancy dashboard they represent the truth.

We need to understand how the data is collected and any factors that may affect its accuracy.

Only then can we determine the appropriate level of trust to place in these data and use them effectively in our decision making.

Affected tools and datasets

It is not just Google Analytics data that are being under-reported. Other third-party services that rely on the user’s browser to send them data can be affected by ad blockers. These include:

  • Remarketing code – used to target ads to people who have already visited your site, e.g. Google Ads remarketing tags and Facebook Pixel
  • Marketing automation tools – used to manage marketing activity based on website interactions, e.g. Drip, HubSpot, Pardot, Eloqua, ConvertKit, Infusionsoft
  • Content personalization and split-testing tools – used to adjust website content based on user data and/or to test content variations, e.g. OptinMonster, RightMessage, Optimizely
  • Other web analytics tools – e.g. Mixpanel, Piwik, Segment, New Relic, CrazyEgg, Hotjar
  • All code managed through Google Tag Manager

None of these tools should ever be considered completely accurate. There are many reasons why they don’t work all the time, e.g. dropped connections, users having JavaScript disabled, someone using multiple browsers/devices, or navigating to a new page before the code has executed. This has always been the case.

The problem now, based on these data, is that the proportion of our visitors for whom these tools actually work can be tiny: less than 22% in the case of the Professional Speaking Association (PSA).

I now believe trend-based tools, including analytics, are largely meaningless.

Tools that work on single sample points, e.g. heatmaps and visitor recordings, are less affected. We just have fewer data points to use.

The value in web analytics comes from using relative changes (i.e. trends) in the data to drive further action.

This works fine if we are confident the tools are capturing data from a large and consistent proportion of our audience. We have now shown this is not the case.

But there is more to it than this.

The proportion of visitors being tracked is not consistent. All of the datasets evaluated here show strong declines in the percentage of visitors being captured by Google Analytics – especially Scottish Association of Landlords and Year Out Group.

For Year Out Group in particular, the rate of decline in tracking exceeds the visitor growth rate. This means Google Analytics is showing a downward trend in visitors when visitor numbers in the logs are growing.
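To see how this can happen, treat the sessions Google Analytics reports as real visits multiplied by the share of visits it manages to track. The sketch below works through that arithmetic with hypothetical percentages (they are not the Year Out Group figures), and `tracked_growth` is an illustrative name.

```python
def tracked_growth(visit_growth: float, capture_before: float, capture_after: float) -> float:
    """Growth rate that analytics would report, given the real growth in visits
    and the share of visits tracked before and after the period."""
    return (1 + visit_growth) * (capture_after / capture_before) - 1


# Hypothetical example: real visits grow by 20%, but the share of visits
# tracked by analytics falls from 60% to 45% over the same period.
reported = tracked_growth(visit_growth=0.20, capture_before=0.60, capture_after=0.45)
print(f"Real growth: +20.0%, growth reported by analytics: {reported:+.1%}")
```

With these made-up numbers the analytics dashboard would report a fall of roughly 10%, even though the audience grew by a fifth.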

What to do about ad blockers and analytics

It can be tempting to follow the path of various media outlets and put up a fight against ad blockers. But I believe that’s a losing battle.

I love my ad blocker. I’m not going to turn it off just so the sites I visit get better stats.

My motivations for using an ad blocker are better security, removing interruptions and improving speed. These mirror the findings from PageFair’s 2017 report into ad blocker usage. Only 6% cited privacy as a reason for blocking ads.

Threats from malware are increasing so the ad blocker is here to stay.

I recommend two routes of action:

1: Keep the analytics, but understand the limitations

Web analytics still have a place, despite these serious shortcomings. They still provide some useful data on the visits that are tracked, such as measuring the relative performance of different content, e.g. Google Ads. (If someone is clicking on your Google Ad, they probably aren’t using an ad blocker.)

We just need to remember that we’re only seeing a relatively small portion of the overall traffic.

The proportion of traffic not captured by your analytics is unique to your site. Running a log file comparison like this will provide an indication of the accuracy of your data.
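If your host does not provide processed log statistics, a rough comparison can be built from raw access logs. The sketch below is a simplified stand-in, not WP Engine’s method: it assumes logs in the common “combined” format, uses a crude, assumed list of bot keywords, and counts distinct IP address and user agent pairs per day as a proxy for visits. The sample log lines and the Google Analytics figure are invented for illustration.

```python
import re
from collections import defaultdict

# Combined log format: ip - user [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Crude, assumed keyword list; real bot filtering is far more involved.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def daily_visitors(lines):
    """Count distinct (ip, user agent) pairs per day, skipping obvious bots."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # ignore lines that are not in the expected format
        agent = match["agent"].lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue
        seen[match["day"]].add((match["ip"], agent))
    return {day: len(pairs) for day, pairs in seen.items()}


# Invented sample lines: one browser with two page views, one bot, one other browser.
sample = [
    '203.0.113.5 - - [10/Oct/2018:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/62.0"',
    '203.0.113.5 - - [10/Oct/2018:13:56:01 +0000] "GET /about HTTP/1.1" 200 1200 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/62.0"',
    '198.51.100.7 - - [10/Oct/2018:14:02:10 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.44 - - [10/Oct/2018:15:10:00 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Macintosh) Safari/605.1.15"',
]

log_visitors = daily_visitors(sample)
ga_sessions = {"10/Oct/2018": 1}  # hypothetical figure exported from Google Analytics

for day, visitors in log_visitors.items():
    share = ga_sessions.get(day, 0) / visitors
    print(f"{day}: {visitors} log visitors, {share:.0%} seen by Google Analytics")
```

The absolute numbers from a sketch like this will not match any vendor’s figures. What matters is applying the same rules consistently over time, so that changes in the ratio reflect changes in tracking rather than changes in method.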

2: Focus on the metrics that actually matter

Web analytics has long been a black hole. It is full of fascinating but often useless stats. It takes sustained discipline and effort to use the information effectively.

A far more effective strategy is improving the metrics that actually matter.

Your most important metrics will be specific to your business. Examples include:

  • Number of orders (per month)
  • Average order value
  • Cart abandonment rate – i.e. the proportion of carts containing products that are never checked out
  • Returning customer rate/customer loyalty – i.e. the proportion of customers who place another order with you in a certain period
  • Customer lifetime value
  • Free trial sign-up rate
  • Free trial to paid plan conversion rate and timescales
  • Mailing list sign-ups
  • Number of user registrations
  • Number of website logins
  • Event registrations
  • Event attendance
  • Inquiries received
  • Email opens and clicks

The good news is none of these metrics (with the possible exception of email opens) are impacted by the use of ad blockers.

Conclusion

This is a long article with a lot of information. To sum up the key points:

  • Standard tools used to measure website usage are missing lots of data due to the use of ad blockers.
  • Blocked tools include Google Analytics (which also reports conversions back to Google Ads), Google Tag Manager, Facebook Pixel and Google Ads remarketing code.
  • The proportion of visitors not being measured can be massive – over 78% in the case of the Professional Speaking Association for example.
  • There is a strong downward trend in the proportion of visitors that can be measured.

This presents challenges for all organizations. I recommend:

  • Understanding and monitoring the level of misreporting of your users. This will inform the level of trust you can have in any tools affected.
  • Focusing on the most important metrics, e.g. email clicks, number of orders and average order values. These are not affected by ad blockers.
  • Considering alternative technology, e.g. website log file analysis, to understand website use. But only if these data provide insights that drive useful actions.

This article was first published by Tall Projects. Please see the supporting notes for details of the methodology used and other important information.

Edward Kay is the owner and founder of Tall Projects Ltd, a membership technology consultancy in the UK. He has over 15 years’ experience as a software engineer and digital project manager. Tall Projects understands both the technology specifics and the business of running a successful membership body. Outside work, Ed is a keen runner having completed five marathons. He lives in Oxfordshire with his wife and two young sons.

The post Why you can’t trust website analytics appeared first on Torque.
