Beantin

James Royal-Lawson

traffic sources

Google Analytics: Updated visit definition is missing visits

Google updated their definition of a visit in the middle of August. I’ve written an explanation in a separate blog post. In general the change is good, as it should make the data in Google Analytics easier for the layman to interpret.

What isn’t so good is that Google Analytics isn’t behaving in the way Google describes. Not only is it missing visits in some situations, it is also missing some traffic sources – the attribution is totally incorrect for some visits.

Test details

My test was as follows:

Using my Android tablet, I visited my blog 4 times in a row. I used my tablet so that it would be easy to extract my test visits (with little chance of anyone else visiting the same pages from the same sources on that day).

Visit 1

Via a link on one of my old sites, www.ccl4.org 
The browser newly opened 
Not visited beantin.se in the past 30 minutes.

Visit 2

Via Google's search results searching for beantin fishbang
A few minutes after visit 1.

Visit 3

Via Google again, this time searching for beantin seo
The browser newly opened
Over 60 minutes since visit 2.

Visit 4

Via a link on another one of my old sites, 503.org.uk
Just a few minutes after visit 3.

According to Google’s new visit definition, this should have been 4 visits, with 4 different traffic sources.
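The expected count can be checked with a small sketch of the rule as this post understands it: a new visit starts when the traffic source differs from the previous hit, or when more than 30 minutes have passed. The function name, the hit format and the exact minute offsets are assumptions made for illustration (the test notes only say "a few minutes" and "over 60 minutes"), and the sketch ignores direct hits and the end-of-day rule.

```javascript
// Hypothetical helper, not Google's code: counts visits under the
// "source change or 30 minutes of inactivity" rule.
var SESSION_TIMEOUT_MIN = 30;

function countVisits(hits) {
  var visits = 0;
  var previous = null;
  hits.forEach(function (hit) {
    var timedOut = previous !== null &&
      hit.minute - previous.minute > SESSION_TIMEOUT_MIN;
    var sourceChanged = previous !== null && hit.source !== previous.source;
    if (previous === null || timedOut || sourceChanged) {
      visits += 1;
    }
    previous = hit;
  });
  return visits;
}

// The four test visits, with assumed minute offsets.
var testHits = [
  { minute: 0,  source: 'ccl4.org (referral)' },
  { minute: 5,  source: 'google (organic): beantin fishbang' },
  { minute: 70, source: 'google (organic): beantin seo' },
  { minute: 75, source: '503.org.uk (referral)' }
];

console.log(countVisits(testHits)); // 4
```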

What the data contained

Detail of a screenshot

According to my Google Analytics data, I had made 3 visits. Visit 4 is missing. Instead, the beantin seo search has had 2 page views attributed to it – which, as my test actions show, simply isn’t true.

Showing all 4 visits happened

As a way of confirming that visit 4 really did happen and data was received by Google Analytics, showing the referral from 503.org.uk, I made use of my per visit referrer script.

On beantin.se this script saves the referrer for each visit as a custom variable. The script is run on each page view, and the referrer is saved to the custom variable at the visit level.

This means visit 4 will have over-written the referrer for visit 3 – Google hasn’t triggered a new visit for visit 4, but there is a page view, so my script grabs the referrer…

Detail of a screenshot from Google Analytics showing that 503.org.uk was a referrer

As you can see from the screenshot, 503.org.uk is there – meaning a visit did come from that site, and there are two page views attributed to it (the page views from visits 3 and 4).

Bug or feature?

I’ve repeated this test on my laptop and examined the cookies after each visit, and Google Analytics is failing to update the traffic source (in __utmz) and subsequently failing to trigger a “new visit” according to their new definition.
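For anyone wanting to repeat the cookie inspection, here is a minimal sketch of a __utmz parser. The field layout (domain hash, timestamp, visit count, source count, then pipe-separated utmcsr/utmccn/utmcmd/utmctr pairs) is how the ga.js cookie is commonly described rather than something stated in this post, and the function name and sample values are made up.

```javascript
// Hypothetical helper: splits a __utmz cookie value into readable fields.
function parseUtmz(value) {
  var parts = value.split('.');
  var campaign = {};
  // The campaign string may itself contain dots (e.g. a referring domain).
  parts.slice(4).join('.').split('|').forEach(function (pair) {
    var eq = pair.indexOf('=');
    if (eq > 0) {
      campaign[pair.slice(0, eq)] = decodeURIComponent(pair.slice(eq + 1));
    }
  });
  return {
    domainHash: parts[0],
    setAt: Number(parts[1]),       // Unix time the source was last updated
    visitCount: Number(parts[2]),
    sourceCount: Number(parts[3]),
    source: campaign.utmcsr,
    medium: campaign.utmcmd,
    keyword: campaign.utmctr
  };
}

// Made-up sample value for an organic Google visit.
var sample = '123456789.1315987200.3.2.' +
  'utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=beantin%20seo';
console.log(parseUtmz(sample).source, parseUtmz(sample).keyword);
```

Comparing the parsed source before and after each test visit shows whether the traffic source was updated.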

A bug or a feature? I say bug… what do you think?

Update 20110915

When researching this blog post, I focused my attention on the __utmz cookie. I’ve just taken a closer look at how both __utma and __utmz are behaving in the above scenario.

Google Analytics is failing to update not only the traffic source, but also the visit count and the various timestamps stored in __utma detailing when your last and current visits took place.

This means that even more reports in Google Analytics could be affected (depending on your visitor patterns).
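A sketch of how the __utma values could be compared between two page loads. The six-field layout (domain hash, visitor ID, first visit, previous visit, current visit, visit count) is the commonly described ga.js format and is an assumption here, as are the function names and the sample value.

```javascript
// Hypothetical helpers for comparing two snapshots of the __utma cookie.
function parseUtma(value) {
  var p = value.split('.');
  return {
    domainHash: p[0],
    visitorId: p[1],
    firstVisit: Number(p[2]),     // Unix timestamps, in seconds
    previousVisit: Number(p[3]),
    currentVisit: Number(p[4]),
    visitCount: Number(p[5])
  };
}

// True if the cookie shows that a new visit was registered.
function startedNewVisit(before, after) {
  return after.visitCount > before.visitCount &&
    after.currentVisit > before.currentVisit;
}

// Made-up snapshots taken after visit 3 and after visit 4.
var afterVisit3 = parseUtma('123456789.987654321.1313000000.1315980000.1315987200.7');
var afterVisit4 = parseUtma('123456789.987654321.1313000000.1315980000.1315987200.7');
console.log(startedNewVisit(afterVisit3, afterVisit4)); // false: nothing changed
```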


James Royal-Lawson is a freelance web manager and strategist based in Stockholm, Sweden.

How to track per visit referrer with Google Analytics

Previously I wrote about how traffic sources in Google Analytics perhaps aren’t what you think, mainly due to GA’s attribution of page views, visits & visitors to the latest source. Out-of-the-box Google Analytics doesn’t let you see the full referring page for each individual visit.

Use Custom Variables

It is possible to use a custom filter to see the full referrer, but it’s also possible to collect the URL of the referring site by making use of custom variables and a bit of javascript. With the same technique you can also store the search phrase for those visits that came via a search engine result page.

Google analytics search phrases

The technique described below isn’t 100% accurate (some situations cause the referring URL not to be passed on, such as opening links in new windows in Chrome), but then many aspects of Google Analytics aren’t 100% accurate either, so I don’t think I’m leading you astray.

Step 1: Add _setCustomVar lines to your tracking code

Will Critchlow’s article describing how to implement first touch tracking inspired me to use custom variables to record the referrer URL of each visit as well as any associated keywords.

I re-wrote the _setCustomVar lines in his code to use the new asynchronous format. What the following piece of code does is send the referring URL and keywords to Google if a referrer exists. If no referrer is present it sends “Direct”, so we can track all direct visits too. The “2” at the end of each _setCustomVar tells Google Analytics that it’s a visit-level variable.

It also filters out your own domain, so that your data doesn’t get polluted by people following internal links from one page to another.

This “if” statement needs to be placed in your code after the _setAccount and before the _trackPageview.

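var refurl = document.referrer;

if (refurl != '')
{
  // Skip internal links so your own pages don't overwrite the referrer
  if (refurl.indexOf("://" + document.domain) < 0)
  {
    // Strip the scheme (http:// or https://) before storing the URL
    _gaq.push(['_setCustomVar', 1, 'Ref',
       truncate(refurl.replace(/^https?:\/\//, '')), 2]);
    _gaq.push(['_setCustomVar', 2, 'Qry',
       getkeywords(), 2]);
  }
}
else
{
  // No referrer at all: record the visit as direct
  _gaq.push(['_setCustomVar', 1, 'Ref', 'Direct', 2]);
  _gaq.push(['_setCustomVar', 2, 'Qry', '', 2]);
}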
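The same branching can be written as a function that takes the referrer and domain as arguments, which makes it easy to try out without a browser. This is only a sketch: the function name is made up, truncation is left out for brevity, and the keywords are passed in rather than extracted.

```javascript
// Hypothetical helper: returns the _gaq commands for one page view.
function referrerCommands(referrer, ownDomain, keywords) {
  if (referrer === '') {
    // No referrer: record the visit as direct.
    return [
      ['_setCustomVar', 1, 'Ref', 'Direct', 2],
      ['_setCustomVar', 2, 'Qry', '', 2]
    ];
  }
  if (referrer.indexOf('://' + ownDomain) >= 0) {
    // Internal link: leave the visit-level values untouched.
    return [];
  }
  return [
    ['_setCustomVar', 1, 'Ref', referrer.replace(/^https?:\/\//, ''), 2],
    ['_setCustomVar', 2, 'Qry', keywords, 2]
  ];
}

// In the page itself this would be used roughly as:
//   referrerCommands(document.referrer, document.domain, getkeywords())
//     .forEach(function (command) { _gaq.push(command); });
console.log(referrerCommands('http://503.org.uk/', 'beantin.se', ''));
```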

Step 2: Truncate just in case

As Will mentions in his article, Google Analytics limits the length of the data you can send (including the variable name) to 64 characters – or rather, it ignores anything bigger. So I borrowed his truncate function. I’ve altered it so that we can use three-character variable names, which leaves 61 characters for the value (I thought that single-character variable names were a little too cryptic for my use).

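function truncate(input) {
  var byteLength = 61;
  var encoded = encodeURIComponent(input).substr(0, byteLength);
  // If the cut landed inside a %XX escape, back off until it decodes
  while (encoded) {
    try { return decodeURIComponent(encoded); }
    catch (e) { encoded = encoded.slice(0, -1); }
  }
  return "";
}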

Step 3: Setting the query parameter

As there isn’t a standard parameter for the search query across all search engines, I needed to make a function that could deal with the major ones that use something other than “&q=”. I saved a bit of time by looking at a PHP function for displaying the referring page. It’s easy to add more conditions to catch other search engines if your site receives traffic from one that isn’t captured correctly.

function getkeywords() {
  var x = document.referrer;
  var keywords = "";
  if (x.search(/yahoo/) != -1) {
    keywords = gup("p");   // Yahoo uses p=
  }
  else if (x.search(/digg/) != -1) {
    keywords = gup("s");   // Digg uses s=
  }
  else {
    keywords = gup("q");   // everything else: assume q=
  }
  // Search engines send spaces as "+"
  keywords = truncate(keywords.replace(/\+/g, " "));
  return keywords;
}
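If the list of search engines grows, a lookup table may be easier to maintain than a chain of else-ifs. A minimal sketch, using only the engines already handled above; the names QUERY_PARAMS and queryParamFor are made up for illustration.

```javascript
// Hypothetical table of referrer patterns and their query parameter names.
var QUERY_PARAMS = [
  { pattern: /yahoo/, param: 'p' },
  { pattern: /digg/,  param: 's' }
];

// Returns the name of the parameter that holds the search phrase.
function queryParamFor(referrer) {
  for (var i = 0; i < QUERY_PARAMS.length; i++) {
    if (QUERY_PARAMS[i].pattern.test(referrer)) {
      return QUERY_PARAMS[i].param;
    }
  }
  return 'q'; // default used by most other engines
}

console.log(queryParamFor('http://search.yahoo.com/search?p=beantin')); // "p"
```

Supporting another engine then means adding one row to the table.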

Step 4: Extracting the keywords

In Will’s original First Touch post, he saved the query string unaltered with no tidying up or further parsing. I adapted the code from this article that parses the URL of the current page so that it parses the contents of document.referrer. At the time of writing, Google Analytics has a bug in it which means custom variables get spaces displayed as %20 in reports.

This is the routine that the getkeywords function above calls once we’ve worked out the query parameter.

function gup(name) {
  // Escape square brackets so the name is safe inside a regular expression
  name = name.replace(/[\[]/, "\\[").replace(/[\]]/, "\\]");
  var regexS = "[\\?&]" + name + "=([^&#]*)";
  var regex = new RegExp(regexS);
  var results = regex.exec(document.referrer);
  if (results == null)
    return "";
  else
    return results[1];
}
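For testing outside the browser, here is a sketch of a variant that takes the URL as an argument and also decodes the value. The name queryValue is made up, and decoding the value at this point is my own assumption about what you might want, not part of the routine above.

```javascript
// Hypothetical variant of gup(): reads one parameter from any URL.
function queryValue(url, name) {
  // Escape every regex metacharacter in the parameter name.
  var safeName = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var match = new RegExp('[?&]' + safeName + '=([^&#]*)').exec(url);
  if (match === null) {
    return '';
  }
  try {
    // "+" means a space in query strings; %XX escapes are decoded as well.
    return decodeURIComponent(match[1].replace(/\+/g, ' '));
  } catch (e) {
    return match[1]; // malformed escape: fall back to the raw value
  }
}

console.log(queryValue('http://www.google.com/search?q=beantin+seo&hl=en', 'q'));
// "beantin seo"
```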

Sit back and wait

After a few hours you’ll be able to find some results via custom reports (and perhaps “Visitors -> User defined”) but it can take a few days before results show up under the Custom Variables report.

Once they do start to appear, you should see something similar to that in the picture below.

Google Analytics custom variables

Now you are collecting referrer information on a per visit basis, including whether the visit is direct – as well as all the associated search queries. It should also be relatively straightforward to extend this technique to track other per visit information too, but we’ll save that for another day…

Updated: 2011-01-17

I’ve updated the code above to take into account situations when the referring URL is your own domain.

Explained: Sources in Google Analytics

Looking at various related blog posts, I’ve realised many people don’t fully understand or fully explain how traffic sources are attributed in Google Analytics.

A cookie named __utmz

Let’s get straight to the details about sources…

  • Uses a cookie called __utmz
  • Only gets updated each time the source is different to the source stored in the cookie (excluding direct visits)
  • The utmz cookie lives for 60 days since it was last updated

If you want a full run down then Analytics Market give an excellent and detailed explanation of all the Google Analytics cookies on their blog.

Detail of a screenshot from Google Analytics

All this means that if a visitor reaches your site (irrespective of landing page) via Google, then that visitor (note: visitor, not visit or page view) will have Google attributed as the source for every page they look at across every visit they make to your site. This will be the case until 60 days have passed or the very same visitor comes in from another source (such as a link in a newsletter, a banner, or an AdWords ad).
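The attribution rules described here can be sketched as a small simulation. It follows this post's description literally (direct visits never overwrite the stored source, the cookie is only updated when the source differs, and it lasts 60 days from the last update); the function name and the sample visits are made up.

```javascript
// Cookie lifetime as described in this post.
var CAMPAIGN_LIFETIME_DAYS = 60;

// Hypothetical simulation: returns the source each visit is attributed to.
function attributeVisits(visits) {
  var stored = null; // what the __utmz cookie would hold
  return visits.map(function (visit) {
    if (stored && visit.day - stored.day > CAMPAIGN_LIFETIME_DAYS) {
      stored = null; // cookie has expired
    }
    var differs = !stored || stored.source !== visit.source;
    if (visit.source !== 'direct' && differs) {
      stored = { source: visit.source, day: visit.day };
    }
    return stored ? stored.source : 'direct';
  });
}

var visits = [
  { day: 0,  source: 'google' },
  { day: 3,  source: 'direct' },
  { day: 10, source: 'direct' },
  { day: 20, source: 'newsletter' },
  { day: 25, source: 'direct' },
  { day: 90, source: 'direct' }
];

console.log(attributeVisits(visits));
// [ 'google', 'google', 'google', 'newsletter', 'newsletter', 'direct' ]
```

Only the first visit actually came from Google, yet three visits are reported against it.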

Understanding Google Analytics reports

Make sense so far? The next part is to understand how this affects the way you read various reports in Google Analytics. Take the Top content report for example. Say your top page has had 5000 page views during the past month. Segment those by Source and perhaps 3000 of them are attributed to Google.

The easy conclusion to make is that those 3000 page views are directly attributable to Google; that an organic search in Google for a particular phrase led to the visitor clicking on a particular search result and visiting that page on your site. In old-school log-file-based analytics, then yes, that would be the case (substituting referrer for Source).

Detail of a screenshot from Google Analytics

In Google Analytics the real explanation is that 3000 of the page views were displayed to visitors who had, at some point during the previous 60 days, arrived at your site after searching for something in Google and, if they made any repeat visits, then all of those repeat visits were direct.

Over-estimating the importance of Google

What this means is that unless your traffic consists of one-time visitors and nothing else, there’s a good chance you’ve been over-estimating the importance of Google searches in generating page views. Unless you alter the default setting of the campaign cookie from 60 days to 0, then (apart from new visitors visiting once in the time period you are viewing) you can’t correlate page views/visits with their actual sources.
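In the asynchronous tracking code, the campaign cookie lifetime can be changed with _setCampaignCookieTimeout, which takes a value in milliseconds. A sketch, with a placeholder property ID; as I understand it, 0 makes the cookie last only for the browser session, so check the effect on your own reports before relying on it.

```javascript
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);    // placeholder property ID
// Campaign cookie timeout in milliseconds; 0 is assumed to mean
// "expire when the browser session ends".
_gaq.push(['_setCampaignCookieTimeout', 0]);
_gaq.push(['_trackPageview']);
```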

Understanding per-visit behaviour

Whilst I understand the usefulness of attributing sources for all subsequent direct visits for conversion analysis and goal tracking (it’s useful to know which initial source ultimately led to the conversion), it’s of much less use in understanding the per visit (and thereby more complete) behaviour of your visitors.
