Google Analytics _setVar fixed to not break bounce rate

Great news!  Google have fixed the use of the _setVar method that sets the custom variable in Google Analytics – useful for recording extra dimensions that you might want to segment by, e.g. Premium customers vs Standard customers.

Previously, using this method completely broke the bounce-rate reporting, because Google Analytics counted the call to _setVar as an ‘interaction’.  A bounce happens when the visitor has only one interaction with the site, so even a visitor who viewed just one page would never count as a bounce if _setVar fired: pageview + setVar = 2 interactions.  The following image shows what happened to us the last time we tried to use the _setVar method…

_setVar destroyed our bounce rate metric!
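For reference, the classic ga.js pattern that caused this looked something like the sketch below. It is runnable stand-alone only because of the mock pageTracker object; on a real page the tracker comes from _gat._getTracker, and the “premium” label is just an example value of ours:

```javascript
// Minimal stand-in for the ga.js pageTracker so this sketch runs standalone;
// on a real page it would come from _gat._getTracker("UA-XXXXX-1").
var interactions = [];
var pageTracker = {
  _trackPageview: function () { interactions.push("pageview"); },
  _setVar: function (label) { interactions.push("setVar:" + label); }
};

// The usual pattern: track the pageview, then label the visitor.
pageTracker._trackPageview();   // interaction 1: the pageview
pageTracker._setVar("premium"); // previously counted as interaction 2, so a
                                // one-page visit never registered as a bounce
```

With the fix, the _setVar call no longer counts towards the interaction total, so the one-page visit above is a bounce again.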

Google Analytics Advanced Segments: fuzzy understanding

Something is bugging me.  The respected web analytics expert, blogger and Google Analytics evangelist Avinash Kaushik has disturbed my equilibrium on the subject of Advanced Segmentation in this post.

To cut a long blog-comment ping-pong game short, I drilled down to a single fact/assumption underpinning his analysis that I’m not 100% sure about.  The assertion is most easily explained with an example:

1. A visit to the website occurred 30 days ago from a PPC campaign, “randomCampaign”.  You can set up an advanced segment that constrains your reports by “Campaign = randomCampaign”, so you can see visits to the website from this campaign.  All good stuff.
2. Now imagine that the same visitor decides to visit your website again, but this time, because he knows the URL, he types it into his browser – this visit is therefore “direct”.
3. A couple of days later, we run a report in Google Analytics constrained by the “randomCampaign” segment for a time-span of, say, 5 days.

The methodology Avinash is using rests on the expectation that the report in step 3 includes the visit that occurred in step 2, despite that visit being a “direct” visit and not a “randomCampaign” visit.

My understanding is that the segmentation acts on the visits that occurred in that time period, and that the dimension being segmented by must apply to those visits, not to previous visits outside the date range.

Now we don’t seem to agree on this, but luckily it is something we can test.  So starting today, I’ll begin an experiment to test the above and I’ll report my findings here in a week or so…

==================================

A couple of days later…

So here is the quick test I ran:

1. I cleared all the cookies from my browser.
2. I visited my website from a referral: gatest.somewhere.com.  I made sure I knew which page I visited.
3. The next day, I went to the website again, but this time I typed the URL directly.

My original thinking was that the visit in step 2 would be source=referral, and the visit in step 3 would be source=direct.  Wrong, wrong, wrong!!

Despite the visit in step 3 being a ‘direct’ visit, GA clearly reports the source for both visits as the original source associated with me.  This explains why, when you set up an Advanced Segment – say, for this example, ‘source = gatest.somewhere.com’ – both visits in steps 2 and 3 appear in your segmented reports.  Which validates Avinash (surprise, surprise) and forces me to accept a whole new paradigm…

So, in summary: when you view the ‘direct’ report in GA, the visits that are registered may or may not have been ‘direct’ navigations – all you can be sure of is that the visitor’s very first visit to the website was ‘direct’…  This applies to any of the source dimensions – which poses several interesting questions…
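The attribution rule the experiment exposed can be modelled in a few lines. This is a toy sketch of the observed behaviour, not GA’s actual code (GA keeps the stored source in its cookies), and the function name is my own invention:

```javascript
// Toy model of the attribution behaviour observed above (not GA internals):
// a "direct" visit never overwrites a previously stored source, so later
// direct visits keep reporting the original referral or campaign.
function updateStoredSource(storedSource, visitSource) {
  if (visitSource === "(direct)" && storedSource) {
    return storedSource;   // direct does not overwrite an existing source
  }
  return visitSource;      // a real referral/campaign does overwrite
}

// First visit: the referral is stored against the visitor.
var stored = updateStoredSource(null, "gatest.somewhere.com");

// Next day: a typed-in "direct" visit still reports the original referral.
stored = updateStoredSource(stored, "(direct)");
```

Under this model, both of my test visits end up attributed to gatest.somewhere.com, which is exactly what the segmented report showed.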

Edit: further clarification here at Justin’s Epikone blog on how GA tracks bookmark visits.

Premium version of Google Analytics

Before you get excited, no, there isn’t a premium version of Google Analytics yet.

Edit (29/09/11): They released it yesterday – and according to a couple of sources, it’s going to cost about $150k per year.

But I sincerely hope that they are working on one – I’ve mentioned before that it’ll be great news if Google release a version of Google Analytics that you have to pay for.

So here I’m beginning to compile a list of the things I come across every day that I wish were in Google Analytics, and would be happy to pay for:

In no particular order…

  • Advanced Segmentation: I would love for AS to be enabled for Goal Funnels…
  • “This report is based on sampled data.  Learn more.”  I would really like to not see this message.  It means that the numbers I’m looking at are not the real numbers.  Even if it took a little longer for the report to run, I’d like the ability to see the complete picture and not their “sampled data”.
  • Data backup.   One day it might happen – someone with administrator access on my account accidentally (or maliciously – eek!) deletes my GA account.  Disaster!!!
  • Retrospective filters.  You apply a filter and it acts on your data retrospectively.  Implemented with Advanced Segmentation – thanks!
  • Multiple dimensions.  At the moment you can set a single ‘user defined’ variable, but you cannot add several custom dimensions to slice and dice your data.
  • Annotation.  Like the charts you see on Google Finance or Google Trends, I’d like to be able to annotate my charts with significant events that I know of that might explain certain trends etc.  Done!
  • More AdWords integration.  Sometimes there are campaigns and AdGroups that you see in analytics that confuse you – which ad is that again?  To be able to click through to see the ad text and other details would be nice.
  • On-chart overlay of profiles (or segments, as they might be called).  If you have multiple profiles set up for a single website, each with a different filter applied (you might have one that shows organic search only and another with paid search only), I’d like to be able to take any chart and add data to it from another profile, giving a data series for each profile added.  Implemented with Advanced Segmentation.
  • Related to the previous one – it would help enormously if you could duplicate an existing profile, so that you don’t have to set up your goals etc. all over again…  Implemented with Advanced Segmentation.
  • Ecommerce transactions.  These are a must for any website that sells online.  When you look at a particular report, you can click the Ecommerce tab and see the revenue associated with that referral source, region, etc.  It even tells you the number of transactions associated with that record.  What I’d *love* to be able to do is click on the ‘transactions’ value to arrive at the list of those transactions, so I can identify the products purchased.  I would then be able to quickly answer the question: “what do people who come from X tend to buy the most?”  Instead, what I have to do now is set up an Advanced Segment constraining the report by X in order to see what the transactions were, and whilst this works, it is a very time-consuming process.

Googlebot not following sitemap URLs faithfully

Here’s a little background first.

We have implemented a URL validation step when we process a request, to make sure that people call a page using the correct URL.  If they use an incorrect URL, they are sent a 301 redirect to the correct URL.

The URL in our sitemap is in the format:
http://www.domain.com/index.html?whatever=value

We’ve now had errors showing up in Webmaster Tools, saying that Googlebot is encountering too many redirects on our sitemap URLs.  The problem is that even though we put the correct URL in the sitemap, Googlebot doesn’t use that URL to make the request – it omits the index.html part, contracting the URL down to:
http://www.domain.com/?whatever=value

So our server sees this ‘incorrect’ URL and issues a 301 to the ‘correct’ URL (the one with index.html in it), but Googlebot doesn’t follow that URL faithfully and again requests the URL without index.html in the path.  So our server issues another 301 redirect to the correct URL, and off we go on our infinite loop.
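The loop is easy to reproduce with a sketch of the validation step. The function names are my own invention (the real implementation lives server-side), but the logic matches what’s described above:

```javascript
// Sketch of the URL validation described above (names are hypothetical).
// The canonical form re-inserts "index.html" between the domain and the query.
function canonicalize(url) {
  return url.replace(/^(https?:\/\/[^\/?]+)\/(\?)/, "$1/index.html$2");
}

// If the requested URL is not canonical, answer with a 301 pointing at it.
function handleRequest(requestedUrl) {
  var canonical = canonicalize(requestedUrl);
  return canonical === requestedUrl
    ? { status: 200 }
    : { status: 301, location: canonical };
}

// Googlebot strips index.html again after every redirect, hence the loop:
var response = handleRequest("http://www.domain.com/?whatever=value");
// response.status is 301, response.location has index.html re-inserted –
// which Googlebot then strips once more, and round we go.
```

Requesting the canonical URL (with index.html) gets a plain 200; it’s only the stripped form that bounces between the two parties forever.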

So no wonder we get the error message:
URLs not followed….

contained too many redirects.

I think this is a bug: the 301 clearly sends the redirect URL, and if Googlebot followed it faithfully we wouldn’t see this issue.

Here is the sitemap error in more detail (our actual domain replaced with a pretend one).

HTTP Error:
Found: 301 (Moved permanently)

http://www.domain.com/?param=whatever1
http://www.domain.com/?param=whatever2
http://www.domain.com/?param=whatever3
http://www.domain.com/?param=whatever4
http://www.domain.com/?param=whatever5
Jul 20, 2008

Double-checking the sitemap file, these URLs are in the right format, complete with index.html.

Why does Googlebot strip out index.html?

Google’s Website Optimizer and Ajax

A couple of weeks ago Google launched their Website Optimizer product out of beta – it is now a fully-fledged standalone product (previously you had to use it via an AdWords account). I was playing with it today because I wanted to make sure that we could test with dynamic content.

A typical A/B or multivariate test might take a page portion and then serve up several static variations. Sometimes static variations aren’t good enough though. Most eCommerce websites are database driven and use templates for product pages that are populated with information specific to that product. The template knows what product info to load in because the page might be accessed via a URL with identifiers in the query string: e.g. http://somesite.com/product.html?productid=1234

This product page knows that it has to load up the details for product 1234.

When you want to start doing more complex A/B tests, where the data for your variations also comes from the database, you have a slight problem in that the alternative content for the test is managed in the website optimizer interface – how do you get dynamic content out of your database for your test variations?

To get around this, you can use Ajax to grab the dynamic content relevant to that particular product page, and use the Website Optimizer to simply modify parameters in the Ajax call. This might be implemented by creating four server-side functions that are accessed by Ajax, each returning a variation on the original test content.

In Website Optimizer, when you declare which part of the page you are testing, rather than wrapping the content section, wrap the piece of Javascript that sets which function the Ajax request will call (or Javascript that sets an Ajax parameter):

<script>utmx_section("AjaxSection")</script>
<script>aj_fn = "variation1";</script>
</noscript>

Then, when you proceed through the experiment designer to add new variations, just add:

<script>aj_fn = "variation2";</script>

Where “variation2” is the name of the function the Ajax will call to return the “variation2” content.

Alternatively, as mentioned previously, instead of creating a function per content variation just alter an Ajax parameter so that the function returns different content.
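Tying it together, the page-side glue might look like the sketch below. The /getVariation endpoint and its fn parameter are my own invention, not part of Website Optimizer – the only thing Website Optimizer touches is the one-line script that sets aj_fn:

```javascript
// If the experiment script didn't set aj_fn, fall back to the original content.
var aj_fn = typeof aj_fn !== "undefined" ? aj_fn : "variation1";

// Build the URL for the (hypothetical) server endpoint that returns the
// HTML fragment for the requested variation.
function variationUrl(fn) {
  return "/getVariation?fn=" + encodeURIComponent(fn);
}

// Fetch the variation content and hand it to a callback for insertion
// into the page (plain XHR; era-appropriate, no library assumed).
function loadVariation(fn, done) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", variationUrl(fn));
  xhr.onload = function () { done(xhr.responseText); };
  xhr.send();
}
```

Server-side, /getVariation would route on the fn parameter and pull the product-specific variation data out of the database – which is exactly what keeps the test content dynamic.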

This combination of Website Optimizer and Ajax makes for an extremely powerful technique. It’s pretty easy to implement too.

Google Analytics feature request…

For a free package, you cannot beat Google Analytics. But now surely we are getting to the point where the clever engineers behind the scenes are building a list of new features that will be bundled into a ‘premium’ package, where a subscription fee will be levied.

Personally, I would be over the moon if this were to happen, because then we would be able to request features with more of an expectation that they will take them seriously (not that they don’t now, it’s just that if we paid for it then they would have to take us even *more* seriously).

One of the good things about GA is that they keep your analytics data for a very long time. We’ve had our account with them since 2006, and being able to go back that far to analyse traffic and behaviour is very powerful. Sometimes, though, it would be nice to be able to delete or ignore some data – for instance, one particular institute in Tempe, US, decided to build a bot that executes JavaScript and then crawled all over our site. For the most part we can happily use GA in the knowledge that most spiders don’t execute JavaScript, but this JavaScript-executing bot now appears in my GA data (as GA’s data collection is JavaScript-driven).

So I’ve got this nasty spike of data that I’d just like to be able to select, then hit the ‘ignore forever’ button.

Annoying bot

I guess, that when Google do decide to tap into the thousands of organisations that really want more features and are happy to pay a premium, this would be one of the many features I’d ask for… as well as more Goals, better page-flow analysis, page-rendering-time data, more than one custom dimension, the ability to break out traffic from Google across the country-specific domains, etc etc etc… 🙂

Google Analytics – zero visitors but 30,000 pageviews?

Surely something is wrong here – look at the following graphs: circled in red are the visitors and pageviews for Monday.  How have we got 30,000 pageviews with zero visitors?

Zero visitors, but 30,000 pageviews

Edit: OK – this is me getting too keen to see the data before it is ready.  Apparently the visitors number is updated less frequently than the pageviews number, so it is possible that visitors hasn’t been updated at all for that day while pageviews has.  If this is true, later today I should see the visitors number climb up to normal levels…  Funny – I’ve been using GA for years and this is the first time I’ve noticed this.