Google Analytics Advanced Segments: fuzzy understanding

Something is bugging me. Avinash Kaushik, a respected web analytics expert, blog writer and Google Analytics evangelist, has disturbed my equilibrium on the subject of Advanced Segmentation in this post.

To cut a long blog-comment-ping-pong game short, I drilled down to a single assumption underpinning his analysis that I’m not 100% sure about. The assertion is most easily explained with an example:

1. A visit to the website occurred 30 days ago from a PPC campaign “randomCampaign”. You can set up an advanced segment to constrain your reports by “Campaign=randomCampaign” so you can see visits to the website from this campaign. All good stuff.
2. Now imagine that the same visitor decides to visit your website again, but this time he knows the URL and types it into his browser, so this visit is “direct”.
3. A couple of days later, we run a report in Google Analytics constrained by the “randomCampaign” segment over a time span of, say, 5 days (which covers the visit in step 2 but not the original visit in step 1).

The methodology Avinash is using rests on the expectation that the report in step 3 includes the visit from step 2, despite that visit being a “direct” visit and not a “randomCampaign” visit.

My understanding is that segmentation acts only on the visits that occurred within the report’s date range, and that the dimension being segmented by must apply to those visits themselves, not to previous visits outside the date range.

Now we don’t seem to agree on this, but luckily it is something we can test.  So starting today, I’ll begin an experiment to test the above and I’ll report my findings here in a week or so…


Couple of days later…

So here is the quick test I ran:

1. I cleared all the cookies from my browser.
2. I visited my website via a referral, making sure I knew which page I visited.
3. The next day, I went to the website again, but this time I typed the URL directly.

My original thinking was that the visit in step 2 would be recorded as source=referral and the visit in step 3 as source=direct. Wrong wrong wrong!!

Despite the visit in step 3 being a ‘direct’ visit, GA is clearly reporting the source for both visits as the original source associated with me. This explains why, when you set up an Advanced Segment, say for this example where ‘source =’, both visits in steps 2 and 3 appear in your segmented reports. Which validates Avinash (surprise surprise) and forces me to accept a whole new paradigm…
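To make that concrete, here is a minimal Python sketch of the behaviour my test suggests. It is a toy model, not GA's actual implementation. The names `Visit`, `attribute` and `segment` are made up for illustration, the dates are arbitrary, and I am assuming that a non-direct arrival replaces the remembered source while a direct arrival keeps it. Cookie expiry is ignored.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

DIRECT = "(direct)"  # label used here for visits with no referring source


@dataclass
class Visit:
    day: date
    arrived_via: str            # how the visitor really arrived on this visit
    reported_source: str = ""   # what the report shows under this toy model


def attribute(visits: List[Visit]) -> List[Visit]:
    """Fill in reported_source for one visitor's visits, oldest first."""
    remembered: Optional[str] = None  # stands in for the visitor's stored source
    ordered = sorted(visits, key=lambda v: v.day)
    for v in ordered:
        if v.arrived_via != DIRECT:
            # Assumption: a non-direct arrival replaces the stored source.
            remembered = v.arrived_via
        # A direct arrival keeps whatever source was stored earlier.
        v.reported_source = remembered if remembered is not None else DIRECT
    return ordered


def segment(visits: List[Visit], source: str, start: date, end: date) -> List[Visit]:
    """Visits inside the date range whose reported source matches the segment."""
    return [v for v in visits if start <= v.day <= end and v.reported_source == source]


# Arbitrary example dates: a referral visit, then a typed-in visit the next day.
history = attribute([
    Visit(date(2009, 1, 1), "referral"),
    Visit(date(2009, 1, 2), DIRECT),
])
second_day_only = segment(history, "referral", date(2009, 1, 2), date(2009, 1, 2))
print([(v.day.isoformat(), v.arrived_via, v.reported_source) for v in second_day_only])
```

Even though the date range covers only the second day, the typed-in visit still lands in the ‘referral’ segment, because under this model its reported source is inherited from the earlier visit.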

So in summary: when you view the ‘direct’ report in GA, the visits that are registered may or may not have been ‘direct’ – only their very original visit to the website was ‘direct’…  This applies to any of the source dimensions – which poses several interesting questions…

Edit: further clarification here at Justin’s Epikone blog on how GA tracks bookmark visits.


