In web analytics we usually focus on things that have happened – visits, visitors, pageviews, events, conversions. You generally know what your most popular pages are. You probably have a good idea of the most common paths through your site.
How about taking a look at what isn’t happening?
Pages that receive very few or no visits are almost certainly not doing their job. At best they clutter your navigation and unnecessarily increase choice – distracting your visitor.
At worst they are the symptom of a more complex problem – perhaps something is broken, a serious SEO issue, or the usability is poor.
In Google Analytics it’s quite straight forward to bring up a report showing your “bottom content” by choosing “top content” then clicking on the “pageviews” column to reorder the table to show pages with just 1 pageview first.
A more complex task is producing a list of pages that no-one has visited at all.
A page that receives no visits adds no data in Google Analytics. Not even in your server log files. Graveyard pages. How do you measure those?
To find zero visit pages you need to first create a list of all the URLs for your site. If you’ve got a site with 500 pages or less then you can use http://www.xml-sitemaps.com/ to create a plain text list for you. It’s best to trim the domain name from all the URLs so that it matches the domain-less data you’ll export from Google Analytics.
Then you need to export a list of pages from Google Analytics that have received visits. In Google Analytics, go to “Content”, “Site content” then “All pages”.
Adjust the date range to a suitable time period – depending on the level of traffic to your site, the default of 30 days might be too long or too short.
Make sure you increase the number of rows shown to include all your pages (max 500). Then export the data as CSV.
Depending on your site and the data you’ve extracted from Google Analytics, you might want to tidy up the exported links by removing any unwanted parameters from the end of URLs. (In the example spreadsheet I’ve done this automatically.)
Now here’s the magic. Put both lists of URLs into separate columns in a spreadsheet. Then apply a formula in a third column that finds matches in the two columns of links.
Exactly how that formula looks varies from spreadsheet to spreadsheet. In a Google Docs spreadsheet, presuming you have your list of pages in column A and the list of visited pages in column C, the formula for row one is as follows:
This scary looking formula checks the URL in cell A2 against all the URLs in column C (starting with cell C2). If it gets a match, a “yes” is shown in the cell. If there aren’t any matches found then it outputs a “no”.
To save you a bit of time, I’ve created an example spreadsheet in Google Docs using some data from beantin.se. Use this as a template for your own list. Just replace columns A and C with your own site’s URL lists.
The pages with a “no” have received zero visits during the time period being analysed (as determined by the date range when you exported from Google Analytics). These are your graveyard pages.
If you have a site with more than 500 pages, you’ll need to find another way of creating a list of your site’s URLs. Hopefully you’ve got a sitemap.xml that you can use.
When extracting from Google Analytics, you’ll have to extract your pages in sections making use of content drilldown reports.
Really really dead
One final tip – Select a longer date range when exporting from Google Analytics if you want to find the really really dead pages.
If you have any pages with seasonal content, you probably want to choose a date range that includes that particular date so they don’t accidentally get classed as dead!
— James Royal-Lawson (@beantin) November 8, 2012