
Spam Bots: The Enemy of Analytics

Your traffic is booming! Can you believe it? Traffic is up 30% month over month, with a 40% improvement the previous month.  In a word, your website is killing it.

Except, strangely, almost all your traffic seems to be referral. Or at least referral is now your largest source, where previously it was maybe third or fourth. So what changed? And why are you seeing a site you've never heard of as a referral source? What is THAT?

Spam bots. That’s what it is. To understand them, let’s start with a basic definition.

First off, what’s a bot? A bot is a program developed to perform repetitive tasks (such as crawling) with a high degree of accuracy and speed. In most cases, bots are used to gather data about the web, which can be harmless or even helpful, but they can also commit click fraud, harvest email addresses, scrape content, artificially inflate website traffic, and so on. Different bots do different things, but more and more of the ones we see are of the traffic-inflating kind.

If you’re getting spam referrers, there are ways to deal with the issue, with varying degrees of difficulty, and there are several valid reasons to remove them. Fundamentally, for us on the marketing side of things, spam referrers trash your analytics data. Every metric becomes suspect when they distort your results to this degree, especially on sites with lower overall traffic. It becomes nearly impossible to make marketing decisions based on analytics, because the spam skews bounce rate, traffic sources, referral traffic, time on site, pages per visit… all of it. There are also server load and security concerns. Bots that do hit your site generate unnecessary and unwelcome visits that add load to your server, which can slow page load times, and slower pages can in turn hurt bounce rates and rankings.

Some spam bots can send fake traffic without ever visiting your website. They do that by reproducing the HTTP requests that normally come from your analytics tracking code, using your web property ID. Since they never visit your website, their “visits” are not recorded in your server log, so you can’t stop them at the server level through IP blocking, user agent blocking, referrer blocking, etc.
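
One rough way to tell the two kinds of spam apart is to compare the referrers your analytics reports against the referrers that actually appear in your server log. The sketch below is only an illustration: it assumes an Apache/Nginx "combined" log format, and the function names, sample log lines, and domains are all made up for the example.

```python
import re
from urllib.parse import urlparse

# Assumes the "combined" access log format, where the quoted request line,
# the status code, and the response size are followed by the quoted referrer.
_REFERRER_RE = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)"')


def referrer_hosts_in_log(log_lines):
    """Collect the referrer hostnames that actually appear in the server log."""
    hosts = set()
    for line in log_lines:
        match = _REFERRER_RE.search(line)
        if not match:
            continue
        # A referrer of "-" (no referrer) has no hostname and is skipped.
        host = urlparse(match.group(1)).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def never_touched_server(analytics_referrers, log_lines):
    """Referrers that analytics reports but the server log never saw."""
    seen = referrer_hosts_in_log(log_lines)
    return sorted(h.lower() for h in analytics_referrers if h.lower() not in seen)


# Hypothetical data: two log lines and two referrers reported by analytics.
sample_log = [
    '203.0.113.5 - - [10/Oct/2015:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 '
    '"http://partner.example/blog" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2015:13:56:01 -0700] "GET /about HTTP/1.1" 200 1045 '
    '"-" "Mozilla/5.0"',
]
reported = ["partner.example", "ghost-spam.example"]

ghosts = never_touched_server(reported, sample_log)
print(ghosts)
```

Anything this returns is a candidate for the "never visited" kind of spam, which has to be handled inside analytics rather than on the server. Treat it as a hint rather than proof, since logs rotate and not every real visit carries a referrer.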

You have probably seen unfamiliar domains of this kind pop up in your analytics referral sources.

So what can you do?

  • Filter results in analytics. Google Analytics has a setting to exclude known bots from your results. We now turn this on by default for all our clients and have retroactively enabled it for existing ones. It removes many of the well-behaved bots and spiders but doesn’t catch the real web creepers, so treat it as a good place to start. Go into your GA admin settings (it lives under the view settings) and check “Exclude Known Bots.”
  • Block the rest at the server. Not all bots follow the rules of bot-courtesy (not a real phrase, but it is a fun one I just made up). Some love to creep around the web, grabbing information for questionable purposes, and they require a different tactic. In some cases, like the Semalt crawler, you can go to their website and ask to have your site excluded from their crawler. In many other cases, the last thing you should do is visit the referring site, since that is an invitation to get a virus or Trojan infection on your computer. Do a quick Google search first to see whether you can trust it, then use the .htaccess file to block the offenders at the webserver level.
  • Set up custom filters for your analytics account. Don’t forget to leave an unaltered, filter-free view, then customize the filters on your other views to screen out as much of the bad traffic as possible.
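
As a rough illustration of the last two options, the sketch below turns a single blocklist into both a regex you could paste into an analytics exclude filter and a set of Apache mod_rewrite lines for .htaccess. The blocklist and function names are assumptions made for the example (the second domain is a placeholder), and the rules assume an Apache server with mod_rewrite enabled.

```python
import re

# Hypothetical blocklist. "badbot.example" is a placeholder, not a real offender.
SPAM_DOMAINS = ["semalt.com", "badbot.example"]


def filter_pattern(domains):
    """Build one regex that matches any listed domain, for an exclude filter."""
    return "|".join(re.escape(d) for d in domains)


def is_spam_referrer(referrer_host, domains):
    """True if the host is a listed domain or one of its subdomains."""
    host = referrer_host.lower()
    return any(host == d or host.endswith("." + d) for d in domains)


def htaccess_rules(domains):
    """Generate mod_rewrite lines that answer listed referrers with 403 Forbidden.

    Each RewriteCond is an unanchored, case-insensitive match on the Referer
    header, so it is slightly broader than is_spam_referrer above.
    """
    if not domains:
        # With no conditions, the RewriteRule would block every request.
        raise ValueError("need at least one domain")
    lines = ["RewriteEngine On"]
    for i, domain in enumerate(domains):
        # Chain conditions with OR; the last one closes the chain.
        flags = "[NC,OR]" if i < len(domains) - 1 else "[NC]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {re.escape(domain)} {flags}")
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


print(filter_pattern(SPAM_DOMAINS))
print(htaccess_rules(SPAM_DOMAINS))
```

Keeping one list and generating both outputs from it means the analytics filter and the server rules stay in sync as new offenders show up. The .htaccess half only helps against bots that actually request pages from your server; spam that never visits the site can only be filtered on the analytics side.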

One thing to know: spam is a constantly evolving industry. It’s sad that that’s true, but it is, and it’s something all webmasters need to be aware of. These best practices will change, and they will change quickly. Watch your analytics, stay informed, protect your site, and be vigilant. Those are the real strategies to defeat spam bots.

Want to know more? Request a quote or an analytics baseline from our office so we can get an idea of your analytics needs and help you recover from spam bots.