It’s been a month of traffic from bad bots. Recently my website received thousands of impressions from a bad referral website called clictune, and it took me a while to understand and stop that unwanted traffic from eating my server resources. For the past couple of days I have been seeing something like ‘WeSEE_Bot:we_help_monitize_your_site’ in my Google Analytics report, which claims to be a browser and drives hundreds of visits to my site’s homepage. After googling, I understood that the traffic is being generated by an advertising agency called wesee dot com. I also read a few forums where many website owners reported similar observations.
There are a couple of ways to stop the ‘WeSEE_Bot:we_help_monitize_your_site’ bot from crawling your website. The first method is to disallow crawling via the robots.txt file (but according to a post at CARA, the bot seems to request robots.txt and then continues to crawl the website anyway). The second method is to block it using .htaccess.
Just copy and paste the lines below into your .htaccess file:
BrowserMatchNoCase WeSEE_Bot bad_bot
Order Deny,Allow
Deny from env=bad_bot
To block a bot by its user agent string, look for a part of the string that is unique to that robot, for example ‘WeSEE_Bot’ from ‘WeSEE_Bot:we_help_monitize_your_site’. The first line, ‘BrowserMatchNoCase’, checks whether the user agent string contains ‘WeSEE_Bot’ and, if so, sets an environment variable called ‘bad_bot’.
Note: Since we’ve used ‘BrowserMatchNoCase’, the match ignores case, so it catches the string in uppercase, lowercase, or any mix of the two. If you are particular about the case, you can use ‘BrowserMatch’ instead.
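For comparison, here is a rough sketch of the case-sensitive variant. It only swaps the first directive and keeps the same ‘bad_bot’ variable name, so treat it as an illustration rather than something I have tested against this particular bot:

```apache
# Case-sensitive match: only flags user agents containing exactly "WeSEE_Bot"
BrowserMatch WeSEE_Bot bad_bot
Order Deny,Allow
Deny from env=bad_bot
```

With this version a user agent containing ‘wesee_bot’ in lowercase would not be flagged, which is why the case-insensitive form is usually the safer choice.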
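The ‘Order Deny,Allow’ directive tells the server to evaluate Deny rules before Allow rules, and the last line, ‘Deny from env=bad_bot’, denies any request for which the ‘bad_bot’ variable is set. Let me know if you find a better way to solve this issue.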
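One possible variation, if your server runs Apache 2.4: there, ‘Order’ and ‘Deny’ are only available through mod_access_compat and are deprecated in favour of the ‘Require’ directive. The sketch below shows what the same block might look like in the newer syntax. It assumes mod_setenvif and mod_authz_core are enabled and that your host allows these directives in .htaccess (AllowOverride with FileInfo and AuthConfig), so check your own setup before relying on it:

```apache
# Flag any request whose User-Agent contains "WeSEE_Bot" (case-insensitive)
BrowserMatchNoCase WeSEE_Bot bad_bot

# Allow everyone except requests flagged with the bad_bot variable
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Flagged requests should get a 403 Forbidden response, the same as with the ‘Deny from env=bad_bot’ version.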