Abuse of precious Karavshin isn't limited to unsightly comment spam. Doing vanity searches through my HTTP logs shows that I've been getting hammered by a number of rude guests.
Prime Perp is the NameProtect NPBot. It's a hideous and notorious organism that ignores all codes of civil conduct and regularly downloads my entire website and, by my logs, seems to waste at least 8MB per month.
Its purpose? Paying customers have it search for copyright abuse. Let's see... Does that benefit me? No. Does it benefit my readers or friends? No. Does it benefit NameProtect only? Yes.
Ok, sign the Death Warrant and release the Flying Field Tribunal.
In fighting NameProtect, I also found a list of several other similar monsters. Spreading the love, here are the entries for your own apache .htaccess file:
# nameprotect
Deny from 12.148.209.192/26
Deny from 12.148.196.128/25
# cyveillance.com
deny from 63.148.99.224/27
deny from 65.118.41.192/27
# branddimensions.com user-agent: BDFetch
deny from 204.92.59.0/24
# www.markwatch.com user-agent: markwatch
deny from 204.62.224.0/22
deny from 204.62.228.0/23
deny from 206.190.160.0/19
# rocketinfo.com
deny from 209.167.132.224/28
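If you want to know whether a particular address from your logs is already covered by these ranges, a few lines of Python will tell you. This is only a rough sketch: the ranges are copied from the list above, and the helper name `is_denied` is made up for illustration.

```python
import ipaddress

# CIDR ranges copied from the deny list above.
DENIED = [
    "12.148.209.192/26",   # nameprotect
    "12.148.196.128/25",   # nameprotect
    "63.148.99.224/27",    # cyveillance.com
    "65.118.41.192/27",    # cyveillance.com
    "204.92.59.0/24",      # branddimensions.com
    "204.62.224.0/22",     # www.markwatch.com
    "204.62.228.0/23",     # www.markwatch.com
    "206.190.160.0/19",    # www.markwatch.com
    "209.167.132.224/28",  # rocketinfo.com
]
DENIED_NETWORKS = [ipaddress.ip_network(cidr) for cidr in DENIED]


def is_denied(ip):
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DENIED_NETWORKS)
```

For example, `is_denied("12.148.209.200")` is true because that address sits inside 12.148.209.192/26, while an address outside every range comes back false.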
It didn't stop there. I looked through my logs and saw a few other sites that had gobbled a huge share of my bandwidth. My criterion was simple: go to their website, and if there was no compelling reason for them to be behaving this way, nuke them.
Death Sentences were issued for:
berg.dbsmarketing.net (no web presence at all)
looksmart.com (some stupid search engine)
and
informatike.ilab.sztaki.hu
Now I feel a bit sorry for sztaki. They do appear to be a legitimate organization, but their bot really gobbled down a lot of pages. By my rules, it doesn't benefit me or my friends, and it's acting crassly, so I am banning it. I suppose if someone contacts me with some compelling explanation, I'll exonerate it, but for now I will assume the worst.
Very provisional survival granted to:
Alexa. This goddamned search engine ostensibly is mirroring Karavshin in perpetuity at the same time it is building its search indices. By my logs, and its archives, it seems to be doing a lot more stuff that benefits it than me. If my archive isn't updated soon (February is the last update) this motherfucker is getting the Death Sentence also.
Anyway, this looks like it will be another arms race, similar to the comment-spam issue. Fortunately I have logs generated from webalizer, which will point out rude, piggish guests, and then I'll simply Deny them in my .htaccess file.
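If you don't have webalizer handy, the same kind of "who is hogging my bandwidth" tally can be roughed out in Python. This is a sketch under assumptions, not what webalizer does internally: it assumes the access log is in Apache's Common Log Format (or the combined format, which starts the same way), and the function names are made up for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) (\d+|-)')


def bytes_per_host(lines):
    """Sum the bytes sent to each client host; unparseable lines are skipped."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        host, _status, size = match.groups()
        if size != "-":  # "-" means no body was sent
            totals[host] += int(size)
    return totals


def top_hosts(lines, n=10):
    """Return the n hungriest hosts as (host, total_bytes) pairs."""
    return bytes_per_host(lines).most_common(n)
```

Passing an open log file to `top_hosts` gives a quick list of candidates to look up before deciding whether they deserve a Deny line.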
====
Update: November 1, 2003
I contacted the Hungarian guys and they fixed whatever was the problem with their bot. (They responded within 2 hours of my email, in fact.) Don't ban them.
I just ran
and got several hits.
I think this means my .htaccess file is working. It looks like NPBot sends the http request, but my apache server won't serve it, and it gives up. Now I did notice another IP address (193.29.77.220) that sucked down a lot of data (22MB) in October. If that continues, I'll assume it's malign and ban it, too.
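One way to double-check a ban from the access log (not necessarily the command used above) is to count the status codes handed to requests from the banned ranges. When a Deny rule matches, Apache answers with 403 Forbidden, so a working ban should show 403s rather than 200s. A minimal sketch, assuming the log records client IPs in Common Log Format; `status_counts` is a made-up name and only the two NameProtect ranges are checked by default.

```python
import ipaddress
import re
from collections import Counter

# The two NameProtect ranges from the deny list.
NAMEPROTECT = [
    ipaddress.ip_network("12.148.209.192/26"),
    ipaddress.ip_network("12.148.196.128/25"),
]

# Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def status_counts(lines, networks=NAMEPROTECT):
    """Count HTTP status codes for requests coming from the given networks."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        host, status = match.groups()
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            continue  # the log holds a hostname here, not an IP
        if any(addr in net for net in networks):
            counts[status] += 1
    return counts
```

If the result is all "403" for the period after the rules went in, the bot is knocking and being turned away.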
Posted by Nils Blutig at October 16, 2003 10:14 PM