Comments on: Getting rid of errant HTTP requests
http://www.stdlib.net/~colmmacc/2005/11/24/getting-rid-of-errant-http-requests/

By: Ian Holsman
http://www.stdlib.net/~colmmacc/2005/11/24/getting-rid-of-errant-http-requests/comment-page-1/#comment-23
Fri, 25 Nov 2005 02:10:49 +0000

Hi Colm.
A better approach to the redirect would be to stick the random string inside the URL itself, something like

/thiscgi/$RANDOM/go/away

so that you also catch things which strip out query args.

You might even be able to do this via mod_rewrite, using something like the request time instead of RANDOM, rather than a CGI.
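For illustration, a rough mod_rewrite sketch along those lines (the /thiscgi path, the old feed URL, and the matched User-Agent are all made-up examples, not anything from the original post):

# Hypothetical sketch: redirect a misbehaving client to a trap path that
# embeds the request time, so no CGI is needed to generate a varying string.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "SomeBrokenBot"
RewriteRule ^/old/feed\.rss$ /thiscgi/%{TIME}/go/away [R=302,L]

Because the varying part lives in the path rather than the query string, a client that follows the redirect but strips or rewrites query args still lands on the trap URL.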

Also, remember to put
Disallow: /thiscgi
into your robots.txt file, so you can claim you gave the bots fair warning ;-)
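Spelled out, the robots.txt entry might look like this (again assuming /thiscgi is the hypothetical trap path):

# Warn all well-behaved crawlers off the trap path
User-agent: *
Disallow: /thiscgi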

But from my experience, it doesn’t help. The stupid client bots still keep coming. I’ve had one (from the same IP) coming for over 2 years. Yes, I’ve been feeding him crap for 2 years, and he still keeps on coming. The only solution I know of is to wait until the machine he has his cron job on gets reinstalled.

By: colmmacc
http://www.stdlib.net/~colmmacc/2005/11/24/getting-rid-of-errant-http-requests/comment-page-1/#comment-22
Thu, 24 Nov 2005 22:21:23 +0000

Using CGI will add one extra process to the mix per teergrube, but it’s not going to be too resource-intensive. Unfortunately, the problem of threads and processes getting tied up doesn’t go away with a module, and if the system has problems with process exhaustion, or if MaxClients is being hit, then the long-lived sessions will definitely cause problems.

However, the event MPM (http://httpd.apache.org/docs/2.2/mod/event.html) offers a good solution to this, as it can pool the long-lived connections the way it handles keepalive connections, leaving the ordinary worker threads free for responsive handling of real requests.
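As a very rough illustration of the tuning involved on a 2.2-era httpd built with the event MPM (all numbers here are made-up examples, not the configuration from the post):

# Hypothetical event MPM settings: connections in the keepalive state are
# handed to a dedicated listener thread, so the worker threads configured
# below stay free for real requests.
<IfModule mpm_event_module>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
</IfModule>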

Another solution would be to send a regular httpd a graceful-stop, and then a start, every now and then. The existing connections will be maintained and will continue to be teergrubed, but a new httpd instance with all of MaxClients available will be back. And the admin can renice the old httpd to slow it down even further.
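A rough sketch of that rotation with 2.2-style apachectl (the PID file path is illustrative and depends on the build; an init script would normally handle this):

# Note the PID of the running parent before rotating
OLDPID=$(cat /usr/local/apache2/logs/httpd.pid)
# Stop accepting new connections; existing (teergrubed) ones keep running
apachectl -k graceful-stop
# Start a fresh instance with a full MaxClients pool for real traffic
apachectl -k start
# Optionally slow the old instance down even further
renice -n 19 -p "$OLDPID"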

We weren’t being harvested for email addresses, so at most we had to teergrube a few dozen connections at once.

By: jmason
http://www.stdlib.net/~colmmacc/2005/11/24/getting-rid-of-errant-http-requests/comment-page-1/#comment-21
Thu, 24 Nov 2005 19:29:33 +0000

I tried something similar with a case of referrer spamming a while back.

It’s worth noting that CGIs aren’t a viable tactic for #2 on smaller sites, since keeping that CGI running ties up an Apache thread (as far as I could tell); too many of those, and your site stops responding because all its threads are busy running teergrubes.
