Textpattern CMS support forum
Why is this page so attractive?
It’s not auto-promotion. :) There is a bunch of IP addresses (mainly registered in China) that repeatedly knock at this page of my site. They send a GET, then a POST, and then another GET request, all within a few seconds. They do not look like search robots, since this is the only page they visit, many times a day. These junkies continue even if I set Status to hidden, with only one GET request this time, but more frequently. I am confident in Textpattern and wouldn’t mind if they did not pollute the logs.
What attracts them? Is it the MySQL SELECT query in the text? Has anyone else experienced the same thing?
Re: Why is this page so attractive?
My immediate reaction would be that they are attracted to the comment form, but I have the same problem with http://neme-imca.org/publications/ which has no forms in it.
Oleg, I do agree with you that the main issue is the logs.
Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.
Re: Why is this page so attractive?
Comments are enabled on all pages, so there must be something else they are after. Here is a typical sequence (we care about our visitors’ privacy :)
01 Nov 2012 09:27:01 218.93.xxx.xxx 218.93.xxx.xxx projet/etc/index.php?id=6 GET 200
01 Nov 2012 09:26:59 218.93.xxx.xxx 218.93.xxx.xxx projet/etc/index.php?id=6 POST 200
01 Nov 2012 09:26:58 218.93.xxx.xxx 218.93.xxx.xxx projet/etc/index.php?id=6 GET 200
There is another page (id=3) that contains SELECT ... FROM ... WHERE text; it gets some similar hits too, but far fewer. And the third one (id=10) gets none at all.
Amazing: yesterday I redirected some of these IPs back home by REMOTE_ADDR, and today they have started to use HTTP_X_FORWARDED_FOR. Are they humans?
Re: Why is this page so attractive?
What scares me is that I serve txp_die() with 503 status to these guys, but some of them somehow bypass it and still get a 200 response.
Re: Why is this page so attractive?
Hi Oleg,
Did you check their IPs on StopForumSpam and Project Honey Pot? When an IP is listed in either of those places and it is insistently hitting my site, I just block it with .htaccess.
Yiannis
Re: Why is this page so attractive?
Yiannis, thank you for the links. Yes, they are listed, and will probably end up in .htaccess. But I am curious: how do they bypass txp_die()? I thought the following would stop them:
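// Hook the filter into the 'pretext_end' callback.
register_callback('etc_filter', 'pretext_end');

function etc_filter()
{
    $banned = array(/*bad guys ip*/);
    $ip = remote_addr();

    // Refuse banned visitors before the page is served.
    if (in_array($ip, $banned)) {
        txp_die('Unavailable');
    }
}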
but some (not all) of them still reach their favorite page.
Re: Why is this page so attractive?
remote_addr() supports proxies, and as such it gets its information from an X_FORWARDED_FOR HTTP header if deemed necessary.
Last edited by Gocom (2012-11-01 19:01:39)
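To illustrate the point about the two headers, here is a minimal sketch of a check that looks at every address a request claims, rather than a single one. It is not Textpattern's own logic: etc_candidate_ips() and etc_is_banned() are hypothetical helpers, and the addresses are documentation-range placeholders. Bear in mind that X-Forwarded-For is supplied by the client, so it is only useful for catching extra matches, never for trusting a visitor.

```php
<?php
// Hypothetical helper: collect every address a request claims to come from.
// $server is expected to have the same shape as $_SERVER.
function etc_candidate_ips(array $server): array
{
    $ips = array();

    if (!empty($server['REMOTE_ADDR'])) {
        $ips[] = trim($server['REMOTE_ADDR']);
    }

    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may carry a comma-separated chain of addresses.
        foreach (explode(',', $server['HTTP_X_FORWARDED_FOR']) as $ip) {
            $ips[] = trim($ip);
        }
    }

    // Drop empty entries and duplicates, then reindex from zero.
    return array_values(array_unique(array_filter($ips, 'strlen')));
}

// Hypothetical helper: true if any claimed address is in the ban list.
function etc_is_banned(array $server, array $banned): bool
{
    return count(array_intersect(etc_candidate_ips($server), $banned)) > 0;
}

// Example request data (placeholder addresses, not taken from real logs).
$request = array(
    'REMOTE_ADDR'          => '198.51.100.7',
    'HTTP_X_FORWARDED_FOR' => '192.0.2.10, 203.0.113.5',
);
```

Inside a filter like the one above, one could call etc_is_banned($_SERVER, $banned) instead of comparing a single address.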
Re: Why is this page so attractive?
Yes, but what troubles me is that some IP addresses listed in the “bad guys ip” array above still appear in my (txp and Apache) logs with a 200 response status. I guess the log hit function uses the same remote_addr(), so how can such an IP bypass my filter? If I set the article status to “hidden”, it gets a 404, as it should.
Re: Why is this page so attractive?
False alarm, sorry: it is rather a little issue with the txp log tab. For some reason, the IP addresses shown there contain an invisible character after each dot (Firefox on Windows 7). Since I had copied them into my “bad guys” array from the txp log tab, they did not match the actual IPs because of these invisible characters.
Edit: more precisely, it happens for the IP in the “Host” column.
Last edited by etc (2012-11-02 12:58:49)
Re: Why is this page so attractive?
^^ That’s due to the use of zero-width space (ZWSP) characters to enable wrapping of long words. I think I’m to blame for them occurring in that column. Perhaps it would be better to remove them there. I’ve had the exact same problem copying the IP number/hostname. It takes too much time to figure out that something isn’t working because of those ZWSP characters.
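Until the column is cleaned up, a copied address can be sanitised before it goes into a ban list. This is only a sketch: etc_clean_ip() is a hypothetical helper, the set of stripped characters is my assumption about what might sneak in, and the address is a documentation-range placeholder.

```php
<?php
// Hypothetical helper: strip invisible characters that can ride along when
// an IP is copied from a web page: soft hyphen (U+00AD), zero-width space,
// non-joiner and joiner (U+200B to U+200D), word joiner (U+2060), BOM (U+FEFF).
function etc_clean_ip(string $ip): string
{
    $clean = preg_replace('/[\x{00AD}\x{200B}-\x{200D}\x{2060}\x{FEFF}]/u', '', $ip);

    // preg_replace() returns null on failure (e.g. invalid UTF-8);
    // in that case fall back to the untouched input.
    return trim($clean ?? $ip);
}

// A placeholder address with a zero-width space after each dot.
$copied = "192.\u{200B}0.\u{200B}2.\u{200B}10";

// Clean the whole ban list once, then compare as usual.
$banned = array_map('etc_clean_ip', array($copied));
var_dump(in_array('192.0.2.10', $banned, true));
```

Comparing strlen($copied) with strlen(etc_clean_ip($copied)) is also a quick way to spot that something invisible was in the string.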