Textpattern CMS support forum
#31 2005-02-05 02:25:26
- zem
- Developer Emeritus

- From: Melbourne, Australia
- Registered: 2004-04-08
- Posts: 2,579
Re: Automatic Referral spam Blocking.
> What is this, “key challenge” ? Is this something user-input based?
Nope. It’s a trap (one of several) designed to automatically detect referrer spambots. No user or admin intervention required.
> Also, from that log, it looks as though you’re blacklisting IPs? Why? Those change way too frequently.
1. The blacklist is short term only (6 hours by default), and requires multiple entries before an IP is blocked.
2. Blocking based on URLs is dangerous – a malicious user could deliberately refer-spam someone else’s URL, and cause legitimate traffic to be blocked.
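A minimal sketch of the short-term, multi-strike blacklist described in point 1. This is not Dereference's code: the class name, the three-strike threshold and the injectable clock are assumptions made for illustration. Only the 6-hour default comes from the post.

```python
import time


class ShortTermBlacklist:
    """Block an IP only after several strikes, and only for a limited time."""

    def __init__(self, ttl_seconds=6 * 60 * 60, strikes_to_block=3, clock=time.time):
        self.ttl_seconds = ttl_seconds            # 6 hours, the default quoted above
        self.strikes_to_block = strikes_to_block  # assumed value, not from the post
        self.clock = clock                        # injectable so expiry can be tested
        self._strikes = {}                        # ip -> list of strike timestamps

    def _fresh(self, ip):
        """Drop strikes older than the TTL and return the ones that remain."""
        cutoff = self.clock() - self.ttl_seconds
        fresh = [t for t in self._strikes.get(ip, []) if t > cutoff]
        if fresh:
            self._strikes[ip] = fresh
        else:
            self._strikes.pop(ip, None)
        return fresh

    def record_strike(self, ip):
        """Record one suspicious request from this IP."""
        self._fresh(ip)
        self._strikes.setdefault(ip, []).append(self.clock())

    def is_blocked(self, ip):
        """An IP is blocked only while it has enough unexpired strikes."""
        return len(self._fresh(ip)) >= self.strikes_to_block
```

Because strikes expire on their own, an address that is later reassigned to a legitimate visitor stops being blocked without anyone editing a list.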
> Defer is mostly automatic, but like any automatic “machine”, it will need adjusting occasionally to catch common strings.
Mine (“Dereference”) is set-and-forget. No intervention required after installation.
> I’m ultimately aiming for a method to cut down on bandwidth/processing, and the initial check is whether anything in the referring agent or get/post matches what is in the list and then promptly issuing a 404/exit()
404 sounds like a risky choice. There are more appropriate error messages.
> I welcome any input zem may offer as this ultimately is not for (at least for me) any monetary good, but a tool to combat spam.
Spam is economics. It’s all about money.
Alex
Offline
Re: Automatic Referral spam Blocking.
> 2. Blocking based on URLs is dangerous – a malicious user could
> deliberately refer-spam someone else’s URL, and cause legitimate
> traffic to be blocked.
I anticipated this and am planning an exception list (e.g. Google, the Google crawler, Yahoo Slurp, etc.)
> 404 sounds like a risky choice. There are more appropriate error
> messages.
Explain. How exactly is it a risky choice?
> Spam is economics. It’s all about money.
This we’ll disagree on. I’m inclined to believe that you’re trying to thwart any effect a free alternative will have vs. ransomware.
What happens when your traps are thwarted? Are you going to come out with a new version and charge for that? I’m aiming for something that can expand as necessary (i.e., keywords).
Online
#33 2005-02-05 02:55:09
- zem
- Developer Emeritus

- From: Melbourne, Australia
- Registered: 2004-04-08
- Posts: 2,579
Re: Automatic Referral spam Blocking.
> I anticipated this and am planning an exception list (e.g. Google, the Google crawler, Yahoo Slurp, etc.)
What about when the Evil Right Wing Blogs start refer-spamming the URLs of Evil Left Wing Blogs, and vice versa? Your exception list could get rather long.
> Explain. How exactly is it a risky choice?
Legitimate bots and user agents sometimes take action based on HTTP status codes. If you accidentally serve up a 404 to a search spider or blog indexer that’s not on your list, you could wind up de-listing yourself from a search engine or whatever. Or fooling a link checker bot, or misleading a user. Better to serve up an informative error so that legitimate bots and users can take appropriate action.
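To illustrate the point about status codes, here is a minimal sketch of an "informative error" response. The post does not name a specific code. Using 403 Forbidden is an assumption, as are the function name and the contact address, which are hypothetical.

```python
from http import HTTPStatus


def build_block_response(reason, contact="webmaster@example.com"):
    """Build a refusal that says it is a refusal, not a missing page.

    Returns (status_code, headers, body). 403 is an assumed choice; the
    contact address is a placeholder.
    """
    status = HTTPStatus.FORBIDDEN
    body = (
        f"{status.value} {status.phrase}\n"
        f"This request was refused: {reason}.\n"
        f"If you believe this is a mistake, contact {contact}.\n"
    )
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return status.value, headers, body
```

A client that receives this can tell the page exists but access was refused, which is a different signal from a 404 saying the page is gone.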
> This we’ll disagree on. I’m inclined to believe that you’re trying to thwart any effect a free alternative will have vs. ransomware.
Exactly what am I thwarting, and how?
Last edited by zem (2005-02-05 03:00:25)
Alex
Offline
Re: Automatic Referral spam Blocking.
> Legitimate bots and user agents sometimes take action based on HTTP status codes. If you accidentally serve up a 404 to a search spider or blog indexer that’s not on your list, you could wind up de-listing yourself from a search engine or whatever. Or fooling a link checker bot, or misleading a user. Better to serve up an informative error so that legitimate bots and users can take appropriate action.
Defer only acts when there is a referrer. Google/spiders/crawlers don’t spider websites using referrers. Delisting? That’s a silly thought (FUD).
The possibility that Defer may throw a false spam positive has been taken into consideration: every 404 it serves displays text stating why the request was blocked and how to correct it.
I’m not going to believe that any simple trap is going to catch a spammer; anything popular enough will be exploited and a weakness will be found. Your traps will be effective for only a short period of time before someone smarter than you figures out how to exploit or bypass them.
The most effective means has proven to be keywords and heuristics. Of course, for $ reasons, you will disagree with me. Can’t say I think too highly of your offering “deference” as ransomware.
Online
#35 2005-02-05 03:37:09
- zem
- Developer Emeritus

- From: Melbourne, Australia
- Registered: 2004-04-08
- Posts: 2,579
Re: Automatic Referral spam Blocking.
> Defer only acts when there is a referrer. Google/spiders/crawlers don’t spider websites using referrers. Delisting? That’s a silly thought (FUD)
blo.gs does. Also some link checkers, RSS aggregators, etc.
> I’m not going to believe that any simple trap is going to catch a spammer, anything popular enough will be exploited and a weakness will be found. Your traps will be effective for only such a short period of time before someone smarter than you figures out how to exploit it, or bypass it.
Right, cos I hadn’t thought of that.
> The most effective means has proven to be keywords and heuristics,
How many successful keyword based spam filters or web content filters are there?
Alex
Offline
#36 2005-02-05 03:40:54
- zem
- Developer Emeritus

- From: Melbourne, Australia
- Registered: 2004-04-08
- Posts: 2,579
Re: Automatic Referral spam Blocking.
> zem wrote:
> > Defer only acts when there is a referrer. Google/spiders/crawlers don’t spider websites using referrers. Delisting? That’s a silly thought (FUD)
blo.gs does. Also some link checkers, RSS aggregators, etc.
> I’m not going to believe that any simple trap is going to catch a spammer, anything popular enough will be exploited and a weakness will be found. Your traps will be effective for only such a short period of time before someone smarter than you figures out how to exploit it, or bypass it.
Right, cos I hadn’t thought of that.
> The most effective means has proven to be keywords and heuristics,
How many successful keyword based spam filters or web content filters are there? (“heuristics” is a bit misleading – any kind of filter could be described as heuristic)
Last edited by zem (2005-02-05 03:41:21)
Alex
Offline
Re: Automatic Referral spam Blocking.
One other thing you might try, if it comes to that, is reverse DNS lookups on the referring domain; a lot of times the domain doesn’t exist at the time of the spamming, and is only put up later.
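A minimal sketch of that check. It reads the suggestion as "does the referring hostname resolve at all", which is a forward DNS lookup. That reading, the function name and the injectable `resolver` parameter are all assumptions, not something from an existing plugin.

```python
import socket
from urllib.parse import urlsplit


def referrer_domain_resolves(referrer_url, resolver=socket.getaddrinfo):
    """Return True if the hostname in the Referer URL currently resolves.

    `resolver` defaults to socket.getaddrinfo and is injectable so the
    check can be exercised without touching the network.
    """
    host = urlsplit(referrer_url).hostname
    if not host:
        return False  # empty or malformed referrer: nothing to resolve
    try:
        resolver(host, None)
    except (socket.gaierror, UnicodeError):
        return False  # the domain does not exist (yet) or is not a valid name
    return True
```

A failed lookup is only a hint, since DNS can fail for ordinary reasons too, so it probably belongs as one signal among several rather than as a verdict.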
You cooin’ with my bird?
Offline
#38 2005-02-05 08:39:18
- Andrew
- Plugin Author

- Registered: 2004-02-23
- Posts: 730
Re: Automatic Referral spam Blocking.
Is anyone else bothered by TheEric’s constant badgering of zem? I am. Alex has contributed greatly to the Textpattern community, and while this is a good debate, your insistence on callous remarks about ‘ransomware’ has little to do with it. I for one would appreciate it if you’d stay on-topic rather than try to be a defamer.
Offline
Re: Automatic Referral spam Blocking.
I have no problem with TheEric making a positive contribution. He does seem to have an issue with ransomware.
Zem has contributed GREATLY to both this forum and the txp community. (BTW, thanks zem)
I would like to see the current discussion continue, without the snide remarks about ransomware (as it is a separate issue entirely).
Ben.
Life is what you make it… if nothing changes, nothing changes.
Web hosting http://dynamicwebhosting.com.au/
Web dev & marketing http://wallishamilton.com/
Offline
Re: Automatic Referral spam Blocking.
TheEric: keywords lead to the whack-a-mole game (which is why when I pointed out one such solution I appended a comment about how it can’t possibly scale); you ban ‘viagra’ and start seeing ‘v14gr4’. As for heuristics I’m not sure what you’re talking about; are you wanting something like Bayesian referer filtering (which probably wouldn’t be effective in this situation for a couple of reasons)?
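The whack-a-mole problem can be sketched in a few lines. The banned list, the substitution table and the function names are made up for illustration and are not taken from Defer or any other plugin.

```python
BANNED = {"viagra"}

# A small, assumed table of common digit-for-letter swaps.
LEET = str.maketrans({"1": "i", "4": "a", "3": "e", "0": "o", "5": "s"})


def naive_keyword_match(referrer, banned=BANNED):
    """Plain substring match against the banned list."""
    text = referrer.lower()
    return any(word in text for word in banned)


def normalized_keyword_match(referrer, banned=BANNED):
    """Same match, after undoing the digit swaps we happen to know about."""
    text = referrer.lower().translate(LEET)
    return any(word in text for word in banned)
```

The naive match catches `cheap-viagra` but misses `cheap-v14gr4`. The normalized match catches both, yet still misses `v-i-a-g-r-a`, so every new spelling needs another rule.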
And ditto the comments about laying off the ransomware; it’s actually very much in tune with the spirit of Free software (i.e., rather than paying for each and every copy of the software, you pay the programmer once to write it and then it’s available freely).
Last edited by ubernostrum (2005-02-05 13:43:03)
You cooin’ with my bird?
Offline
Re: Automatic Referral spam Blocking.
To TheEric and Zem: never mind the antagonism, shouldn’t you two take this to e-mail rather than spill your guts here in a manner that will be indexed by Google for the future research efforts of your opponents?
You know, the spammers?
I’m thankful you are both working on this problem. But this is not a competition. Not from our point of view.
TextPattern user since 04/04/04
Offline
#42 2005-02-05 16:58:42
- Andrew
- Plugin Author

- Registered: 2004-02-23
- Posts: 730
Re: Automatic Referral spam Blocking.
To the death, I say, TO THE DEATH! There’s gotta be a few extra poking sticks around here somewhere…
Offline
Re: Automatic Referral spam Blocking.
reid: I’m of the opinion that anti-spam techniques are like cryptography; both will benefit more from open discussion than from excessive secrecy. And, really, I’d rather have a solution that will work even when the spammers know about it than have an arms race of “secret weapons” where the current generation becomes obsolete as soon as somebody on the other side finds out about it.
You cooin’ with my bird?
Offline
#44 2005-02-06 00:39:38
- zem
- Developer Emeritus

- From: Melbourne, Australia
- Registered: 2004-04-08
- Posts: 2,579
Re: Automatic Referral spam Blocking.
> reid: I’m of the opinion that anti-spam techniques are like cryptography; both will benefit more from open discussion than from excessive secrecy. And, really, I’d rather have a solution that will work even when the spammers know about it than have an arms race of “secret weapons” where the current generation becomes obsolete as soon as somebody on the other side finds out about it.
That’s an important principle, but such security is only possible when the entire protocol has been designed that way from the ground up. The current state of spam (particularly referrer spam) is fought on the spammers’ terms, not ours.
The necessary conditions for building an anti-spam mechanism based on long-term security do not yet exist, and perhaps never will.
All we can do for now is ensure that we set the terms, rather than continuing to fight on theirs. (That’s why I think the keyword approach will fail: the spammers want us to fight that battle, because they know they can win – they already have, in email).
Alex
Offline
Re: Automatic Referral spam Blocking.
No automatic detection (keyword or trap) will succeed 100%.
Trap vulnerabilities will be discovered and thwarted.
Keywords will be changed and yes, whack-a-mole would develop if that were the sole method.
The only things that will remain constant are that:
- Spammers will not play nice. They will not just hit you once. They’re greedy lil snots. Ethics and morals aren’t in their vocabulary.
- Spammers don’t have time. They are in a rush. They will not visit more than one page at a time on your website, as it costs them links per hour.
- Spammers will hit as many websites as possible.
- Spamming is automatic; spending “real user” time on a website costs them time.
Assuming most of these hold, the rate at which they get through and sap bandwidth (my primary concern) can be controlled. It won’t be perfect, but it will help.
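One way to picture "controlling the rate" is a per-IP sliding window over requests that carry an external referrer. This is only a sketch under that assumption, not how Defer works. The class name, the limit of 5 hits and the 60-second window are hypothetical.

```python
from collections import defaultdict, deque


class ReferrerRateLimiter:
    """Allow at most `max_hits` referred requests per IP within a time window."""

    def __init__(self, max_hits=5, window_seconds=60):
        self.max_hits = max_hits              # assumed limit
        self.window_seconds = window_seconds  # assumed window
        self._hits = defaultdict(deque)       # ip -> timestamps of allowed hits

    def allow(self, ip, now):
        """Return True if this request may proceed, False if it should be refused."""
        hits = self._hits[ip]
        # Forget hits that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False
        hits.append(now)
        return True
```

A bot that hammers the site with referred requests runs into the limit quickly, while a real visitor following a single link stays well under it.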
Using my version (still not stable enough for release), I’ve cut the spam traffic to my website by a significant margin. I am not naive enough to think I can stop every single spammer, but I can cut it to a dull throb.
—
It would be nice if a hunting season could be declared on spammers :)
Last edited by TheEric (2005-02-06 01:23:45)
Online