Textpattern CMS support forum
Image leeching and splogging deterrence
Is this a good reason not to offer an RSS feed? (In the useless effort to bring that trend back.)
Last edited by Destry (2020-10-15 11:41:07)
Re: Image leeching and splogging deterrence
It is not really a reason not to offer an RSS feed. What many people do is offer an excerpt-only feed, with a link at the end to the permanent location of the article. I even suspect it is the default for WP sites, though I’m not sure.
(BTW, I am sure you noticed, but that article is rather old…)
Where is that emoji for a solar powered submarine when you need it ?
Sand space – admin theme for Textpattern
Re: Image leeching and splogging deterrence
What I was gleaning there was that garbage sites like the one he was describing, ‘Google Chrome Browser’, exist only to scrape content published somewhere else, and that they were using RSS feeds to do it (though I doubt that’s the only way scrapers work; I don’t know).
So my concern is that of a content owner not wanting to be scraped and leeched on, and whether what the author was describing (sending bogus images to the leech) is at least one trick to make that site look like the leechy suckface it is.
The next question being, how do you do that on Apache, er, in the context of txp? Or is it even worth bothering with, even if one is not inclined to let leeches suck freely?
Last edited by Destry (2020-10-15 12:21:09)
Re: Image leeching and splogging deterrence
Destry wrote #326414:
The next question being, how do you do that on Apache, er, in the context of txp? Or is it even worth bothering with, even if one is not inclined to let leeches suck freely?
I don’t know a way of stopping them – especially those bastards with more-than-average motivation, but you can slow them down.
Hotlinking is largely a solved problem – you can restrict your server to only serve images to a given subset of IPs or domains, and user agents at a push…but you’re effectively knackering RSS readers at that point. There are far fewer than there used to be, granted, but I love my NetNewsWire, and I’ll defend RSS as a delivery medium. You can do various hotlinking prevention things in .htaccess, so search for ‘apache hotlinking’ and get a daytime beverage for some r&d. That thing you talk about with image replacement is fine, but bear in mind that’s an image being served up from your server…so you’re still serving the traffic in a fashion. Better to stop it at the web server level (403, 404, whatever – choose your weapon).
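As a rough illustration of the referer check that most ‘apache hotlinking’ recipes boil down to, here is the same decision sketched in Python rather than .htaccess. The host names and the `is_hotlink` helper are made up for the example, and letting requests with no referer through is an assumption meant to keep RSS readers and direct visits working.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts that are allowed to embed our images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True when an image request should be refused (e.g. with a 403)."""
    if not referer:
        # No Referer header: feed readers and direct visits. Assumed OK here.
        return False
    host = urlparse(referer).hostname
    # An unparsable referer has no host, so treat it as foreign.
    return host is None or host.lower() not in allowed_hosts
```

The Referer header is trivially spoofed, so this only deters lazy embedding; it does nothing against a scraper that downloads the images and re-hosts them.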
You could possibly do client rate limiting with Apache, though I don’t touch it these days so I can’t say for certain. Likewise, a web application firewall (WAF) like mod_security or similar might have some stuff in it to slow clients down. Typical WAF config is quite complex, so upgrade that beverage to an evening or extracurricular one and be ready to bail.
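To make ‘client rate limiting’ a bit more concrete, below is a minimal token bucket sketch in Python. It is not Apache or mod_security configuration, just the general idea such tools tend to implement; the `TokenBucket` class and its parameters are hypothetical names chosen for the example.

```python
import time


class TokenBucket:
    """Allow short bursts, then hold one client to a steady request rate."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # tokens added per second
        self.burst = burst    # maximum tokens a client can save up
        self.clock = clock    # injectable so the bucket can be tested
        self.tokens = float(burst)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would answer with a 429 or similar
```

In practice you would keep one bucket per client IP (a dict keyed by address, say) and refuse or delay any request for which `allow()` returns False.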
When you make stuff worth stealing, it’ll be stolen. Find peace with that, if you can.
Re: Image leeching and splogging deterrence
PS: scrapers won’t give a hoot about RSS/Atom/whatever, they’re just weaponised curl bots or headless browsers – the endpoint delivery doesn’t matter.
Re: Image leeching and splogging deterrence
Destry wrote #326414:
Or is it even worth bothering with, even if one is not inclined to let leeches suck freely?
Do you think those sites get real human visitors? Most of their visitors are probably bots or poor pay-to-click temp jobs.
And as Pete says, most content fetching is done via fake browsers (wrappers for curl). And most of those “sites” are probably only up for a few months at most.
Re: Image leeching and splogging deterrence
phiw13 wrote #326419:
Do you think those sites get real human visitors? Most of their visitors are probably bots or poor pay-to-click temp jobs.
And as Pete says, most content fetching is done via fake browsers (wrappers for curl). And most of those “sites” are probably only up for a few months at most.
If they are so transitory, how do they get above the real blog in SERPs and get 25k Twitter followers? If that’s the case, these sites will grow :-( It’s not nice being ripped off, but if the content is seen by more people, at least that’s some satisfaction, even if you don’t get any credit for it.
I transcribed Home-life of the Lancashire Factory Folk during the Cotton Famine by Edwin Waugh in 2002 or 2003 from a rare book in my local records office, put it online on my family history site, and also put it on Project Gutenberg. Do a search for the title now and you’ll find it’s been made into books and published online in various places, but you won’t find my site or any acknowledgement of my transcription, because the thieves were better at SEO than me. It’s a slightly different case because it was not my content, only my transcription, but I spent many hours doing it, so I am a little bit miffed that 99.9% of people will never know I made it public. Still, it’s a great book, and it’s nice to think many people have now read it who would otherwise probably never have known about it.
Many great things have been made by Anon, but unfortunately the thieves of this world rip them off and make money from them…
Re: Image leeching and splogging deterrence
zero wrote #326432:
I transcribed Home-life of the Lancashire Factory Folk during the Cotton Famine by Edwin Waugh in 2002 or 2003 from a rare book in my local records office, and put it online on my family history site and I also put it on Project Gutenberg.
It sucks to be ripped off like that. I have added three books on the history of my home town in Italy. I suppose one day those might be ripped off, which would irritate me big time.
By the way did you use TxP to build your site?
…. texted postive
Re: Image leeching and splogging deterrence
bici wrote #326445:
It sucks to be ripped off like that. I have added three books on the history of my home town in Italy. I suppose one day those might be ripped off, which would irritate me big time.
By the way did you use TxP to build your site?
Yes, always Textpattern. Your books too will be on Amazon before long, bici, it’s the way of the world.
Re: Image leeching and splogging deterrence
zero wrote #326447:
Yes, always Textpattern.
Amazing site, Peter. But something has happened to the Newer/Older links; they point to the current page.