Textpattern CMS support forum

#1 2017-08-31 06:18:07

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,054

htaccess hotlink protection

I am trying to set up hotlink protection for images while still allowing search engines to index them. The code below does not seem to work: it blocks images everywhere. Can someone tell me what I might be doing wrong?

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !.*a9.*$  [NC]
RewriteCond %{HTTP_REFERER} !.*altavista.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*ask.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*duckduckgo.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*google.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*yahoo.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*bing.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*baidu.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*excite.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*faganfinder.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*ixquick.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*microsofttranslator.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*msn.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*picsearch.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*tineye.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*wolframalpha.*$ [NC]
RewriteCond %{HTTP_REFERER} !.*yandex.*  [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://([^.]+\.)?goo\.ne\.jp/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://([^.]+\.)?archive\.org/.*$ [NC]
RewriteRule .*\.(gif|jpg|png|svg)$ https://lh3.googleusercontent.com/-KDvIl3r2wdM/AAAAAAAAAAI/AAAAAAAAAAA/fZTmVSnqmxA/s120-c/photo.jpg [R,NC,L]
</ifModule>

Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

#2 2017-08-31 12:54:59

makss
Plugin Author
From: Ukraine
Registered: 2008-10-21
Posts: 355

Re: htaccess hotlink protection

Add an exception for your own domain; without it, requests coming from your own pages are blocked as well.

Minimal hotlink protection example:

RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.com(/|$)   [NC]
RewriteRule \.(jpg|jpeg|jpe|png|gif|svg)$ - [NC,F,L]

Last edited by makss (2017-08-31 12:55:32)


aks_cron : Cron inside Textpattern | aks_article : extended article_custom tag
aks_cache : cache for TxP | aks_dragdrop : Drag&Drop categories (article, link, image, file)

#3 2017-08-31 13:11:11

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,054

Re: htaccess hotlink protection

Thanks so much, makss. The problem is domains like Google, with their numerous subdomains and extensions. I was looking for a wildcard that would cover all those possibilities. Would you know how to go about it for those?
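
One way such a wildcard could look is sketched below. It is only an illustration: example.com is a placeholder for your own domain, the list of search engines is an arbitrary sample, and the rule answers with 403 rather than redirecting to a replacement image.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Let requests with an empty referrer through (direct visits, privacy tools).
RewriteCond %{HTTP_REFERER} !^$
# Let your own site through. example.com is a placeholder (assumption).
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?example\.com(/|$) [NC]
# Let google.<anything> and its subdomains through, e.g. images.google.co.uk.
# Caveat: this also matches hosts such as google.evil.example.
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?google\.[a-z.]{2,}(/|$) [NC]
# The same idea for a few other engines, grouped in one alternation.
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?(bing|yahoo|duckduckgo|yandex)\.[a-z.]{2,}(/|$) [NC]
# Every other referrer gets 403 Forbidden for image files.
RewriteRule \.(gif|jpe?g|png|svg)$ - [F,NC]
</IfModule>
```

Anchoring each pattern to the host part of the referrer avoids accidental matches elsewhere in the URL, which unanchored patterns such as .*ask.* allow. The Referer header is easy to forge, so this only deters casual hotlinking.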


Yiannis

#4 2017-08-31 13:51:06

makss
Plugin Author
From: Ukraine
Registered: 2008-10-21
Posts: 355

Re: htaccess hotlink protection

There is no complete list of “good domains”. It may be better to go the other way: block only the worst offenders and allow everyone else to hotlink. You can periodically review the web server log and block unwanted domains.
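
A minimal sketch of that blocklist approach follows. The two domains are hypothetical placeholders for whatever offenders turn up in your own access log.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Hypothetical offenders (assumption); replace with domains from your log.
# [OR] chains the conditions so that any single match triggers the rule.
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?badsite-one\.example(/|$) [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?badsite-two\.example(/|$) [NC]
# Only the listed referrers are refused; everyone else may hotlink.
RewriteRule \.(gif|jpe?g|png|svg)$ - [F,NC]
</IfModule>
```

Because nothing is blocked by default, search engines and your own pages need no exceptions here.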


#5 2017-09-23 21:16:34

ragnar
New Member
Registered: 2017-09-23
Posts: 2

Re: htaccess hotlink protection

Hello guys, I'm new here, so I don't really want to start a new thread. My question is very similar to the original one, but I need to block backlink bots like SEMrush and Ahrefs, etc., from scanning my site. I have the following code:

#get rid of bad bots
RewriteEngine on
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} rogerbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} exabot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} dotbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} semrushbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gigabot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC]
RewriteRule .* - [R=403,L] 

This should work, but the bots still get past it. I don't know whether my syntax is incorrect or whether the bots are using fake user agent strings. If it is the latter, I could try IP ranges, but there must be an easier way: I've seen some of my competitors do it, though I have no access to anything of theirs. Any ideas?

Last edited by ruud (2017-09-23 21:44:52)

#6 2017-09-24 05:10:31

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,054

Re: htaccess hotlink protection

Hi ragnar, and welcome to txp. You seem to be using the RewriteEngine on statement twice.

Or you could try without it altogether.

For the RewriteRule, try RewriteRule ^.* - [F,L]
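
Put together, the bot block might look like the sketch below. It assumes mod_rewrite is available and reuses the user agent names from the post above. It can only stop bots that announce themselves honestly.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# One case-insensitive alternation replaces the chain of [OR] conditions.
RewriteCond %{HTTP_USER_AGENT} (rogerbot|exabot|MJ12bot|dotbot|semrushbot|gigabot|AhrefsBot) [NC]
# Refuse every request from a matching user agent with 403 Forbidden.
RewriteRule ^ - [F]
</IfModule>
```

The F flag stops rule processing on its own, so adding L is harmless but not required.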


Yiannis

#7 2017-09-26 10:45:09

ragnar
New Member
Registered: 2017-09-23
Posts: 2

Re: htaccess hotlink protection

Thanks for the edit, Ruud. :)

The second line was RewriteBase /
I have also tried RewriteRule .* - [R=403,L]
and RewriteRule ^(.*)$ http://webmd.com/ [L,R=301]
Now the funny thing is that if I add googlebot to the list, it works when I try a fetch from Google Webmaster Tools. This leads me to the following assumptions:
- The bots aren't following the rules because they aren't identifying themselves with the user agent strings we listed.
- You can use IP ranges, but those change too, and the list is so long that I'd be scared of blocking something important.
- These services also keep databases of your results, so if a bot encounters a 301 (yes, it can detect that) or a 403, it may simply pull up the stored data for the person searching.
- I see WordPress has two plugins, Spyder Spanker and Link Privacy, that maintain updated databases of these bots. I have found nothing like that for non-WordPress websites.
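
For reference, the IP range option from the list above could be sketched like this. It assumes Apache 2.4 with mod_authz_core and a host that allows authorization directives in .htaccess. The two ranges are reserved documentation addresses used as placeholders, not real crawler ranges.

```apache
<RequireAll>
    # Start from "everyone is allowed"...
    Require all granted
    # ...then exclude specific ranges. Placeholders only (assumption).
    Require not ip 192.0.2.0/24
    Require not ip 198.51.100.0/24
</RequireAll>
```

As noted, crawler addresses change, so a list like this needs regular maintenance.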

Lastly, I've seen somebody go to the extreme with something called a blackhole PHP script, which tracks and traps crawling bots, but this doesn't solve my problem, as your links are indexed before you can block them.

The only way I see people actually doing it is by redirecting links from other domains they own to their money site.

My 2 cents from what I've researched thus far; hope it helps. You are still being crawled with that file, trust me: SEMrush and Ahrefs bots can even use Googlebot as a user agent string if they want.

Last edited by ragnar (2017-09-26 10:47:04)
