Textpattern CMS support forum

#1 2015-01-26 11:08:47

txpwayoflife
Member
Registered: 2015-01-09
Posts: 16

How do I keep robots away ?

Hi!

If you visit my bbclone installation, you will see that there are a lot of robots:

http://www.alexandrealmeida.blue/bbclone/show_detailed.php?lng=en

How can I stop them?

Thank you.

Last edited by txpwayoflife (2015-01-26 11:10:35)

Offline

#2 2015-01-26 11:29:35

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 12,463
Website GitHub

Re: How do I keep robots away ?

I’m not familiar with BBClone at all but if a standard robots.txt file doesn’t help, you could perhaps try searching the BBClone forum and see if anyone has some advice there.


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Hire Txp Builders – finely-crafted code, design and Txp

Offline

#3 2015-01-26 11:31:47

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,737
GitHub

Re: How do I keep robots away ?

It’s possible the Linode robots are from ifttt or another change-polling service; I get a bunch hitting my RSS and Atom feeds, and it’s no different to search engine spiders, really. Can you check your access logs and see what URL(s) they are checking?

Edit:

I’ve done a copy + paste below from my visitor logs on the RSS/Atom site:

	Time	Host	Page	Referrer
	26 Jan 2015 12:25:00	41.146.199.146.dyn.plus.net		 
	26 Jan 2015 12:23:54	li553-92.members.linode.com	rss	 
	26 Jan 2015 12:16:01	li552-158.members.linode.com	rss	 
	26 Jan 2015 12:08:05	li553-28.members.linode.com	rss	 
	26 Jan 2015 12:00:12	li490-209.members.linode.com	rss	 
	26 Jan 2015 11:54:48	crawl-66-249-78-202.googlebot.com	articles	 
	26 Jan 2015 11:52:20	li529-215.members.linode.com	rss	 
	26 Jan 2015 11:45:08	8.29.198.33		http://www.feedly.com
	26 Jan 2015 11:44:28	li529-215.members.linode.com	rss	 
	26 Jan 2015 11:39:46	google-proxy-66-249-81-215.google.com		 
	26 Jan 2015 11:36:35	li553-114.members.linode.com	rss	 
	26 Jan 2015 11:28:40	li552-158.members.linode.com	rss	 
	26 Jan 2015 11:20:47	li553-36.members.linode.com	rss	 
	26 Jan 2015 11:12:53	li552-117.members.linode.com	rss	 
	26 Jan 2015 11:06:27	dd20328.kasserver.com		 
	26 Jan 2015 11:04:58	li532-82.members.linode.com	rss	 
	26 Jan 2015 10:57:04	li552-95.members.linode.com	rss	 
	26 Jan 2015 10:54:50	crawl-66-249-78-195.googlebot.com
	26 Jan 2015 10:49:12	li553-160.members.linode.com	rss	 
	26 Jan 2015 10:41:16	li552-95.members.linode.com	rss	 
	26 Jan 2015 10:33:24	li529-215.members.linode.com	rss	 
	26 Jan 2015 10:25:48	41.146.199.146.dyn.plus.net	one-thousand-days	 
	26 Jan 2015 10:25:31	li552-96.members.linode.com	rss	 
	26 Jan 2015 10:17:39	li532-82.members.linode.com	rss	 
	26 Jan 2015 10:13:56	110.82.166.96		 
	26 Jan 2015 10:13:37	65.19.138.33	atom	 
	26 Jan 2015 10:09:47	li552-155.members.linode.com	rss	 
	26 Jan 2015 10:01:54	li553-91.members.linode.com	rss	 
	26 Jan 2015 09:53:59	li552-155.members.linode.com	rss	 
	26 Jan 2015 09:46:04	li552-155.members.linode.com	rss	 
	26 Jan 2015 09:42:27	65.19.138.34	atom	 
	26 Jan 2015 09:41:32	crawl-66-249-78-188.googlebot.com	articles	 
	26 Jan 2015 09:40:53	static.117.65.9.5.clients.your-server.de		 
	26 Jan 2015 09:38:12	li553-115.members.linode.com	rss	 
	26 Jan 2015 09:30:18	li553-102.members.linode.com	rss	 
	26 Jan 2015 09:22:25	li553-91.members.linode.com	rss	 
	26 Jan 2015 09:14:30	li552-117.members.linode.com	rss	 
	26 Jan 2015 09:06:37	li495-23.members.linode.com	rss	 
	26 Jan 2015 08:58:45	li553-115.members.linode.com	rss	 
	26 Jan 2015 08:50:52	li553-160.members.linode.com	rss	 
	26 Jan 2015 08:45:18	crawl-66-249-78-195.googlebot.com
	26 Jan 2015 08:42:59	li553-91.members.linode.com	rss	 
	26 Jan 2015 08:35:06	li553-92.members.linode.com	rss	 
	26 Jan 2015 08:27:11	li552-155.members.linode.com	rss	 
	26 Jan 2015 08:19:15	li553-91.members.linode.com	rss	 
	26 Jan 2015 08:11:21	li552-158.members.linode.com	rss	 
	26 Jan 2015 08:03:25	li553-92.members.linode.com	rss	 
	26 Jan 2015 07:55:30	li495-23.members.linode.com	rss	 
	26 Jan 2015 07:54:50	msnbot-157-55-39-52.search.msn.com		 
	26 Jan 2015 07:47:35	li553-115.members.linode.com	rss	 
	26 Jan 2015 07:43:48	crawl-66-249-78-195.googlebot.com
	26 Jan 2015 07:39:41	li553-91.members.linode.com	rss	 
	26 Jan 2015 07:31:46	li553-102.members.linode.com	rss	 
	26 Jan 2015 07:23:51	li495-23.members.linode.com	rss	 
	26 Jan 2015 07:15:57	li553-114.members.linode.com	rss	 
	26 Jan 2015 07:13:35	65.19.138.34	atom	 
	26 Jan 2015 07:08:02	li553-36.members.linode.com	rss	 
	26 Jan 2015 07:00:08	li553-36.members.linode.com	rss	 
	26 Jan 2015 06:55:38	crawl-66-249-64-9.googlebot.com		 
	26 Jan 2015 06:52:15	li553-115.members.linode.com	rss	 
	26 Jan 2015 06:44:20	li553-114.members.linode.com	rss	 
	26 Jan 2015 06:36:25	li495-23.members.linode.com	rss	 
	26 Jan 2015 06:28:33	li529-215.members.linode.com	rss	 
	26 Jan 2015 06:25:27	spider-5-255-253-165.yandex.com		 
	26 Jan 2015 06:20:41	li552-49.members.linode.com	rss	 
	26 Jan 2015 06:12:46	li553-114.members.linode.com	rss	 
	26 Jan 2015 06:04:54	li552-155.members.linode.com	rss	 
	26 Jan 2015 06:03:19	192.81.215.245		 
	26 Jan 2015 05:57:02	li553-28.members.linode.com	rss	 
	26 Jan 2015 05:56:30	117.26.255.155		 
	26 Jan 2015 05:49:11	li552-95.members.linode.com	rss	 
	26 Jan 2015 05:46:34	crawl-66-249-64-1.googlebot.com
	26 Jan 2015 05:44:43	180.76.6.150	good-not-quick	 
	26 Jan 2015 05:41:20	li552-96.members.linode.com	rss	 
	26 Jan 2015 05:33:27	li552-158.members.linode.com	rss	 
	26 Jan 2015 05:25:31	li532-82.members.linode.com	rss	 
	26 Jan 2015 05:09:40	li552-158.members.linode.com	rss	 
	26 Jan 2015 05:04:26	crawl-66-249-78-202.googlebot.com
	26 Jan 2015 05:01:47	li553-102.members.linode.com	rss	 
	26 Jan 2015 04:59:58	crawl-66-249-78-195.googlebot.com
	26 Jan 2015 04:54:49	crawl-66-249-78-188.googlebot.com
	26 Jan 2015 04:53:56	li553-158.members.linode.com	rss	 
	26 Jan 2015 04:51:43	220.181.125.200		 
	26 Jan 2015 04:46:04	li529-215.members.linode.com	rss	 
	26 Jan 2015 04:38:10	li553-28.members.linode.com	rss	 
	26 Jan 2015 04:30:16	li553-91.members.linode.com	rss	 
	26 Jan 2015 04:22:24	li553-158.members.linode.com	rss	 
	26 Jan 2015 04:16:52	google-proxy-66-249-84-223.google.com		 
	26 Jan 2015 04:14:29	li553-102.members.linode.com	rss	 
	26 Jan 2015 04:13:32	65.19.138.33	atom	 
	26 Jan 2015 04:06:37	li552-156.members.linode.com	rss	 
	26 Jan 2015 04:06:14	206.253.226.23		 
	26 Jan 2015 04:06:14	206.253.226.23		 
	26 Jan 2015 03:58:42	li490-209.members.linode.com	rss	 
	26 Jan 2015 03:50:49	li552-49.members.linode.com	rss	 
	26 Jan 2015 03:42:57	li552-49.members.linode.com	rss	 
	26 Jan 2015 03:42:21	65.19.138.34	atom	 
	26 Jan 2015 03:35:00	li532-82.members.linode.com	rss	 
	26 Jan 2015 03:27:05	li553-158.members.linode.com	rss	 
	26 Jan 2015 03:19:12	li552-96.members.linode.com	rss

The formatting isn’t great, but you get the idea.

Last edited by gaekwad (2015-01-26 12:30:33)

Offline

#4 2015-01-26 12:24:21

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: How do I keep robots away ?

Also look at the User-agent string in the access logs. That should give an idea about which kind of software is doing these requests.

Offline

#5 2015-01-26 14:33:02

uli
Moderator
From: Cologne
Registered: 2006-08-15
Posts: 4,316

Re: How do I keep robots away ?

I’ll just relocate this where it should be, don’t worry.


In bad weather I never leave home without wet_plugout, smd_where_used and adi_form_links

Offline

#6 2015-01-26 17:08:05

txpwayoflife
Member
Registered: 2015-01-09
Posts: 16

Re: How do I keep robots away ?

Hi Bloke, Gaekwad and Ruud !

Bloke, I’ve tried adding a “robots.txt” file as described on this page:
http://www.robotstxt.org/faq/prevent.html
But it didn’t work… (I uploaded it to the root directory and to the subdomains.)

Hello, Gaekwad. Here it is, I will copy and paste:

26 Jan 2015 14:49:58  198.58.103.28 textpattern/rss
26 Jan 2015 14:42:02  198.58.103.102 textpattern/rss
26 Jan 2015 14:34:07  198.58.99.82 textpattern/rss
26 Jan 2015 14:26:13  198.58.102.96 textpattern/rss
26 Jan 2015 14:18:20  198.58.103.114 textpattern/rss
26 Jan 2015 14:10:25  li552-95.members.linode.com textpattern/rss
26 Jan 2015 14:02:32  198.58.103.28 textpattern/rss
26 Jan 2015 13:54:40  198.58.102.158 textpattern/rss
26 Jan 2015 13:46:44  198.58.102.96 textpattern/rss
26 Jan 2015 13:38:51  198.58.103.102 textpattern/rss
26 Jan 2015 13:30:58  li490-209.members.linode.com textpattern/rss
26 Jan 2015 13:23:06  198.58.99.82 textpattern/rss
26 Jan 2015 13:15:12  198.58.102.96 textpattern/rss
26 Jan 2015 13:07:18  198.58.103.92 textpattern/rss
26 Jan 2015 12:59:25  li552-117.members.linode.com textpattern/rss
26 Jan 2015 12:51:29  198.58.103.114 textpattern/rss
26 Jan 2015 12:43:36  198.58.103.92 textpattern/rss
26 Jan 2015 12:35:42  198.58.102.156 textpattern/rss
26 Jan 2015 12:27:49  li529-215.members.linode.com textpattern/rss
26 Jan 2015 12:19:54  li552-95.members.linode.com textpattern/rss
26 Jan 2015 12:11:59  li552-49.members.linode.com textpattern/rss
26 Jan 2015 12:04:07  li490-209.members.linode.com textpattern/rss
26 Jan 2015 11:56:12  198.58.102.158 textpattern/rss
26 Jan 2015 11:48:16  198.58.103.36 textpattern/rss
26 Jan 2015 11:40:21  li552-117.members.linode.com textpattern/rss
26 Jan 2015 11:32:26  198.58.103.28 textpattern/rss
26 Jan 2015 11:24:34  li553-91.members.linode.com textpattern/rss
26 Jan 2015 11:16:42  198.58.102.156 textpattern/rss
26 Jan 2015 11:08:46  li553-91.members.linode.com textpattern/rss
26 Jan 2015 11:00:51  li490-209.members.linode.com textpattern/rss
26 Jan 2015 10:52:55  li553-158.members.linode.com textpattern/rss
26 Jan 2015 10:44:59  li529-215.members.linode.com textpattern/rss
26 Jan 2015 10:37:07  198.58.103.28 textpattern/rss
26 Jan 2015 10:29:14  198.58.103.36 textpattern/rss
26 Jan 2015 10:21:19  li552-95.members.linode.com textpattern/rss
26 Jan 2015 10:13:24  198.58.103.92 textpattern/rss
26 Jan 2015 10:05:33  198.58.102.158 textpattern/rss
26 Jan 2015 09:57:38  li553-91.members.linode.com textpattern/rss
26 Jan 2015 09:49:45  li529-215.members.linode.com textpattern/rss
26 Jan 2015 09:41:51  198.58.103.92 textpattern/rss
26 Jan 2015 09:33:59  198.58.103.36 textpattern/rss
26 Jan 2015 09:25:58  198.58.102.156 textpattern/rss
26 Jan 2015 09:18:05  li490-209.members.linode.com textpattern/rss
26 Jan 2015 09:10:13  198.58.102.156 textpattern/rss
26 Jan 2015 09:02:16  198.58.102.155 textpattern/rss
26 Jan 2015 08:54:21  198.58.99.82 textpattern/rss
26 Jan 2015 08:46:24  198.58.103.114 textpattern/rss
26 Jan 2015 08:38:32  198.58.103.115 textpattern/rss
26 Jan 2015 08:30:41  198.58.102.96 textpattern/rss
26 Jan 2015 08:24:36  ec2-50-17-173-39.compute-1.amazonaws.com textpattern/pesquisar
26 Jan 2015 08:24:36  ec2-50-17-173-39.compute-1.amazonaws.com textpattern
26 Jan 2015 08:22:46  198.58.102.156 textpattern/rss
26 Jan 2015 08:14:54  li553-158.members.linode.com textpattern/rss
26 Jan 2015 08:07:01  198.58.102.155 textpattern/rss
26 Jan 2015 07:59:05  li490-209.members.linode.com textpattern/rss
26 Jan 2015 07:51:13  li552-49.members.linode.com textpattern/rss
26 Jan 2015 07:43:18  li553-158.members.linode.com textpattern/rss
26 Jan 2015 07:35:25  li552-117.members.linode.com textpattern/rss
26 Jan 2015 07:27:32  198.58.102.155 textpattern/rss
26 Jan 2015 07:19:38  198.58.102.96 textpattern/rss
26 Jan 2015 07:11:44  198.58.103.28 textpattern/rss
26 Jan 2015 07:03:49  li553-160.members.linode.com textpattern/rss
26 Jan 2015 06:55:54  198.58.103.28 textpattern/rss
26 Jan 2015 06:48:03  li552-95.members.linode.com textpattern/rss
26 Jan 2015 06:40:09  li553-160.members.linode.com textpattern/rss
26 Jan 2015 06:32:17  50.116.30.23 textpattern/rss
26 Jan 2015 06:24:21  198.58.103.102 textpattern/rss
26 Jan 2015 06:16:26  198.58.103.102 textpattern/rss
26 Jan 2015 06:08:33  li552-95.members.linode.com textpattern/rss
26 Jan 2015 06:04:38  crawl-66-249-79-27.googlebot.com textpattern/articles/54/NoiteFeliztocadapelabandareal
26 Jan 2015 06:00:39  198.58.102.155 textpattern/rss
26 Jan 2015 05:52:43  li552-95.members.linode.com textpattern/rss
26 Jan 2015 05:44:49  198.58.99.82 textpattern/rss
26 Jan 2015 05:36:54  198.58.103.114 textpattern/rss
26 Jan 2015 05:29:01  li553-158.members.linode.com textpattern/rss
26 Jan 2015 05:21:08  li553-91.members.linode.com textpattern/rss
26 Jan 2015 05:13:17  198.58.103.28 textpattern/rss
26 Jan 2015 05:05:23  li553-158.members.linode.com textpattern/rss
26 Jan 2015 04:57:30  li552-95.members.linode.com textpattern/rss
26 Jan 2015 04:49:38  198.58.103.36 textpattern/rss
26 Jan 2015 04:41:46  li529-215.members.linode.com textpattern/rss
26 Jan 2015 04:33:52  li552-117.members.linode.com textpattern/rss
26 Jan 2015 04:25:58  198.58.103.102 textpattern/rss
26 Jan 2015 04:18:02  50.116.30.23 textpattern/rss
26 Jan 2015 04:15:05  crawl-66-249-79-27.googlebot.com textpattern/articles/60/ComoadicionarobotaodoFacebookparacompartilharseusartigosnoTextpattern
26 Jan 2015 04:10:09  198.58.99.82 textpattern/rss
26 Jan 2015 04:02:17  198.58.102.156 textpattern/rss
26 Jan 2015 03:54:22  198.58.103.28 textpattern/rss
26 Jan 2015 03:46:24  198.58.103.28 textpattern/rss
26 Jan 2015 03:38:29  li552-95.members.linode.com textpattern/rss
26 Jan 2015 03:31:09  crawl-66-249-79-43.googlebot.com textpattern
26 Jan 2015 03:30:33  li553-160.members.linode.com textpattern/rss
26 Jan 2015 03:22:38  li552-95.members.linode.com textpattern/rss
26 Jan 2015 03:14:46  198.58.103.114 textpattern/rss
26 Jan 2015 03:06:51  li553-160.members.linode.com textpattern/rss
26 Jan 2015 02:58:58  198.58.102.155 textpattern/rss
26 Jan 2015 02:51:02  50.116.30.23 textpattern/rss
26 Jan 2015 02:43:10  li553-160.members.linode.com textpattern/rss
26 Jan 2015 02:35:17  li490-209.members.linode.com textpattern/rss
26 Jan 2015 02:27:25  li553-158.members.linode.com textpattern/rss

And ruud, here is your answer:

Superfeedr bot/2.0 http://superfeedr.com – Make your feeds realtime: get in touch

Well, I will forget this, I think nothing can stop these robots…

Thank you for all the help!

{ Moderator’s annotation: Tried to add bc. for better readability but no worka. (Ah: spaces at the start of each line!) – Uli }

Last edited by uli (2015-01-26 17:48:17)

Offline

#7 2015-01-26 18:35:02

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,388
Website GitHub Mastodon Twitter

Re: How do I keep robots away ?

There are a lot of strange things happening lately. One of my sites has been bombarded for months from many IPs (many of them from China). I try to ignore them, but they are eating up a lot of bandwidth.


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#8 2015-01-27 09:14:58

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: How do I keep robots away ?

This would also work (in .htaccess), but breaks the service at superfeedr.com

RewriteCond %{HTTP_USER_AGENT} superfeedr.com
RewriteRule ^(.*)$ http://go.away/
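
A hedged variant of the same idea, for anyone who would rather not redirect to a made-up host: this sketch answers matching requests with 403 Forbidden instead. It assumes mod_rewrite is enabled and that `RewriteEngine On` already appears earlier in the same .htaccess file. The `[NC]` flag and the escaped dot are additions for illustration, not part of the two lines above.

```apache
# Match any request whose User-Agent contains "superfeedr.com" (case-insensitive).
RewriteCond %{HTTP_USER_AGENT} superfeedr\.com [NC]
# "-" means no substitution; [F] makes Apache return 403 Forbidden.
RewriteRule ^ - [F]
```

Like the redirect, this cuts off every service that polls the feed through Superfeedr.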

Offline

#9 2015-01-27 13:07:18

txpwayoflife
Member
Registered: 2015-01-09
Posts: 16

Re: How do I keep robots away ?

Hi ruud, thank you for your tip. I will try it later, when I get home!

Offline

#10 2015-01-27 14:02:22

txpwayoflife
Member
Registered: 2015-01-09
Posts: 16

Re: How do I keep robots away ?

I’ve tried to upload a .htaccess file, but the system returns an error saying the file already exists… GoDaddy hosts my website, and I don’t know how to see the hidden files… and FileZilla never connects to my account, so I always use cPanel to upload files…

Last edited by txpwayoflife (2015-01-27 14:02:45)

Offline

#11 2015-01-27 14:38:44

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: How do I keep robots away ?

You have to download the existing .htaccess file (TXP needs that) and add the two lines I posted.
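
A rough sketch of where the two lines might go in the downloaded file. The surrounding layout is an assumption (an `<IfModule mod_rewrite.c>` block with `RewriteEngine On` near the top), so check it against the actual file. Putting the new lines ahead of the existing rewrite rules is the cautious choice, so the block is evaluated first.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Added lines (from the earlier post): turn away the Superfeedr bot.
    RewriteCond %{HTTP_USER_AGENT} superfeedr.com
    RewriteRule ^(.*)$ http://go.away/

    # The existing Textpattern rules (assumed layout) stay below, unchanged.
</IfModule>
```

Keep a copy of the original file before uploading the edited one, so it can be restored if the site stops responding.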

Offline

#12 2015-01-27 17:53:47

txpwayoflife
Member
Registered: 2015-01-09
Posts: 16

Re: How do I keep robots away ?

ruud wrote #287811:

You have to download the existing .htaccess file (TXP needs that) and add the two lines I posted.

I did it just now.
And I guess it is working well… 30 minutes and no robots (from linode.com) until now…

Thank you!

Offline

#13 2015-01-27 20:46:22

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,737
GitHub

Re: How do I keep robots away ?

txpwayoflife wrote #287826:

30 minutes and no robots (from linode.com) until now…

You do know this will stop people’s ifttt recipes running, right? If someone has chosen to check your RSS feed with ifttt then those recipes will no longer be able to check your site for updates.

That might make people sad. It would make me sad, that’s for sure. ifttt people need love, too!

Offline

#14 2015-01-27 22:22:36

txpwayoflife
Member
Registered: 2015-01-09
Posts: 16

Re: How do I keep robots away ?

gaekwad wrote #287837:

ifttt people need love, too!

You are so funny!

Are you sure? I think it will block only the robots ; )

Offline

#15 2015-01-27 22:33:50

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,737
GitHub

Re: How do I keep robots away ?

txpwayoflife wrote #287838:

I think it will block only the robots ; )

It will block the robots. The ifttt robots are hitting your site, and they send the triggers to the users. They are not malicious robots, and not frequent enough to cause any problems. Right now, because the robots are blocked, the ifttt service cannot access your feed to update the subscribers. If you unblock them, then ifttt will work again.

Offline
