Textpattern CMS support forum

#1 2016-01-28 15:03:05

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,011
Website GitHub Mastodon Twitter

Site content stolen

Do any of you have any idea what I can do about http://hur609.dtudtu.com? It seems to be stealing our site’s content, including our updates.


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#2 2016-01-28 16:47:19

mrdale
Member
From: Walla Walla
Registered: 2004-11-19
Posts: 2,215
Website

Re: Site content stolen

Reject traffic from that range of IPs? Might be a start:

Order Deny,Allow
Deny from 23.108.34.
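
The two lines above use the older Apache 2.2 access syntax, which Apache 2.4 only accepts through mod_access_compat. A minimal sketch of the same range block in 2.4-style directives, assuming mod_authz_core and mod_authz_host are loaded and that 23.108.34 really is the range to block (taken from the rule above, not independently verified):

```apache
# Assumption: Apache 2.4+, placed in .htaccess or a <Directory> block.
# Allow everyone except clients whose address starts with 23.108.34
<RequireAll>
    Require all granted
    Require not ip 23.108.34
</RequireAll>
```

To block a single address instead of the whole range, the full IP would go in place of the partial one on the `Require not ip` line.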

Offline

#3 2016-01-29 07:13:19

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,011
Website GitHub Mastodon Twitter

Re: Site content stolen

Thanks Dale,

After posting here I managed to find the exact IP and blocked it. Is it actually better to block the whole range?


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#4 2016-01-29 11:15:54

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,137
GitHub

Re: Site content stolen

Scraper isn’t necessarily on that host. Could be anywhere, really.

A DMCA notice will get the process started to remove duplicate content from Google (as a start; other search engines are available):

support.google.com/legal/troubleshooter/1114905?hl=en

Also, fire off an email to the abuse contact of the domain:

whois.domaintools.com/dtudtu.com

Offline

#5 2016-01-30 07:14:16

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,011
Website GitHub Mastodon Twitter

Re: Site content stolen

Thanks Pete,

I’m not going to file a DMCA notice, partly because we have been victims of that process ourselves and partly because much of our site is under a Creative Commons licence.

At the moment I am blocking their IP, which works just fine, but I am also keeping an eye on them.


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#6 2016-01-30 12:35:34

jakob
Admin
From: Germany
Registered: 2005-01-20
Posts: 4,596
Website

Re: Site content stolen

Might you be able to make the content less attractive for copying, for example by including information on yourselves as originators in the page content that is copied? That makes it less attractive to copy, and if they copy regardless, you are at least named (and linked to) as originators for anyone who might be reading elsewhere.


TXP Builders – finely-crafted code, design and txp

Offline

#7 2016-01-30 15:32:59

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,011
Website GitHub Mastodon Twitter

Re: Site content stolen

jakob wrote #297652:

Might you be able to make the content less attractive for copying, for example by including information on yourselves as originators in the page content that is copied? That makes it less attractive to copy, and if they copy regardless, you are at least named (and linked to) as originators for anyone who might be reading elsewhere.

Hi Julian,

Indeed, this is part of our plan for the next incarnation of the site. For years we had it as a portal for news, calls and texts, but we are discussing how this could be enriched and modernised. Unlike with our IMCA site, for which we reached a consensus very fast, this one is more complex as there are far too many ideas and our committee is yet to agree on everything. In a sense we do have enough clues on the page regarding the NGO but maybe we should just replace those with an about text.


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#8 2016-01-30 16:53:31

jakob
Admin
From: Germany
Registered: 2005-01-20
Posts: 4,596
Website

Re: Site content stolen

colak wrote #297655:

In a sense we do have enough clues on the page regarding the NGO but maybe we should just replace those with an about text.

It obviously depends how much “criminal energy” people put into it, but if you make it too easy to strip out, or place it in some other part of the page DOM (e.g. not in the content div), you may risk it being omitted when they suck in the data… When your attribution is embedded (ideally not in an easy-to-remove pattern), it means other people have to put in effort to remove it … and copycats are by definition lazy.

FWIW, an association I work for had a similar problem a few years back. We crafted a series of explanatory texts on various aspects of the association’s field of work and then found that they started to be used by various other sites, often without attribution. In the end we decided as a non-profit association funded almost entirely by membership fees that we weren’t going to follow up members who used the information as we are ultimately working in their interest. We encourage them to link back to the source, but many don’t, preferring to present it as their own expertise. In the end, although that wasn’t the original intention, it also meant that good texts were disseminated by the “copycats” which turns out to be good for the sector as a whole by reducing the amount of misinformation.
Parallel to that, we introduced a series of “member of …” membership badges that members can put on their homepage with a link back to their profile page on our homepage, and many do that. That brought some of the “copycats” back into the network, albeit via another channel. There are a few other parallel organisations and portals that have copied the text, and I fretted a bit about Google punishing multiple copies of it, but in the end we have plenty of other information on the subject matter and consistently rank higher on search engine listings, so if anything the copying didn’t weaken the association’s position but strengthened it.


TXP Builders – finely-crafted code, design and txp

Offline
