Textpattern CMS support forum

#21 2020-05-23 12:45:09

hilaryaq
Plugin Author
Registered: 2006-08-20
Posts: 263
Website

Re: Duplicate Content due to section and article URL

That would be extra protection alright!

I think if you have the noindex there wouldn’t be any need for a re-direct on top of that. In fact, even without the noindex, Google shouldn’t find the page (bar the caveat you mentioned above), but even if it did, your canonical is pointing to the section page, also with a noindex anyway so that seems pretty bulletproof to me!


…………………
I <3 txp
…………………

#22 2020-05-23 13:05:01

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 9,282
Website

@demoncleaner. Hadn’t thought of Google using referrer links from other sites to sniff out your content and index it that way. That’s sneaky. I’ve learned something today, thank you (and your SEO guru!)


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

#23 2020-05-23 14:37:58

demoncleaner
Plugin Author
From: Germany
Registered: 2008-06-29
Posts: 104
Website

@hilaryaq: No, I meant redirects as perhaps the most secure way, though they could be a bit annoying unless someone finds a regex that works properly (which I have not so far). You would redirect everything that goes to any /section/title page back to /section, and you would want to be able to create exceptions easily.

I think you would use this solution on a live site where you did not take the precautions (noindex on all single articles) when setting it up, and Google has already found some orphaned pages.

I did not mean to use redirects in combination with noindex.

I hoped the noindex solution would be enough for a freshly set up site.
What I think is not 100% perfect is that, for Google, the page still exists and may be crawled at some point; it is just not indexed. So with an eye on your crawl budget, this is not a perfect solution either. But I am not 100% sure… that is just how I have understood it so far.
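
The noindex precaution mentioned above could be sketched in a page template roughly as follows. This is a minimal sketch, assuming the fragment sits in the head of the page template used by the section; it is not taken from anyone's actual site.

```html
<!-- Assumption: this fragment lives inside <head> of the section's page template. -->
<txp:if_individual_article>
    <!-- Emitted only on /section/title URLs, never on the /section list view. -->
    <meta name="robots" content="noindex" />
</txp:if_individual_article>
```

The section landing page itself stays indexable, because the conditional outputs nothing in an article list context.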

#24 2020-05-23 14:44:56

demoncleaner
Plugin Author
From: Germany
Registered: 2008-06-29
Posts: 104
Website

While writing that last bit I had this idea… wouldn’t it be cool to have a feature like this:

If the Meta URL field of an article is filled in, the article exists as a URL; if it is empty, the URL (/section/title) simply does not exist, and the article can only be used in an article_list context, without any further worries about sitemaps, noindex or redirects =)

Sounds simple. But I am sure it isn’t.
I have no idea if there is a realistic chance of creating a plugin that does that.

I had to test what currently happens if you try to leave the Meta URL field empty:
It does not work. If you empty it and save, it is saved with your last entry, i.e. nothing changes.

Last edited by demoncleaner (2020-05-23 14:45:43)

#25 2020-05-23 14:54:15

hilaryaq
Plugin Author
Registered: 2006-08-20
Posts: 263
Website

I get you!

It would be enough in my opinion too. The other way to think of it is that if it is too late and pages have been indexed, Google will actually self-correct to your preferred URL over time, as long as you submit the sitemap, remove any internal links or RSS entries pointing to the hidden page, and consistently link to the section page from social media etc. If Google saw the pages as duplicates and preferred the article over the section, that situation can be reversed if you set your own canonical correctly and do all of the above. Eventually that single article will fall away from results and Google will correctly index the section page instead, especially when you reinforce it with your canonical and internal site structure.

A lot of websites and CMSs have multiple ways of pointing to a single URL, whether it’s with or without a trailing slash, URLs with parameters etc., so Google is quite used to being flexible as long as you are consistent with what you link to and send traffic to, and most importantly with your canonicals.

Hope this helped.

You might get some value out of this page also: https://forum.textpattern.com/viewtopic.php?id=50705

A few methods there on outputting single pages and how to differentiate between them you might find handy :)


#26 2020-05-23 17:23:08

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 9,282
Website

demoncleaner wrote #323195:

If the Meta URL field of an article is filled in, the article exists as a URL; if it is empty, the URL (/section/title) simply does not exist, and the article can only be used without any further worries about sitemaps

Ha, no sadly it’s not that simple. But how about this as a workaround… in Txp 4.8.0 we officially support pageless sections. Go ahead and try it:

  1. Pick (or create) a section that is going to house content you don’t want anyone to be able to reach from a URL. I often make a section called “Snippets” for this purpose to store content that admins can edit but doesn’t need its own page. In your case, how about you call it “bios”?
  2. Edit that section and assign the ‘empty’ page at the top of the list to it. You can assign the empty stylesheet too if you like, though it’s not necessary as the empty page alone will trigger this behaviour.
  3. Save. All content in that section is now invisible! No page = no template = no output.

Thus you cannot access any of that content directly. Google will 404 if it tries; everybody will 404. Previously, storing content you didn’t want publicly accessible had to be in forms (or stored in other clever ways) which meant that regular editors like Staff Writer or Copy Editor couldn’t change it easily.

That’s great, you say: invisible content. So how do I get at it then? Easy: from any other section that does have a page template (e.g. in your case “Teams”) you can just pull whatever you need in via <txp:article_custom section="bios" />. If you categorise your content in that section, you can even choose to pull out stuff in certain categories, or certain IDs or articles that have certain custom fields set.

Thus your ‘landing page’ gets all the content in one big wodge, but as the individual articles have no physical URL, there is no way they can be indexed by anyone. And any author with access to the Write panel can alter the content in these ‘hidden’ pageless sections.
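
As a rough illustration of the above, the page template of the visible section might contain something like this. It is a sketch only: “bios” is the pageless section from the steps above, while the category name “management” and the markup inside the container are assumptions.

```html
<!-- Page template of the visible section (e.g. "teams"). -->
<!-- Pulls articles from the pageless "bios" section; the category "management" is hypothetical. -->
<txp:article_custom section="bios" category="management" sort="Title asc" limit="99">
    <h2><txp:title /></h2>
    <txp:body />
</txp:article_custom>
```

The limit is raised here because article_custom returns only 10 articles by default; dropping the category attribute pulls in everything from the section.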

Does that help?

Last edited by Bloke (2020-05-23 17:28:18)


#27 2020-05-23 17:40:51

hilaryaq
Plugin Author
Registered: 2006-08-20
Posts: 263
Website

Stef that’s so handy for so many uses!!


#28 2020-05-23 19:11:42

demoncleaner
Plugin Author
From: Germany
Registered: 2008-06-29
Posts: 104
Website

That does indeed help! Thanks a lot, Stef. I am quite often surprised by how little of Textpattern’s potential I know.

That will help with my problem, because I could for example create a section “team” (to be accessed) and show on it all the articles I want from the section “bios”, which is an invisible section. That means a few more sections for a typical website than were needed before, but that would be totally fine. And to me it seems to be the best solution for the orphaned articles.

It also helps in my multilanguage pages, which I have lately been creating with a mixture of smd_query and adi_menu. There I typically use an article that contains all those single words or phrases of the website that are not big enough to warrant an article of their own, and I attach it via article_custom at the top of each page. I call it “language snippets” and put it in a section of the same name, because it is more convenient for the client to just jump into the articles and edit the language snippets from there. I usually hide output forms from them. All too complicated.

With this new technique I do not have to worry about the section I created, because it is an invisible section.

Thanks a lot. It’s brilliant.

#29 2020-05-23 19:29:22

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 9,282
Website

demoncleaner wrote #323212:

in my multilanguage pages, which I have lately been creating with a mixture of smd_query and adi_menu. There I typically use an article that contains all those single words or phrases of the website that are not big enough to warrant an article of their own, and I attach it via article_custom at the top of each page. I call it “language snippets” and put it in a section of the same name.

In that case, you might be interested in Oleg’s novel approach to multi-lingual content in Txp 4.8. He’s documented it there but it’s basically content in pageless sections chained to articles in a visible section via a custom field to offer ‘translations’ of that content.

Until we nail true multi-lingual content in a future core version, this is the next best thing.


#30 2020-05-23 20:14:34

demoncleaner
Plugin Author
From: Germany
Registered: 2008-06-29
Posts: 104
Website

Cool, thanks. I’ll have a closer look. I am definitely interested.
But from what I see now, I think the URLs are not “speaking” (human-readable) in Oleg’s version, right?

In my approach I use arc_meta (not adi_menu as written above) to make the description field on the sections obsolete. I then use this field to hold an internal common name for a section across all the languages. Then I create an output form that uses smd_query to find the counterpart of the current section in each other language. It works pretty well, without too much fiddling, once you have a decent setup. I use the CSS select box on each section to define its language, so that the section is aware of its own language. That way you can easily go from /kontakt straight to /contact when clicking on “en” while on the German version, etc. I could explain it in more detail if that is of any interest, but maybe not in this thread because it is already quite off-topic. Sorry.
