Textpattern CMS support forum
Pageless section indexed by googlebot
Or at least, the/some articles in that pageless section have been found and the Gaggle search console is angry at me (it flags a 404 for those articles).
The 404 response is of course expected; manually trying that URL returns an “unknown” section with the 404 error code. I currently have no idea how that happened. As far as I can see the section never appears in a <txp:section_list />. The actual section page that displays those articles has no link to the individual articles (<txp:permlink />), neither in the template(s) nor in the output.
I tried adding a RedirectMatch 308 ^\/hidden-section-name\/(.*)$ https://domain.tld/real-section/
to the htaccess file (permanent redirect), but that does not work. The Textpattern URL handler takes over – sees an unknown section and issues a 404.
Question: how can I – for googlebot – send a redirect permanent instead (in theory my regex above does work). I’ve already added the pageless section name to the robots.txt file.
Edit: typo: Redirect -> RedirectMatch (wrong copy-pasting; it was correct in the htaccess)
Last edited by phiw13 (2025-06-03 00:59:42)
Where is that emoji for a solar powered submarine when you need it ?
Sand space – admin theme for Textpattern
phiw13 on Codeberg
Re: Pageless section indexed by googlebot
Two guesses:
- If you have a sitemap that loops over all sections, maybe the pageless sections haven’t been excluded?
- Maybe the rss/atom feed includes pageless sections if you don’t mark them as not to be syndicated?
It’s possible that I’m not looking at the right function, but this function only seems to check against whether a section is in_rss.
TXP Builders – finely-crafted code, design and txp
Re: Pageless section indexed by googlebot
phiw13 wrote #339759:
I tried adding a
Redirect 308 ^\/hidden-section-name\/(.*)$ https://domain.tld/real-section/
to the htaccess file (permanent redirect), but that does not work. The Textpattern URL handler takes over – sees an unknown section and issues a 404.
That’s strange, htaccess should act before txp (and even php) is loaded.
Question: how can I – for googlebot – send a redirect permanent instead (in theory my regex above does work). I’ve already added the pageless section name to the robots.txt file.
You can try (in 4.9 at least)
<txp:if_section not name>
<!-- redirect with txp:header here -->
</txp:if_section>
or
<txp:if_article_section not name>
<!-- redirect with txp:header here -->
</txp:if_article_section>
in your error page template (untested).
Re: Pageless section indexed by googlebot
jakob wrote #339760:
Two guesses:
- If you have a sitemap that loops over all sections, maybe the pageless sections haven’t been excluded?
- Maybe the rss/atom feed includes pageless sections if you don’t mark them as not to be syndicated?
The pageless section(s) are not included in <txp:section_list /> or <txp:article_custom /> by default. In my testing, article tags in a listing context can show them if the section is listed explicitly. The hidden pageless section is excluded from the feeds and the front page on the Sections admin page. It is included in search, but my search result template is built to use a custom URL (with a custom article URL-title) for those articles.
<txp:if_article_section name="hidden_section">
<h2><a href="<txp:site_url />real-section/#<txp:article_url_title />">About: <txp:title /></a></h2>
<p><txp:search_result_excerpt hilight="mark" /></p>
<txp:else />
[…]
</txp:if_article_section>
PS – this is the type of pageless section I am talking about: https://textpattern.com/weblog/feature-focus-live-pageless-sections-for-hidden-content
etc wrote #339761:
That’s strange, htaccess should act before txp (and even php) is loaded. […]
I will double-check my htaccess snippet and try those checks in the default error template. One of them will do, I am sure.
Thank you both.
Last edited by phiw13 (2025-06-02 08:33:36)
Re: Pageless section indexed by googlebot
Update: I got the htaccess redirect to work smoothly once I corrected a typo in my section name (people-articles is not the same as people-article … the latter is what I needed).
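For reference, a minimal sketch of an htaccess rule along those lines. It assumes mod_alias is enabled and reuses the placeholders from the first post (domain.tld, real-section); the optional group is an assumption of this sketch, added so that the bare section URL and the individual article URLs are both redirected, and it is not necessarily the exact rule that ended up in the live file:

```apache
# Permanently redirect (308) the pageless section and anything below it.
# "people-article" is the section name from this thread; domain.tld and
# real-section are placeholders, adjust them to your site.
# (/.*)? matches "", "/" and "/name1", but not "people-articles".
RedirectMatch 308 ^/people-article(/.*)?$ https://domain.tld/real-section/
```

Because the target contains no back-reference, every matching request lands on the same section page.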
I did not succeed in getting the redirect to work in the default error template. Only a request for the pageless section name itself worked, not one for an individual article: people-article redirects fine, but people-article/name1 fails. I think I’d need to do something with <txp:page_url type="request_uri" /> or similar, but I haven’t tried yet.
Still I have no idea how googlebot could find that hidden pageless section…
Last edited by phiw13 (2025-06-03 01:14:32)