Textpattern CMS support forum

#529 2025-01-07 12:09:31

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

Re: smd_tags: unlimited article, image, file and link taxonomy

Okay, that’s fair. I wonder if we just ditch sanitizeForUrl() around the tag creation entirely? Is that valid? There might be one or two other places where (for example) when searching to match tags it also sanitizes them, so they may all need to be ferreted out and removed.

If there’s no intrinsic need to convert the tags to the Latin alphabet, I’m all for removing that restriction in the plugin.


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

#530 2025-01-07 12:49:42

spiridon
Member
From: Mariupol, Ukraine
Registered: 2011-01-30
Posts: 57

I am planning to migrate from tru_tags to smd_tags, so I can run an experiment. I will remove the sanitizeForUrl() calls in the plugin and try to adapt the site to see how it works. I will test and share the results with you, so you will have more data to make the right decision.

I have one question. The tag name is converted to uppercase. Does that make sense? And what is the best practice here?

Last edited by spiridon (2025-01-07 12:50:49)

#531 2025-01-07 12:52:13

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

Thank you for testing.

The tag name shouldn’t be converted to any case at all. If anything, I would have expected to lowercase it, but I’d rather the plugin left it alone entirely. I’m not sure where the case change is occurring, but if you spot it, let me know and I’ll get rid of it.


#532 2025-01-10 10:30:40

spiridon
Member
From: Mariupol, Ukraine
Registered: 2011-01-30
Posts: 57

I have some interesting news.

First, I managed to fix an error that occurred in the Chrome browser. It was not difficult and the fix is very small.

But it turned out that there is another problem. I added some debugging and saw that 2051 articles are processed normally and 37 are not. The problem is that smd_tags_import_one() sends invalid XML in some cases. The errors look like this:

error on line 3 at column 75: Entity 'rsquo' not defined

This does not affect the import itself, but it reports a false result. It also breaks the call chain into which I converted all the requests.

So, as a temporary workaround, I replaced the line:

   send_xml_response(array('smd_tags_report' => $report, 'smd_tags_link_ctr' => $tag_ctr));

with:

   send_xml_response(array('smd_tags_report' => 'report', 'smd_tags_link_ctr' => $tag_ctr));

That is how it stands for now. I am continuing to experiment with and investigate the plugin.

#533 2025-01-10 10:38:29

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

Interesting. Thanks for the sleuthing.

If it’s a quoting issue or missing entities then the plugin should defend against those to avoid invalid XML. Either by quoting stuff properly, validating the XML somehow, or by skipping records that are broken in some way, so that the function completes properly. Perhaps it should be updated to use a promise or try/always.
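
To illustrate the kind of defence described here, the following is a minimal PHP sketch, not the plugin's actual code. It assumes the report text may contain named HTML entities such as &rsquo; and that the libxml and SimpleXML extensions are available. The helper names xml_safe_text() and is_well_formed_xml(), and the sample report string, are hypothetical.

```php
<?php
// Hypothetical helper: not part of smd_tags or Textpattern core.
function xml_safe_text(string $text): string
{
    // Turn named HTML entities such as &rsquo; into real UTF-8 characters.
    $decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Escape only the characters that are special in XML.
    return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Hypothetical helper: true when the string parses as well-formed XML.
function is_well_formed_xml(string $xml): bool
{
    $previous = libxml_use_internal_errors(true);
    $ok = simplexml_load_string($xml) !== false;
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

// Made-up report text containing an HTML-only entity and a bare ampersand.
$report = 'Tag &rsquo;news&rsquo; linked to Tom & Jerry';

$raw  = '<smd_tags_report>' . $report . '</smd_tags_report>';
$safe = '<smd_tags_report>' . xml_safe_text($report) . '</smd_tags_report>';

var_dump(is_well_formed_xml($raw));  // bool(false)
var_dump(is_well_formed_xml($safe)); // bool(true)
```

Decoding first and escaping second means an entity like &rsquo; reaches the XML as a plain UTF-8 character, so the parser never has to resolve it. The well-formedness check could also be used to skip or log broken records so that the import run still completes.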


#534 2025-01-10 13:33:30

spiridon
Member
From: Mariupol, Ukraine
Registered: 2011-01-30
Posts: 57

I think there is a bug in the TXP core. send_xml_response() is not working correctly. You just need to replace the code for escaping characters with the following:

$v = htmlspecialchars($value, ENT_QUOTES);

and

$value = htmlspecialchars($value, ENT_QUOTES);

and everything will work as expected.

#535 2025-01-10 22:16:26

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

Hmmm, that’s a tricky one to balance. While it’s tempting, there’s the potential issue of double-encoding the quotes. Anyone using this function would need to ensure no such encoding was done beforehand. Can we guarantee that?

I suspect the plugin is at fault and should apply more rigorous tests to each record to guarantee well-formedness prior to sending it to core.
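
The double-encoding concern can be seen in a few lines of plain PHP. This is only a sketch with made-up sample strings. It relies on the documented fourth parameter of htmlspecialchars(), double_encode, and says nothing about how send_xml_response() is actually implemented.

```php
<?php
// Sample value that a caller has already escaped once (made-up data).
$already = 'Tom &amp; Jerry';

// Default behaviour: the existing entity is escaped a second time.
$twice = htmlspecialchars($already, ENT_QUOTES, 'UTF-8');

// With double_encode set to false, existing entities are left alone.
$guarded = htmlspecialchars($already, ENT_QUOTES, 'UTF-8', false);

// The same switch also leaves HTML-only entities such as &rsquo; untouched.
$entity = htmlspecialchars('it&rsquo;s', ENT_QUOTES, 'UTF-8', false);

echo $twice, "\n";   // Tom &amp;amp; Jerry
echo $guarded, "\n"; // Tom &amp; Jerry
echo $entity, "\n";  // it&rsquo;s
```

Note the trade-off: turning double_encode off avoids the doubled &amp;amp;, but it would also pass &rsquo; through unchanged, and that is the entity the XML parser complained about.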


#536 2025-01-26 16:36:38

spiridon
Member
From: Mariupol, Ukraine
Registered: 2011-01-30
Posts: 57

I opened a merge request to fix an out-of-memory error in Chrome during an import with a large number of records.

Regarding the conversion of tag names to the Latin alphabet…
Most major players use Cyrillic in URLs without encoding.
As for whether to encode spaces as %20 or +, both are in use. I would say + is, as of today, the more historical format. I even saw it called “google notation” somewhere. That is, it has a shade of old-worldness, like an aristocrat’s letter.

tru_tags uses google notation, but I think we can use %20.
It seems to work, but I need more time to play with it.
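
For reference, PHP's two standard encoders produce exactly these two notations. The sketch below uses made-up tag names and only core functions, and is not taken from either plugin.

```php
<?php
// Made-up tag names for illustration.
$tag      = 'open source';
$cyrillic = 'новини дня';

// rawurlencode() follows RFC 3986 and encodes a space as %20.
echo rawurlencode($tag), "\n"; // open%20source

// urlencode() uses the older form-style encoding, where a space becomes +.
echo urlencode($tag), "\n";    // open+source

// Both forms round-trip non-Latin tags when paired with the matching decoder.
var_dump(rawurldecode(rawurlencode($cyrillic)) === $cyrillic); // bool(true)
var_dump(urldecode(urlencode($cyrillic)) === $cyrillic);       // bool(true)

// The decoders differ on a literal plus sign: only urldecode() turns it into a space.
var_dump(rawurldecode('open+source')); // string(11) "open+source"
var_dump(urldecode('open+source'));    // string(11) "open source"
```

The last two lines are the practical catch when switching notations: existing links that use + only decode to a space if the reading side still treats + that way.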

#537 2025-01-26 16:38:58

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

spiridon wrote #338913:

I opened a merge request to fix an out-of-memory error in Chrome during an import with a large number of records.

Merged it already, thank you so much :)

Regarding the conversion of tag names to the Latin alphabet…

Should I just ditch the ASCII down-conversion then? Happy to do so.


#538 2025-01-26 16:46:17

spiridon
Member
From: Mariupol, Ukraine
Registered: 2011-01-30
Posts: 57

Bloke wrote #338914:

Should I just ditch the ASCII conversion then? Happy to do so.

With a 99.9% probability, yes.
I just replaced sanitizeForUrl($tag_name) with $tag_name.

It seems to work, but give me a couple more days to check.
I also have a trick with languages – I have a multilingual site. I’ll share my experience a little later.

#539 2025-01-26 16:51:25

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

spiridon wrote #338915:

With a 99.9% probability, yes. I just replaced sanitizeForUrl($tag_name) with $tag_name

That sounds like a plan. I might do it anyway now, and I’ll release an updated version when you’ve had time to play some more.

I also have a trick with languages – I have a multilingual site. I’ll share my experience a little later.

Oooh, that sounds intriguing.


#540 2025-01-26 17:01:58

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,560

I presume it’s safe to just remove all 8 occurrences of sanitizeForUrl()? I don’t think it impacts backwards compatibility, because any tags (and their parents?) that have already been created will still match when clicked, it’s just any new ones won’t be converted. Umm, I think…

EDIT: I’m also thinking it might be worth getting rid of the strtolower() too. Who cares if tags are mixed case? The only decision then is whether to match them in a case-insensitive manner all the time (e.g. when matching via the URL) or whether to only match them that way if the database table collation is case sensitive.

I think matching regardless of case is the sensible thing to do because you’d get pretty annoyed if you searched tag=england and it came up with nothing because it was stored in the DB as tag=England.
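
A case-insensitive comparison on the PHP side might look like the sketch below. It assumes the mbstring extension is loaded, and tags_match() is a hypothetical helper rather than an existing smd_tags function. The multibyte function matters once tags are no longer converted to ASCII, because non-Latin tags need Unicode-aware case folding.

```php
<?php
// Hypothetical helper: compares two tag names regardless of case.
// Assumes the mbstring extension is available.
function tags_match(string $requested, string $stored): bool
{
    return mb_strtolower($requested, 'UTF-8') === mb_strtolower($stored, 'UTF-8');
}

var_dump(tags_match('england', 'England')); // bool(true)
var_dump(tags_match('київ', 'Київ'));       // bool(true)
var_dump(tags_match('england', 'ireland')); // bool(false)
```

If the comparison happens in the database instead, the same result depends on the table collation, as discussed above.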

Last edited by Bloke (2025-01-26 17:15:24)

