Textpattern CMS support forum

#1 2008-12-27 20:38:20

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

[request] Alternate Search Behavior

I’m finding Textpattern’s default search behavior to be too limiting for a specific site we’re developing. I wonder if anyone knows of existing plugins that provide additional search options/capabilities… Searching/googling didn’t turn up anything obvious for me. (I found wet_haystack, which allows you to tweak the index – but not the actual search algorithm).

Offline

#2 2008-12-27 20:58:10

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: [request] Alternate Search Behavior

Can you be more specific about what you’d like the search algorithm to do differently from the current one?

Offline

#3 2008-12-27 21:09:18

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: [request] Alternate Search Behavior

Sure, Ruud. We’re developing a “help center” and the internal site search algorithm is too limiting. For example, the search phrase this is a test (entered without quotes) only matches if all four words appear in the article title or body, and only in that exact order.

I would like it to work more like a basic google search. For example, if the search phrase is entered in quotes (“this is a test”), then an exact match would be required. Otherwise, a simple boolean AND would be sufficient. (As long as each individual word in the search phrase appears at least once, in any order, a potential match occurs.)

Also, filtering out of common “noise” words (“a”, “and”, “of”, “the”, “is”, etc.) would be nice.
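
A minimal sketch of the query handling described above, written in Python rather than Textpattern's PHP and not taken from any actual patch. The function name `parse_query`, the `NOISE_WORDS` set, and the returned mode labels are hypothetical; only straight double quotes are recognised as phrase delimiters.

```python
# Hypothetical noise-word list; the thread only names these five examples.
NOISE_WORDS = {"a", "and", "of", "the", "is"}


def parse_query(raw, noise_words=NOISE_WORDS):
    """Split a raw search string into a match mode and a list of terms.

    A phrase wrapped in double quotes is kept whole and must match exactly.
    Anything else is split into words, noise words are dropped, and every
    remaining word must appear somewhere (boolean AND, any order).
    """
    # Trim the ends and collapse runs of whitespace.
    text = " ".join(raw.split())

    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        phrase = text[1:-1].strip()
        return ("exact", [phrase] if phrase else [])

    terms = [word for word in text.split() if word.lower() not in noise_words]
    return ("all", terms)
```

With this sketch, the unquoted phrase this is a test reduces to the two terms this and test, while the quoted form is passed through as a single phrase. A query made up only of noise words yields an empty term list, which the caller would need to handle.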

From a quick look at the code, the main limitation seems to be caused by use of RLIKE in the WHERE clause of the query. I haven’t looked at it too deeply, so maybe I am missing something and the behavior is more tunable than I think?

Offline

#4 2008-12-27 22:20:19

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: [request] Alternate Search Behavior

No, it’s not really tunable unless you change that part of the code. You could change it to use MySQL Boolean FULLTEXT search, but there are three issues to consider:
  • keywords that match 50+% of articles are ignored
  • keywords by default must be 4 chars or longer (unless you override this in MySQL)
  • stopwords (if configured) are ignored.
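
For illustration, here is a rough Python sketch of how a Boolean FULLTEXT query of the kind mentioned above could be assembled. The helper name `boolean_fulltext_query` is hypothetical, the table and column names (`textpattern`, `ID`, `Title`, `Body`) are assumptions about the schema, and the `%s` placeholder follows the style of common Python MySQL drivers. It also assumes a FULLTEXT index exists on the listed columns. As far as I recall from the MySQL manual, the 50% threshold applies to natural-language searches on MyISAM rather than to boolean mode, so that point is worth checking against the server version in use.

```python
import re

# Characters that act as operators in MySQL boolean-mode search strings.
BOOLEAN_OPERATORS = re.compile(r'[+\-<>()~*"@]')


def boolean_fulltext_query(terms, table="textpattern", columns=("Title", "Body")):
    """Return (sql, params) requiring every term via MySQL boolean mode.

    Table and column names are assumptions; they are interpolated directly,
    so they must come from trusted code, never from user input.
    """
    # Replace operator characters with spaces so user input cannot alter
    # the boolean expression, then re-split into plain words.
    words = BOOLEAN_OPERATORS.sub(" ", " ".join(terms)).split()

    # A leading "+" marks a word as required (boolean AND).
    expression = " ".join("+" + word for word in words)

    sql = (
        f"SELECT ID, Title FROM {table} "
        f"WHERE MATCH ({', '.join(columns)}) AGAINST (%s IN BOOLEAN MODE)"
    )
    return sql, (expression,)
```

Words shorter than the server's minimum word length and configured stopwords would still be ignored by MySQL itself, so a term such as is would not restrict the result even though the expression marks it as required.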

Last edited by ruud (2008-12-27 22:21:38)

Offline

#5 2008-12-27 22:59:10

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: [request] Alternate Search Behavior

Sounds like a job for a plugin then. Is it possible to override the search code from a plugin? Any pointers in terms of best approach to take with this? Thanks.

Offline

#6 2008-12-27 23:09:53

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: [request] Alternate Search Behavior

No way to override: either duplicate or hack.

Offline

#7 2008-12-28 01:52:44

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: [request] Alternate Search Behavior

OK, I have this working. I have added a new “behavior” attribute to the search_input tag:

<txp:search_input behavior="strict" wraptag="p" />

Strict behavior is the same as the current behavior, and it is the default if the behavior attribute is omitted.

<txp:search_input behavior="permissive" wraptag="p" />

Permissive behavior is essentially what I described earlier in the thread. If the search term is enclosed in quotes, then this has the same effect as strict behavior. Otherwise, any articles matching all of the individual search terms, in any order, will be potential matches.

I will post the patch after doing a bit more testing. Feedback on this approach is welcome.

One nice benefit of this implementation approach is that future search behaviors may be added without breaking existing search behaviors.

Last edited by artagesw (2008-12-28 01:54:47)

Offline

#8 2008-12-28 09:49:08

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,447
Website GitHub

Re: [request] Alternate Search Behavior

artagesw

Hey I like this approach. Nice one. I’d like to see the patch and try it out sometime if you don’t mind. I might be able to roll something similar into smd_fuzzy_find to make the searches better. Thanks.

Last edited by Bloke (2008-12-28 09:49:27)


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

Offline

#9 2008-12-28 20:30:01

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: [request] Alternate Search Behavior

OK, I have refined this a bit and tested it some more. I have renamed the new attribute to be “match” and it can have one of three possible values:

<txp:search_input match="exact" wraptag="p" />

This gives you the current Txp search behavior, and it is the default if the match attribute is omitted. There are some minor improvements to the current behavior (trim leading/trailing spaces, collapse white space, use MySQL “LIKE” rather than “RLIKE” for efficiency, escape special characters, etc.)

<txp:search_input match="any" wraptag="p" />

When match = “any”, any articles containing at least one of the individual search terms will be potential matches.

<txp:search_input match="all" wraptag="p" />

When match = “all”, any articles containing all of the individual search terms, in any order, will be potential matches.

Regardless of the value of the match attribute, if the search phrase is quoted, then the exact match methodology is used.
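
The three modes can be sketched roughly as follows. This is a Python illustration of the described behavior, not the patch itself: the function names `escape_like` and `build_where`, the column names `Title` and `Body`, and the driver-style `%s` placeholders are all assumptions.

```python
def escape_like(term):
    """Escape the backslash and the LIKE wildcards % and _ in a term."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_where(query, match="exact", columns=("Title", "Body")):
    """Return (where_clause, params) for match = "exact", "any" or "all"."""
    if match not in ("exact", "any", "all"):
        raise ValueError("match must be 'exact', 'any' or 'all'")

    # Trim leading/trailing spaces and collapse internal white space.
    text = " ".join(query.split())

    # A quoted phrase forces exact matching regardless of the match value.
    quoted = len(text) >= 2 and text[0] == '"' and text[-1] == '"'
    if quoted:
        text = text[1:-1].strip()

    terms = [text] if (quoted or match == "exact") else text.split()
    terms = [term for term in terms if term]
    if not terms:
        return "", []

    # Each term may match in any of the searched columns.
    per_term = "(" + " OR ".join(f"{col} LIKE %s" for col in columns) + ")"
    joiner = " OR " if match == "any" else " AND "

    clause = joiner.join([per_term] * len(terms))
    params = []
    for term in terms:
        params.extend(["%" + escape_like(term) + "%"] * len(columns))
    return clause, params
```

Calling `build_where("this test", "all")` produces two parenthesised groups joined by AND, one per term, while `build_where('"this test"', "any")` falls back to a single group for the whole phrase.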

There does not seem to be a way to attach a file to a forum post. What’s the best way to post a patch file for public consumption?

Last edited by artagesw (2008-12-28 20:58:50)

Offline

#10 2008-12-28 20:58:33

wet
Developer Emeritus
From: Schoerfling, Austria
Registered: 2005-06-06
Posts: 3,330
Website Mastodon

Re: [request] Alternate Search Behavior

There’s a mailing list.

Offline

#11 2008-12-28 21:00:17

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: [request] Alternate Search Behavior

wet wrote:

There’s a mailing list.

Yes, I have posted patches to the dev list and can post this one there as well. But I imagine most txp users do not subscribe to it.

Offline

#12 2008-12-28 21:07:32

wet
Developer Emeritus
From: Schoerfling, Austria
Registered: 2005-06-06
Posts: 3,330
Website Mastodon

Re: [request] Alternate Search Behavior

Then I’d suggest pasting it as code (bc.. ) here, uploading it to some public space, or using pastebin.

Offline
