Textpattern CMS support forum
[request] Alternate Search Behavior
I’m finding Textpattern’s default search behavior to be too limiting for a specific site we’re developing. I wonder if anyone knows of existing plugins that provide additional search options/capabilities… Searching/googling didn’t turn up anything obvious for me. (I found wet_haystack, which allows you to tweak the index – but not the actual search algorithm).
Re: [request] Alternate Search Behavior
Can you be more specific about what you’d like the search algorithm to do differently from the currently used algorithm?
Re: [request] Alternate Search Behavior
Sure, Ruud. We’re developing a “help center” and the internal site search algorithm is too limiting. For example, the unquoted search phrase this is a test matches only if all four words appear in the article title or body, and only if they appear in that exact order.
I would like it to work more like a basic google search. For example, if the search phrase is entered in quotes (“this is a test”), then an exact match would be required. Otherwise, a simple boolean AND would be sufficient. (As long as each individual word in the search phrase appears at least once, in any order, a potential match occurs.)
Also, filtering out of common “noise” words (“a”, “and”, “of”, “the”, “is”, etc.) would be nice.
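To make the request concrete, here is a rough sketch of the behavior described above, written in Python purely for illustration (it is not Textpattern code). The function names and the noise-word list are hypothetical, and matching is by case-insensitive substring, so "test" would also match "testing".

```python
# Hypothetical noise-word list; the post above only names a few examples.
NOISE_WORDS = {"a", "and", "of", "the", "is"}


def parse_search_phrase(phrase):
    """Return (mode, terms) for a raw search phrase.

    mode is "exact" for a phrase wrapped in double quotes,
    otherwise "all" (a simple boolean AND over the words).
    """
    # Trim the phrase and collapse internal runs of white space.
    phrase = " ".join(phrase.split())

    # A quoted phrase must match exactly, as one term.
    if len(phrase) >= 2 and phrase[0] == '"' and phrase[-1] == '"':
        inner = phrase[1:-1].strip()
        return "exact", [inner] if inner else []

    # Otherwise each remaining word must appear, in any order.
    words = phrase.split(" ") if phrase else []
    terms = [w for w in words if w.lower() not in NOISE_WORDS]
    # If every word was a noise word, fall back to the unfiltered words.
    return "all", terms or words


def matches(text, mode, terms):
    """Check one article text against the parsed terms (substring match)."""
    haystack = text.lower()
    # An exact phrase is a single term, so the same check covers both modes.
    return bool(terms) and all(t.lower() in haystack for t in terms)
```

With this sketch, the unquoted phrase this is a test is reduced to the terms "this" and "test", so an article containing "A test of this feature" would match, while the quoted form would not.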
From a quick look at the code, the main limitation seems to be caused by use of RLIKE in the WHERE clause of the query. I haven’t looked at it too deeply, so maybe I am missing something and the behavior is more tunable than I think?
Re: [request] Alternate Search Behavior
- keywords that match 50+% of articles are ignored
- keywords by default must be 4 chars or longer (unless you override this in MySQL)
- stopwords (if configured) are ignored.
Last edited by ruud (2008-12-27 22:21:38)
Re: [request] Alternate Search Behavior
Sounds like a job for a plugin then. Is it possible to override the search code from a plugin? Any pointers in terms of best approach to take with this? Thanks.
Re: [request] Alternate Search Behavior
No way to override: either duplicate or hack.
Re: [request] Alternate Search Behavior
OK, I have this working. I have added a new “behavior” attribute to the search_input tag:
<txp:search_input behavior="strict" wraptag="p" />
Strict behavior is the same as the current behavior, and it is the default if the behavior attribute is omitted.
<txp:search_input behavior="permissive" wraptag="p" />
Permissive behavior is essentially what I described earlier in the thread. If the search term is enclosed in quotes, then this has the same effect as strict behavior. Otherwise, any articles matching all of the individual search terms, in any order, will be potential matches.
I will post the patch after doing a bit more testing. Feedback on this approach is welcome.
One nice benefit of this implementation approach is that future search behaviors may be added without breaking existing search behaviors.
Last edited by artagesw (2008-12-28 01:54:47)
Re: [request] Alternate Search Behavior
artagesw, hey, I like this approach. Nice one. I’d like to see the patch and try it out sometime if you don’t mind. I might be able to roll something similar into smd_fuzzy_find to make the searches better. Thanks.
Last edited by Bloke (2008-12-28 09:49:27)
Re: [request] Alternate Search Behavior
OK, I have refined this a bit and tested it some more. I have renamed the new attribute to be “match” and it can have one of three possible values:
<txp:search_input match="exact" wraptag="p" />
This gives you the current Txp search behavior, and it is the default if the match attribute is omitted. There are some minor improvements to the current behavior (trim leading/trailing spaces, collapse white space, use MySQL “LIKE” rather than “RLIKE” for efficiency, escape special characters, etc.)
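For anyone curious what those clean-up steps amount to, here is a minimal sketch in Python. It is an illustration only, not the patch itself: the function names are hypothetical, and it assumes MySQL's default LIKE escape character (backslash) and that the resulting pattern is passed to the query as a bound parameter.

```python
def normalize_phrase(phrase):
    """Trim leading/trailing spaces and collapse runs of white space."""
    return " ".join(phrase.split())


def escape_like(term, escape_char="\\"):
    """Escape LIKE wildcards so that % and _ are matched literally."""
    # The escape character itself must be doubled first, otherwise the
    # backslashes added for % and _ would be escaped a second time.
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def like_pattern(term):
    """Build a 'contains' pattern for use with LIKE."""
    return "%" + escape_like(term) + "%"
```

For example, like_pattern(normalize_phrase("  50%   off ")) yields a pattern that finds the literal text "50% off" rather than treating the percent sign as a wildcard.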
<txp:search_input match="any" wraptag="p" />
When match = “any”, any articles containing at least one of the individual search terms will be potential matches.
<txp:search_input match="all" wraptag="p" />
When match = “all”, any articles containing all of the individual search terms, in any order, will be potential matches.
Regardless of the value of the match attribute, if the search phrase is quoted, then the exact match methodology is used.
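The three modes could be turned into a WHERE clause roughly as follows. This is a sketch in Python for illustration, not the patch: build_where is a hypothetical name, the Title and Body column names and the %s placeholder style are assumptions, and each pattern is meant to be passed as a bound parameter.

```python
def build_where(phrase, match="exact", columns=("Title", "Body")):
    """Return (sql_fragment, params) for a search phrase.

    match is "exact", "any" or "all"; a quoted phrase always
    falls back to "exact", whatever match says.
    """
    if match not in ("exact", "any", "all"):
        raise ValueError("match must be 'exact', 'any' or 'all'")

    # Trim and collapse white space before inspecting the phrase.
    phrase = " ".join(phrase.split())
    if len(phrase) >= 2 and phrase.startswith('"') and phrase.endswith('"'):
        phrase = phrase[1:-1].strip()
        match = "exact"

    # Exact mode searches for the whole phrase; the others split into words.
    terms = [phrase] if match == "exact" else phrase.split(" ")
    terms = [t for t in terms if t]
    if not terms:
        return "1 = 0", []  # nothing to search for: match no rows

    def like_pattern(term):
        # Escape the backslash first, then the LIKE wildcards.
        escaped = (
            term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        )
        return "%" + escaped + "%"

    clauses, params = [], []
    for term in terms:
        # A term counts as present if it occurs in any searched column.
        per_column = " OR ".join(f"{col} LIKE %s" for col in columns)
        clauses.append("(" + per_column + ")")
        params.extend([like_pattern(term)] * len(columns))

    # "any" ORs the per-term clauses together; "all" (and the single
    # clause produced by "exact") ANDs them.
    joiner = " OR " if match == "any" else " AND "
    return joiner.join(clauses), params
```

Calling build_where("this test", "all") gives one parenthesised clause per word joined with AND, while the same call with a quoted phrase collapses to a single clause for the whole phrase.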
There does not seem to be a way to attach a file to a forum post. What’s the best way to post a patch file for public consumption?
Last edited by artagesw (2008-12-28 20:58:50)
Re: [request] Alternate Search Behavior
There’s a mailing list.
Re: [request] Alternate Search Behavior
wet wrote:
There’s a mailing list.
Yes, I have posted patches to the dev list and can post this one there as well. But I imagine most txp users do not subscribe to it.
Re: [request] Alternate Search Behavior
Then I’d suggest pasting it as code (bc..) here, uploading it to some public space, or using pastebin.