Textpattern CMS support forum

You are not logged in. Register | Login | Help

#301 2014-10-13 20:34:12

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 989
Website

Re: smd_query: Talk to the database directly via SQL

Thank you, Ruud!

I was able to adapt the PHP you wrote to my template.

I suppose you did not see any security hazards in using keywords from Textpattern’s search input (user input) with smd_query in this way?

As I noted in my post above, I wanted to make the search a little more robust than simply matching the literal search query. My revised version will now find the profile for “Daniel T. Example” when someone searches for “Daniel Example” or “Dan Example”. What I did was transform the raw search query so that regular expression search recognize all the distinct words:

<txp:variable name="author_search_query"><txp:pax_grep from="'\s*'" to="(.*)"><txp:page_url type="q"/></txp:pax_grep></txp:variable>
<txp:smd_query
  break="li"
  column="name, RealName"
  limit="3"
  table="txp_users"
  where='RealName REGEXP "<txp:php>echo doSlash(variable(array("name" => "author_search_query")));</txp:php>"'
  wraptag="ul">
  <a href='/authors/{name}'>Articles by {RealName}</a>
</txp:smd_query>

The only “problem” with the above code is that searching for “and” would turn up any author’s named “Alexander”, “Andrea”, or “Roland”; and searches for “the” would match any author’s named “Christopher”, “Nathaniel”, or “Stephen”. I’m not sure that’s a big deal—searching for single words like that seems unlikely. I imagine I could eliminate spurious results like this by triggering the smd_query if and only if there are multiple search terms, but I don’t know a simple way to do that.

My central concern is whether using input from Textpattern’s search query URL in a database query like this. I don’t know how to test it for vulnerabilities: I tried searching for "; SELECT id from textpattern where 1, and smd_query seemed to treat that in an appropriate way, without executing the SELECT separately.

That’s encouragin, but I wanted to check with wiser minds before publishing it.

Thank you again!

Offline

#302 2014-10-14 07:15:16

ruud
Developer emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

Try replacing to="(.*)" with to="(.*) " and start the REGEX with (^| ).

As long as you doSlash the stuff you use in SQL queries you’re safe from SQL injections.

Offline

#303 2014-10-14 18:42:32

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 989
Website

Re: smd_query: Talk to the database directly via SQL

ruud wrote #284783:

As long as you doSlash the stuff you use in SQL queries you’re safe from SQL injections.

Thank you so much for your help! That’s exactly what I needed.

Try replacing to="(.*)" with to="(.*) " and start the REGEX with (^| ).

When I replace to="(.*)" with to="(.*) ", I no longer get any matches, not even a search for the exact string that is the RealName. Same if I use to="(.*)\s".

I think it might be due to the greedy repetition in the expression. In other applications that support regular expressions, like text editors and preg_replace(), I often use the expression (.*?) to match the minimal amount of text for the pattern, but the question mark doesn’t seem to work in MySQL’s REGEXP function.

I don’t know if this is resolvable, but the current version is a lot better at finding relevant results for author name searches than it was before, and I’m very happy about that!

Thank you again!

Offline

#304 2014-10-14 19:08:36

ruud
Developer emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

Hmm… okay, forgot about the greediness, so try replacing to="(.*)" with to="[^[:blank:]]* ". That should do the trick.

Offline

#305 2014-10-14 19:22:12

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 989
Website

Re: smd_query: Talk to the database directly via SQL

Thanks. When I replace to="(.*)" with to="[^[:blank:]]* ", again I don’t get any matches, even for the exact string.

So when I search for “Dan Example”, <txp:page_url type="q"/> returns Dan Example, and the author_search_query variable would transform it into Dan(.*)Example, which matches the author entry “Daniel T. Example”.

But when I use to="[^[:blank:]]* " instead of to="(.*)", searches for “Dan Example”, “Daniel Example”, and even “Daniel T. Example” match nothing.

I appreciate your guidance!

Offline

#306 2014-10-15 10:55:20

ruud
Developer emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

Hmm.. then I don’t see a way to solve this (excluding fulltext boolean MySQL search).

Offline

#307 2014-10-15 13:18:59

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 8,813
Website

Re: smd_query: Talk to the database directly via SQL

johnstephens wrote #284774:

<txp:variable name="author_search_query"><txp:pax_grep from="'\s*'" to="(.*)"><txp:page_url type="q"/></txp:pax_grep></txp:variable>...

I’m not entirely sure, but I think perhaps your from is wrong. When I tried it, the replacement string was inserted between every character. This, however, worked:

<txp:variable name="author_search_query"><txp:pax_grep
  from="'[\s]+'"
  to="(.*)?"><txp:page_url type="q"/></txp:pax_grep></txp:variable>

TEST VALUE: <txp:variable name="author_search_query" />

<txp:smd_query
  break="li"
  column="name, RealName"
  limit="3"
  table="txp_users"
  where='RealName REGEXP "<txp:php>echo doSlash(variable(array("name" => "author_search_query")));</txp:php>"'
  wraptag="ul">
  <a href='/authors/{name}'>Articles by {RealName}</a>
</txp:smd_query>

Matching one or more spaces versus your original zero or more seemed to do the trick. I might have missed some subtleties though, as I don’t have many users in my test installation. P.S. adding debug="1" to your smd_query tag will show you the query that the plugin is running, which can sometimes highlight weirdness and any quoting / doSlash() issues.

That said, I was hoping you’d be able to use smd_query’s url features to do this. Using urlfilter="/[\s]+/" urlreplace="(.*)?" ought to work. You’d benefit from the url param being automatically run through doSlash inside the plugin too which means you could just use ?q directly in your where attribute. Sadly, a slight oversight in the code means it won’t work that way — even in my in-development v0.60 at present.

The problem is that using q as a url param is ambiguous because Txp has an internal variable inside $pretext with the same name. Since url params are one of the lowest order citizens inside smd_query, they get processed last. Thus you can filter the param all you like, as soon as the smd_query parser sees pretext['q'] it uses that value instead of your processed url value, negating all the filtering you put in place. Drat.

I’ll need to fix that at some point so thanks, in a roundabout fashion, for highlighting this bug through your excellent bending of the plugin!


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

Offline

#308 2014-10-15 15:24:35

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 989
Website

Re: smd_query: Talk to the database directly via SQL

Bloke wrote #284809:

I’m not entirely sure, but I think perhaps your from is wrong. When I tried it, the replacement string was inserted between every character.

Yes, I noticed that yesterday, I just didn’t have time to update the thread. (Making my assumption explicit in the above post prompted me to actually test it.) Thank you so much for the fix! It works like magic, and I added a space after the question mark in the to attribute of pax_grep as in Ruud’s example above.

Thank you both for your valiant support!

Last edited by johnstephens (2014-10-15 15:24:47)

Offline

#309 2014-10-15 16:20:57

candyman
Member
From: Italy
Registered: 2006-08-08
Posts: 684

Re: smd_query: Talk to the database directly via SQL

I’ve tried EJW’s search engine: impressive.
It would be an awesome TXP tip.

Last edited by candyman (2014-10-15 16:21:08)

Offline

#310 2015-08-11 18:09:01

jakob
Admin
From: Germany
Registered: 2005-01-20
Posts: 3,582
Website

Re: smd_query: Talk to the database directly via SQL

Hi Stef,

I think I have run into a problem with the calculation of the total {smd_allrows} when using the limit and offset attributes. Then again it could be a pebcak :-)

I’m using smd_query to query a non-txp database and using rah_swap to temporarily switch databases for the query. I’ve built a little filtering system with WHERE-query chunks and it works great except for pagination. smd_query’s own pageform won’t work in this case because smd_query tries to retrieve it from the txp_form table, which is not present in the temporarily switched over db-connection (non-txp). I realise there’s no way around that.

However, you handily provide {smd_allrows}, {smd_rows}, {smd_pages}, {smd_thispage}, and {smd_cursor} in the normal output, so I collect those once with the first record and save them in txp variables which can then be used to manually construct a numbered pagination list outside of the smd_query tag. The values for “items_per_page” and “page” attributes are passed as url variables (or else default values are used) and the corresponding values for the limit attribute (= items_per_page) and start offset row (= (page - 1) * items_per_page) are calculated and passed to smd_query as variables. I’m using the attribute pagevar="page" to track a custom paging urlvar and I’m not using mysql LIMIT to otherwise control output.

This is the basic tag setup (excuse the complex query):

<txp:variable name="matches_total" value="" />
<txp:rah_swap link="members_list">
  <txp:smd_query 
    query='SELECT LUM_User.UserID, LUM_User.RoleID, LUM_UserExtended.Title, LUM_UserExtended.FirstName, LUM_UserExtended.LastName, LUM_UserExtended.CompanyURL, LUM_UserExtended.Company, LUM_UserExtended.Postcode, LUM_UserExtended.City, LUM_UserExtended.Country, LUM_UserExtended.ShortDescription, LUM_UserExtended.Category, LUM_UserExtended.CheckCertificate, LUM_UserExtended.Certificate
  FROM `LUM_User`
  INNER JOIN `LUM_UserExtended`
  ON LUM_User.UserID=LUM_UserExtended.UserID
  WHERE
  (
       1 = 1
       <txp:variable name="query_name" />
       <txp:variable name="query_bereich" />
       <txp:variable name="query_plz" />
  )
  AND `RoleID` IN (30,31,40,41)
  <txp:variable name="query_sort" />'
    wraptag=""
    break=""
    breakclass=""
    html_id=""
    class=""
    label=""
    labeltag=""
    limit='<txp:variable name="query_limit" />'
    offset='<txp:variable name="query_offset" />'
    pagevar="page"
    debug="2"
    >
    … my output code …

    <txp:hide>+++ on first record, store pagination infos +++</txp:hide>
    <txp:if_variable name="matches_total" value="">
      <txp:variable name="matches_total">{smd_allrows}</txp:variable>
      <txp:variable name="matches_thispage_start">{smd_cursor}</txp:variable>
      <txp:variable name="matches_thispage">{smd_rows}</txp:variable>
      <txp:variable name="pages_total">{smd_pages}</txp:variable>
      <txp:variable name="pages_thispage">{smd_thispage}</txp:variable>
    </txp:if_variable>
  </txp:smd_query>
</txp:rah_swap>

Using debug="2" I see my query and all 244 matching records.

I used the txpvars to calculate the pagination and it is initially computed correctly and works nicely, however…

… as I page through the results, the value for {smd_allrows} reduces by the number of the row-offset, e.g. with {smd_rows} = 15, {smd_allrows} returns 229 on page 2, 214 on page 3 and so on for the corresponding items.
That causes the pagination to recalculate, but also curtails the output – at least I think so: If I have offset=“230” limit=“15” that should show the last 14 items but it returns none. Meanwhile, the debug info still lists all 244 matching records. I don’t see the {smd_…} replacement values in the debug output.

As soon as I remove pagevar="page", I can correctly see the entries, e.g. offset=“230” limit=“15” works and shows me the last 14 records, but all values are calculated from that point onwards, listed as 1 to 14 (presumably as smd_query no longer knows which page it is on, it assumes it is page 1). Now the {smd_…} infos do show in the debug output.

Do I have a thinking mistake? Or is the $replacements[] array items doing something different to the $repagement[] array items used for the page form.

I can probably work around this by tracking my own paging and calculating offset manually, but smd_query is so powerful, it’s more likely I’ve made a thinking mistake. Any ideas?


TXP Builders – finely-crafted code, design and txp

Offline

Board footer

Powered by FluxBB