Go to main content

Textpattern CMS support forum

You are not logged in. Register | Login | Help

#286 2012-07-07 14:17:11

Gocom
Developer Emeritus
From: Helsinki, Finland
Registered: 2006-07-14
Posts: 4,533
Website

Re: smd_query: Talk to the database directly via SQL

piyashrija wrote:

I am new to textpattern and thought if i pass future id in article tag it should display future article as well.

Article tag is context sensitive, meaning that it show articles paged on given context, a requested page. It doesn’t have id attribute. Please see article_custom instead.

As far as that particular code sample goes, smd_query isn’t needed for that. That’s equivalent to, and requires no more than:

<txp:article_custom time="any" section="event">
	<txp:title />
</txp:article_custom>

Which is what Uli was referring to in his reply to you.

Last edited by Gocom (2012-07-07 14:18:50)

Offline

#287 2014-10-10 15:26:07

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 1,000
Website

Re: smd_query: Talk to the database directly via SQL

Question about sanitizing user input

I operate a site for an academic journal, and one feature request I’m working on is to improve the site search so that searching for an author’s name turns up useful results. Because an individual article may have multiple authors, I use a custom field to tie each article to the relevant author info wrangled by smd_bio, and searching for an author name doesn’t usually turn up relevant articles in the anywhere in first page of search results, even with wet_haystack running.

So here’s my idea: Compare the search query against names in the author table, and if there are hits, list the links to those author profile pages, which list all of an author’s articles.

Here’s the code I used, which seems to work:

<txp:smd_query
  break="li"
  column="name, RealName"
  limit="3"
  table="txp_users"
  where='RealName REGEXP "^(.*)<txp:page_url type="q"/>(.*)$"'
  wraptag="ul">
  <a href='/authors/{name}'>Articles by {RealName}</a>
</txp:smd_query>

There are several things I’d like to improve about it, but first I want to verify that this method isn’t a reckless fiasco: The SQL query takes user input. Is there anything I need to do, and I can do, to sanitize the input and make sure the query is secure?

This query will only match names in the author table if they contain the exact string in the search query, so searching for “Dan Tucker” or “Daniel Tucker” won’t return the profile for “Daniel B. Tucker”. One of the refinements I have in mind is running the search query through rah_replace before piping it into the query, to add a (.*) between distinct search keywords. That would allow both of those searches to return the right profile.

Another refinement I have in mind is to only run the author table query if there are multiple search keywords: I don’t want to try to guess an author name if someone searches for “Dan”, or other innocuous words that might be part of a name—“And”, or “The”, etc.

I only mention the refinements I’m thinking about in case they might bear on the security question.

Thanks in advance! I’m grateful for any wisdom you might offer!

Last edited by johnstephens (2014-10-10 15:26:55)

Offline

#288 2014-10-11 11:01:57

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

where='RealName REGEXP "^(.*)<txp:page_url type="q"/>(.*)$"'

Is the same as:

where='RealName REGEXP "<txp:page_url type="q"/>"'

Can probably be done faster as:

where='INSTR(RealName, "<txp:page_url type="q"/>")'

Which is the same as:

where='INSTR(RealName, "<txp:php>page_url(array("type" => "q")</txp:php>")'

But needs SQL escaping:

where='INSTR(RealName, "<txp:php>doSlash(page_url(array("type" => "q"))</txp:php>")'

Offline

#289 2014-10-13 20:34:12

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 1,000
Website

Re: smd_query: Talk to the database directly via SQL

Thank you, Ruud!

I was able to adapt the PHP you wrote to my template.

I suppose you did not see any security hazards in using keywords from Textpattern’s search input (user input) with smd_query in this way?

As I noted in my post above, I wanted to make the search a little more robust than simply matching the literal search query. My revised version will now find the profile for “Daniel T. Example” when someone searches for “Daniel Example” or “Dan Example”. What I did was transform the raw search query so that regular expression search recognize all the distinct words:

<txp:variable name="author_search_query"><txp:pax_grep from="'\s*'" to="(.*)"><txp:page_url type="q"/></txp:pax_grep></txp:variable>
<txp:smd_query
  break="li"
  column="name, RealName"
  limit="3"
  table="txp_users"
  where='RealName REGEXP "<txp:php>echo doSlash(variable(array("name" => "author_search_query")));</txp:php>"'
  wraptag="ul">
  <a href='/authors/{name}'>Articles by {RealName}</a>
</txp:smd_query>

The only “problem” with the above code is that searching for “and” would turn up any author’s named “Alexander”, “Andrea”, or “Roland”; and searches for “the” would match any author’s named “Christopher”, “Nathaniel”, or “Stephen”. I’m not sure that’s a big deal—searching for single words like that seems unlikely. I imagine I could eliminate spurious results like this by triggering the smd_query if and only if there are multiple search terms, but I don’t know a simple way to do that.

My central concern is whether using input from Textpattern’s search query URL in a database query like this. I don’t know how to test it for vulnerabilities: I tried searching for "; SELECT id from textpattern where 1, and smd_query seemed to treat that in an appropriate way, without executing the SELECT separately.

That’s encouragin, but I wanted to check with wiser minds before publishing it.

Thank you again!

Offline

#290 2014-10-14 07:15:16

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

Try replacing to="(.*)" with to="(.*) " and start the REGEX with (^| ).

As long as you doSlash the stuff you use in SQL queries you’re safe from SQL injections.

Offline

#291 2014-10-14 18:42:32

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 1,000
Website

Re: smd_query: Talk to the database directly via SQL

ruud wrote #284783:

As long as you doSlash the stuff you use in SQL queries you’re safe from SQL injections.

Thank you so much for your help! That’s exactly what I needed.

Try replacing to="(.*)" with to="(.*) " and start the REGEX with (^| ).

When I replace to="(.*)" with to="(.*) ", I no longer get any matches, not even a search for the exact string that is the RealName. Same if I use to="(.*)\s".

I think it might be due to the greedy repetition in the expression. In other applications that support regular expressions, like text editors and preg_replace(), I often use the expression (.*?) to match the minimal amount of text for the pattern, but the question mark doesn’t seem to work in MySQL’s REGEXP function.

I don’t know if this is resolvable, but the current version is a lot better at finding relevant results for author name searches than it was before, and I’m very happy about that!

Thank you again!

Offline

#292 2014-10-14 19:08:36

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

Hmm… okay, forgot about the greediness, so try replacing to="(.*)" with to="[^[:blank:]]* ". That should do the trick.

Offline

#293 2014-10-14 19:22:12

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 1,000
Website

Re: smd_query: Talk to the database directly via SQL

Thanks. When I replace to="(.*)" with to="[^[:blank:]]* ", again I don’t get any matches, even for the exact string.

So when I search for “Dan Example”, <txp:page_url type="q"/> returns Dan Example, and the author_search_query variable would transform it into Dan(.*)Example, which matches the author entry “Daniel T. Example”.

But when I use to="[^[:blank:]]* " instead of to="(.*)", searches for “Dan Example”, “Daniel Example”, and even “Daniel T. Example” match nothing.

I appreciate your guidance!

Offline

#294 2014-10-15 10:55:20

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: smd_query: Talk to the database directly via SQL

Hmm.. then I don’t see a way to solve this (excluding fulltext boolean MySQL search).

Offline

#295 2014-10-15 13:18:59

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 12,497
Website GitHub

Re: smd_query: Talk to the database directly via SQL

johnstephens wrote #284774:

<txp:variable name="author_search_query"><txp:pax_grep from="'\s*'" to="(.*)"><txp:page_url type="q"/></txp:pax_grep></txp:variable>...

I’m not entirely sure, but I think perhaps your from is wrong. When I tried it, the replacement string was inserted between every character. This, however, worked:

<txp:variable name="author_search_query"><txp:pax_grep
  from="'[\s]+'"
  to="(.*)?"><txp:page_url type="q"/></txp:pax_grep></txp:variable>

TEST VALUE: <txp:variable name="author_search_query" />

<txp:smd_query
  break="li"
  column="name, RealName"
  limit="3"
  table="txp_users"
  where='RealName REGEXP "<txp:php>echo doSlash(variable(array("name" => "author_search_query")));</txp:php>"'
  wraptag="ul">
  <a href='/authors/{name}'>Articles by {RealName}</a>
</txp:smd_query>

Matching one or more spaces versus your original zero or more seemed to do the trick. I might have missed some subtleties though, as I don’t have many users in my test installation. P.S. adding debug="1" to your smd_query tag will show you the query that the plugin is running, which can sometimes highlight weirdness and any quoting / doSlash() issues.

That said, I was hoping you’d be able to use smd_query’s url features to do this. Using urlfilter="/[\s]+/" urlreplace="(.*)?" ought to work. You’d benefit from the url param being automatically run through doSlash inside the plugin too which means you could just use ?q directly in your where attribute. Sadly, a slight oversight in the code means it won’t work that way — even in my in-development v0.60 at present.

The problem is that using q as a url param is ambiguous because Txp has an internal variable inside $pretext with the same name. Since url params are one of the lowest order citizens inside smd_query, they get processed last. Thus you can filter the param all you like, as soon as the smd_query parser sees pretext['q'] it uses that value instead of your processed url value, negating all the filtering you put in place. Drat.

I’ll need to fix that at some point so thanks, in a roundabout fashion, for highlighting this bug through your excellent bending of the plugin!


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Hire Txp Builders – finely-crafted code, design and Txp

Offline

#296 2014-10-15 15:24:35

johnstephens
Plugin Author
From: Woodbridge, VA
Registered: 2008-06-01
Posts: 1,000
Website

Re: smd_query: Talk to the database directly via SQL

Bloke wrote #284809:

I’m not entirely sure, but I think perhaps your from is wrong. When I tried it, the replacement string was inserted between every character.

Yes, I noticed that yesterday, I just didn’t have time to update the thread. (Making my assumption explicit in the above post prompted me to actually test it.) Thank you so much for the fix! It works like magic, and I added a space after the question mark in the to attribute of pax_grep as in Ruud’s example above.

Thank you both for your valiant support!

Last edited by johnstephens (2014-10-15 15:24:47)

Offline

#297 2014-10-15 16:20:57

candyman
Member
From: Italy
Registered: 2006-08-08
Posts: 684

Re: smd_query: Talk to the database directly via SQL

I’ve tried EJW’s search engine: impressive.
It would be an awesome TXP tip.

Last edited by candyman (2014-10-15 16:21:08)

Offline

#298 2015-08-11 18:09:01

jakob
Admin
From: Germany
Registered: 2005-01-20
Posts: 5,218
Website GitHub

Re: smd_query: Talk to the database directly via SQL

Hi Stef,

I think I have run into a problem with the calculation of the total {smd_allrows} when using the limit and offset attributes. Then again it could be a pebcak :-)

I’m using smd_query to query a non-txp database and using rah_swap to temporarily switch databases for the query. I’ve built a little filtering system with WHERE-query chunks and it works great except for pagination. smd_query’s own pageform won’t work in this case because smd_query tries to retrieve it from the txp_form table, which is not present in the temporarily switched over db-connection (non-txp). I realise there’s no way around that.

However, you handily provide {smd_allrows}, {smd_rows}, {smd_pages}, {smd_thispage}, and {smd_cursor} in the normal output, so I collect those once with the first record and save them in txp variables which can then be used to manually construct a numbered pagination list outside of the smd_query tag. The values for “items_per_page” and “page” attributes are passed as url variables (or else default values are used) and the corresponding values for the limit attribute (= items_per_page) and start offset row (= (page - 1) * items_per_page) are calculated and passed to smd_query as variables. I’m using the attribute pagevar="page" to track a custom paging urlvar and I’m not using mysql LIMIT to otherwise control output.

This is the basic tag setup (excuse the complex query):

<txp:variable name="matches_total" value="" />
<txp:rah_swap link="members_list">
  <txp:smd_query 
    query='SELECT LUM_User.UserID, LUM_User.RoleID, LUM_UserExtended.Title, LUM_UserExtended.FirstName, LUM_UserExtended.LastName, LUM_UserExtended.CompanyURL, LUM_UserExtended.Company, LUM_UserExtended.Postcode, LUM_UserExtended.City, LUM_UserExtended.Country, LUM_UserExtended.ShortDescription, LUM_UserExtended.Category, LUM_UserExtended.CheckCertificate, LUM_UserExtended.Certificate
  FROM `LUM_User`
  INNER JOIN `LUM_UserExtended`
  ON LUM_User.UserID=LUM_UserExtended.UserID
  WHERE
  (
       1 = 1
       <txp:variable name="query_name" />
       <txp:variable name="query_bereich" />
       <txp:variable name="query_plz" />
  )
  AND `RoleID` IN (30,31,40,41)
  <txp:variable name="query_sort" />'
    wraptag=""
    break=""
    breakclass=""
    html_id=""
    class=""
    label=""
    labeltag=""
    limit='<txp:variable name="query_limit" />'
    offset='<txp:variable name="query_offset" />'
    pagevar="page"
    debug="2"
    >
    … my output code …

    <txp:hide>+++ on first record, store pagination infos +++</txp:hide>
    <txp:if_variable name="matches_total" value="">
      <txp:variable name="matches_total">{smd_allrows}</txp:variable>
      <txp:variable name="matches_thispage_start">{smd_cursor}</txp:variable>
      <txp:variable name="matches_thispage">{smd_rows}</txp:variable>
      <txp:variable name="pages_total">{smd_pages}</txp:variable>
      <txp:variable name="pages_thispage">{smd_thispage}</txp:variable>
    </txp:if_variable>
  </txp:smd_query>
</txp:rah_swap>

Using debug="2" I see my query and all 244 matching records.

I used the txpvars to calculate the pagination and it is initially computed correctly and works nicely, however…

… as I page through the results, the value for {smd_allrows} reduces by the number of the row-offset, e.g. with {smd_rows} = 15, {smd_allrows} returns 229 on page 2, 214 on page 3 and so on for the corresponding items.
That causes the pagination to recalculate, but also curtails the output – at least I think so: If I have offset=“230” limit=“15” that should show the last 14 items but it returns none. Meanwhile, the debug info still lists all 244 matching records. I don’t see the {smd_…} replacement values in the debug output.

As soon as I remove pagevar="page", I can correctly see the entries, e.g. offset=“230” limit=“15” works and shows me the last 14 records, but all values are calculated from that point onwards, listed as 1 to 14 (presumably as smd_query no longer knows which page it is on, it assumes it is page 1). Now the {smd_…} infos do show in the debug output.

Do I have a thinking mistake? Or is the $replacements[] array items doing something different to the $repagement[] array items used for the page form.

I can probably work around this by tracking my own paging and calculating offset manually, but smd_query is so powerful, it’s more likely I’ve made a thinking mistake. Any ideas?


TXP Builders – finely-crafted code, design and txp

Offline

#299 2015-08-23 09:09:40

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 12,497
Website GitHub

Re: smd_query: Talk to the database directly via SQL

jakob wrote #294085:

I have run into a problem with the calculation of the total {smd_allrows} when using the limit and offset attributes.

Entirely probable. Thank you for the detailed rundown. I’ll try and replicate this on my server using your data and see if it can be fixed or worked around.


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Hire Txp Builders – finely-crafted code, design and Txp

Offline

#300 2016-02-12 10:24:15

sacripant
Plugin Author
From: Rhône — France
Registered: 2008-06-01
Posts: 479
Website

Re: smd_query: Talk to the database directly via SQL

Hi Steph,

Just a idea.
Sometime I create Json with your plugin, and it’s not super convenient.
Perhaps the plug-in could do that directly ? Create a new tag for that ?

<txp:smd_query_to_json query="SELECT * from txp_users" />

@+

Offline

Board footer

Powered by FluxBB