Textpattern CMS support forum
You are not logged in. Register | Login | Help
- Topics: Active | Unanswered
SQL query for similar (enought) articles
Hi folks ;)
Because I need to retrieve some correspondences from similar (enough) titles, I want to use a query based on the first 2 or 3 words:
global $thisarticle;
// Keep only 2 or 3 first words (depending) into the current article's title
$title = strtok( $thisarticle['title'], ' ' ).' '.strtok(' ');
// Try a query to find similar enough article into an another section
$rs = safe_row('Title, Status, Section, url_title', 'textpattern', "1=1 and Title like '".$title."_' and Section = 'other-section' and Status = '4'");
But, unfortunatly I haven’t any results.
Could you please correct me? Tks.
Last edited by Pat64 (2015-06-11 15:38:19)
Patrick.
Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.
Offline
Re: SQL query for similar (enought) articles
Assuming your tokenization works (which you can verify by issuing dmp($title)
, try adding the wildcard %
character instead of the single-character wildcard (underscore):
... and Title like '".$title."%' and Section = 'other-section' ...
The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.
Txp Builders – finely-crafted code, design and Txp
Offline
Re: SQL query for similar (enought) articles
Thanks Stef for your reply !
The article tokenization works well but unfortunately the query still fails…
WTF.
Patrick.
Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.
Offline
Re: SQL query for similar (enought) articles
Hmm, I haven’t tried it out so I’m not sure why offhand. Try adding , 1
just before the closing bracket of the safe_row()
call. That will dump out the query that the code is using. If you take that and paste it into phpMyAdmin’s Query tool and run it you’ll be able to see if it’s outputting anything at all. If not, then you can at least tweak the query there to try and figure out why.
Once you’ve got it going we can explore ways to make the query safe from injection attacks… but let’s get it working first.
The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.
Txp Builders – finely-crafted code, design and Txp
Offline
Re: SQL query for similar (enought) articles
OK.
Thanks lot for your advice, Stef.
I’ll try tomorow morning and give you the result.
;)
Patrick.
Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.
Offline
Re: SQL query for similar (enought) articles
Ok. I found the problem and a kind of solution.
First of all, the article title contains changed signs in use by some languages as the French one: quotes '
(this default sign on keyboards) are transformed by a ’
sign (which is better).
Second, because I couldn’t get good results with the like
query argument, I tried REGEXP
instead with much better succes.
So here is the new query:
$rs = safe_row('Title, Status, Section, url_title', 'textpattern', "1=1 and Title REGEXP '".$title."' and Section = 'other-section' and Status = '4'");
which have good results.
Unfortunatly, I use 2 str_replace
PHP functions in order to sanitize my part (first 2 or 3 words) of the global $thisarticle['title']
Is there a better choice?
Last edited by Pat64 (2015-06-12 07:06:22)
Patrick.
Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.
Offline
Re: SQL query for similar (enought) articles
Pat64 wrote #291505:
I use 2
str_replace
PHP functions
You can do it one str_replace()
if you pass arrays in:
str_replace(
array('1st_needle', '2nd_needle'),
array('1st_replacement', '2nd_replacement'),
$haystack
);
The arrays can be of any length, with any ‘missing’ replacements that don’t have a corresponding needle being set to the empty string.
EDIT: but see what makss wrote below, as there’s more to this than just replacing characters.
Last edited by Bloke (2015-06-12 10:38:19)
The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.
Txp Builders – finely-crafted code, design and Txp
Offline
Re: SQL query for similar (enought) articles
Pat64 wrote #291505:
First of all, the article title contains changed signs in use by some languages as the French one: quotes
'
(this default sign on keyboards) are transformed by a’
sign (which is better).
Always use mysql_real_escape_string()
or safe_escape()
in Txp.
$title = safe_escape($title); // before sql query
Get first 2-3-4… words from string. (Change {2}
and specify the required number of words)
$q = "qq111 ww222 e33333 444 5555 666"; if( preg_match('%^((.*? ){2})(.*)$%su', $q, $mm) ){ print_r($mm); }
Array ( [0] => qq111 ww222 e33333 444 5555 666 [1] => qq111 ww222 // first 2 words [2] => ww222 [3] => e33333 444 5555 666 // other part of string )
aks_cron : Cron inside Textpattern | aks_article : extended article_custom tag
aks_cache : cache for TxP | aks_dragdrop : Drag&Drop categories (article, link, image, file)
Offline
Re: SQL query for similar (enought) articles
makss wrote #291509:
Always use
mysql_real_escape_string()
orsafe_escape()
in Txp
Makss’ solution for extracting the words is far better. But try to avoid mysql_real_escape_string()
unless painted into a corner and you have no choice. It’s deprecated. Using safe_escape()
or other Txp core functions will at least permit your code to migrate seamlessly to PDO when that bit of the core is written, as the burden is on us devs to provide a short-term upgrade path.
We may well deprecate such functions too in favour of class methods at some point, but our functions are likely to be around a lot longer than PHP keeps its mysql_*()
calls.
The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.
Txp Builders – finely-crafted code, design and Txp
Offline
Re: SQL query for similar (enought) articles
Ok guys. I found the problem.
First of all, I changed this time consuming REGEXP
by LIKE
into my query (10 times faster):
$rs = safe_row('Title, Status, Section, url_title', 'textpattern', "1=1 and Title like '%".$title."%' and Section = 'other-section' and Status = '4'");
Second, simple '
quote signs are changed by utf-8 '
ones from my article’s title. Because MySQL stores only plain text, I can’t get the corresponding result from my query:
$title = safe_escape($thisarticle['title']);
Could you tell me please, the best choice to transform my article’s title into plain MySQL friendly text?
Thank you by advance.
Notice: all str_replace
PHP function applied on this text do not get good results…
Last edited by Pat64 (2015-06-15 07:29:04)
Patrick.
Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.
Offline
Re: SQL query for similar (enought) articles
Pat64 wrote #291546:
simple
'
quote signs are changed by utf-8'
ones from my article’s title. Because MySQL stores only plain text, I can’t get the corresponding result from my query:
$title = safe_escape($thisarticle['title']);...
Singe quotes are transformed into '
when you call <txp:title />
, in $thisarticle['title']
they should be ok. The problem must come from somewhere else. You can try to apply htmlspecialchars_decode
to be sure:
$title = safe_escape(htmlspecialchars_decode($thisarticle['title'], ENT_QUOTES));
Offline
Re: SQL query for similar (enought) articles
Thanks lot etc ;)
Nope: '
are still here and htmlspecialchars_decode()
function didn’t do nothing…
Last edited by Pat64 (2015-06-15 08:36:49)
Patrick.
Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.
Offline