Go to main content

Textpattern CMS support forum

You are not logged in. Register | Login | Help

#1 2011-11-26 02:01:34

mericson
Member
Registered: 2004-05-24
Posts: 137
Website

SQL Performance of Textpattern queries

I’m noticing recent terrible performance on my Textpattern site, and have tracked it down to SQL queries generated by Textpattern. The worst performing query is generated for the tag <txp:article_custom limit=“6” section=“blog”>

There are ~700 rows in my textpattern table, and it looks like the explain is suggesting it will use the correct indices. A query of the entire table select * from textpattern; takes 0.0128sec, but the query generated by textpattern takes 6-11 seconds (I have seen an occasional 2-second query). Even a select * from textpattern order by Posted desc; is subsecond.

FYI, the Textpattern query is the following?

select *, unix_timestamp(Posted) as uPosted, unix_timestamp(Expires) as uExpires, unix_timestamp(LastMod) as uLastMod from textpattern where 1=1 and Status = 4 and Posted <= now() and (now() <= Expires or Expires = ’0000-00-00 00:00:00’) and Section IN (‘blog’) order by Posted desc limit 0, 6

Another query that is slow, and probably shouldn’t be, is the queries generated for <txp:article form=“home” listform=“home” /> which I saw take ~ 5-seconds !

Any suggestions on figuring out this problem? Is this a Textpattern schema issue, or possibly an issue with my hosting provider’s database. If the latter, any suggestions on working with them? They may not be very sympathetic suggesting the issue is with the schema/indices/code rather than their database.

Thanks,

Mark

Offline

#2 2011-11-26 02:25:09

Gocom
Developer Emeritus
From: Helsinki, Finland
Registered: 2006-07-14
Posts: 4,533
Website

Re: SQL Performance of Textpattern queries

While some of the queries Textpattern does are not perfect (for example the ones you mentioned, can not be cached, use wildcards, and might have potential time/timezone/leap second issues), the queries shouldn’t take that long at all — while not perfect, they are still very fine. Those queries you posted, should be very, very fast no matter how many rows the table has. Both queries select just few rows, and used where clauses are constructed well.

If you want, you could post your tag trace from the problematic page. With the tag trace we could make sure the problem comes from the queries, and see if there are other problem areas.

If the queries take that long, there is something wrong with the server or setup. Either too high load for the available resources, too low restrictions or something is broken (or braking). Also going way too finger happy with Textpattern’s custom fields can cause issues (especially on high traffic sites/low memory servers). Like for example creating too many longtext fields and then storing 500 pages long books in them. I would first make sure that the table(s) is fine condition (by running check/optimize/repair to textpattern table, via phpmyadmin/etc or cli), and posting your tag trace would allow rest of us to take closer look.

Last edited by Gocom (2011-11-26 02:34:52)

Offline

#3 2011-11-26 04:42:18

mericson
Member
Registered: 2004-05-24
Posts: 137
Website

Re: SQL Performance of Textpattern queries

Thanks, the tag trace is rather long so didn’t post the whole thing. I found the worst performing tags (for SQL), the rest don’t look too bad.

With further testing the performance issue seems intermittent, so I suspect the shared database with my hosting provider is overburdened.

However, burdened or not, that query does seem to be significantly slower than a query for all rows on the same table.

Offline

#4 2011-11-26 15:49:19

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: SQL Performance of Textpattern queries

Perhaps you’re missing an index on the main textpattern table. I did a quick test on a TXP install that has 2500+ articles, most of them in the category ‘news’. Selecting all rows is slower here:

SELECT *
FROM `txp_textpattern`
WHERE 1
LIMIT 0 , 9999

Status 	Time
starting 	0.000020
checking query cache for query 	0.000041
Opening tables 	0.000016
System lock 	0.000007
Table lock 	0.000024
init 	0.000042
optimizing 	0.000010
statistics 	0.000016
preparing 	0.000014
executing 	0.000005
Sending data 	0.068973
end 	0.000019
query end 	0.000006
freeing items 	0.000048
logging slow query 	0.000006
cleaning up 	0.000007
Showing rows 0 - 2564 (2,565 total, Query took 0.0696 sec)
SELECT * , unix_timestamp( Posted ) AS uPosted, unix_timestamp( Expires ) AS uExpires, unix_timestamp( LastMod ) AS uLastMod
FROM txp_textpattern
WHERE 1 =1
AND STATUS =4
AND Posted <= now( )
AND (
now( ) <= Expires
OR Expires = '0000-00-00 00:00:00'
)
AND Section
IN (
'news'
)
ORDER BY Posted DESC
LIMIT 0 , 6

Status 	Time
starting 	0.000018
checking query cache for query 	0.000099
Opening tables 	0.000017
System lock 	0.000007
Table lock 	0.000014
init 	0.000096
optimizing 	0.000047
statistics 	0.000432
preparing 	0.000034
executing 	0.000006
Sorting result 	0.000045
Sending data 	0.000546
end 	0.000007
query end 	0.000005
freeing items 	0.000025
logging slow query 	0.000004
cleaning up 	0.000005
Showing rows 0 - 5 (6 total, Query took 0.0017 sec) [Posted: 2011-11-22 01:58:44 - 2011-11-16 03:33:26]

In PHPmyAdmin, you can issue a query and once the result is shown, try checking the ‘profile’ option. You’ll get the detailed overview as shown above.

These are the indexes I have set on the textpattern table:

Keyname	Type	Field
PRIMARY	BTREE	ID	
Posted	BTREE	Posted	
categories_idx	BTREE	Category1 (10), Category2 (10)	
section_status_idx	BTREE	Section	
Expires_idx	BTREE	Expires			
author_idx	BTREE	AuthorID	
searching	FULLTEXT		Title, Body

PS. There was NO optimizing done at all in either TXP or the MySQL setup (which is set up to use very little RAM) to achieve this result.

Last edited by ruud (2011-11-26 15:55:15)

Offline

#5 2011-11-27 06:01:02

mericson
Member
Registered: 2004-05-24
Posts: 137
Website

Re: SQL Performance of Textpattern queries

The following are my indices…

Keyname Type Cardinality Field
PRIMARY PRIMARY 693 ID
categories_idx INDEX 27
Category1 10
Category2 10
Posted INDEX 693 Posted
section_status_idx INDEX 25
Section
Status
Expires_idx INDEX 3 Expires
author_idx INDEX 2 AuthorID
searching FULLTEXT 693
Title
Body

Offline

#6 2011-11-27 12:53:40

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: SQL Performance of Textpattern queries

Interesting. I see my TXP install doesn’t have an index on Section and Status, just on Section… but that appears to be non-standard, since more recent installs have it just like your install.

What does profiling show and/or have you tried the “explain” function to see how MySQL sees this query?

Have you tried variations on the slow query to see which part is making it slow?

For example: if the query below is much faster (it lacks the ‘order’ part), then the problem may be in a memory limit set in the MySQL configuration, causing it to use a temporary table on disk to order the results instead of doing it in memory (which still shouldn’t take 5 seconds, IMHO, but perhaps you’re on a server which is pushing disk IO limits).

select *, unix_timestamp(Posted) as uPosted, unix_timestamp(Expires) as uExpires, unix_timestamp(LastMod) as uLastMod from textpattern where 1=1 and Status = 4 and Posted <= now() and (now() <= Expires or Expires = '0000-00-00 00:00:00') and Section IN ('blog') limit 0, 6

The fact that query times vary a lot, is probably more related to server setup and usage in a shared environment. If it were in the database design, I’d expect less variation in query times.

Offline

#7 2011-11-27 15:20:05

mericson
Member
Registered: 2004-05-24
Posts: 137
Website

Re: SQL Performance of Textpattern queries

Explain tells me that the the ‘order by Posted desc’ uses filesort. There is an index on Posted, in an innodb database. MySQL shouldn’t have to use filesort, should it?

I also looked at the MySQL stats, and it appears that someone on my shared MySQL instance has lots of tablescans and ‘slow_queries’ going on… that can’t be helping me.

Offline

#8 2011-11-27 19:27:10

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: SQL Performance of Textpattern queries

filesort doesn’t necessarily involve sorting using a file.

Offline

#9 2011-11-27 19:38:53

mericson
Member
Registered: 2004-05-24
Posts: 137
Website

Re: SQL Performance of Textpattern queries

ruud wrote:

filesort doesn’t necessarily involve sorting using a file.

Yes, I understand, it can be in memory. But what it does indicate is that the sorting isn’t being done using the index!

Offline

#10 2011-11-27 20:14:42

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: SQL Performance of Textpattern queries

I think your knowledge of MySQL exceeds mine ;)

Offline

#11 2011-11-28 08:11:15

merz1
Member
From: Hamburg
Registered: 2006-05-04
Posts: 994
Website

Re: SQL Performance of Textpattern queries

Mark You can simply trust a vanilla Textpattern to be very very fast even without any optimization.

TXP is only the frontend for PHP and the PHP/MySQL queries. Check the PHP configuration, the network connectivity and the MySQL server.

Shared hosting packages can become very very slow very very fast.


Get all online mentions of Textpattern via OPML subscription: TXP Info Sources: Textpattern RSS feeds as dynamic OPML

Offline

#12 2011-11-30 00:35:58

Vienuolis
Member
From: Vilnius, Lithuania
Registered: 2009-06-14
Posts: 328
Website GitHub GitLab Mastodon Twitter

Re: SQL Performance of Textpattern queries

I am running a TxP site on my own server — quite robust, clear, and still almost empty rack machine. With the same problem: from 0.2 sec runtime for an individual page to 10 sec for section, category, query pages. And with no debugging errors, no communication lags. On.lt.

Offline

#13 2011-11-30 06:09:07

Gocom
Developer Emeritus
From: Helsinki, Finland
Registered: 2006-07-14
Posts: 4,533
Website

Re: SQL Performance of Textpattern queries

Vienuolis wrote:

I am running a TxP site on my own server — quite robust, clear, and still almost empty rack machine. With the same problem: from 0.2 sec runtime for an individual page to 10 sec for section, category, query pages. And with no debugging errors, no communication lags. On.lt.

Well, you are running 129 queries and have some big code heavy loops on the page (i.e. category_lists, multiple article tags). Big reason for slowness I see is that you are collecting all keywords from all articles 32 times, caused by tru_tags.

Instead of calculating tag weights for each article, you could just list the keywords from the current article. That total 1 second query time is coming from tru_tags alone. One <txp:tru_tags_from_article /> takes 0.03 in queries alone (as it seems to pick all keywords from every article for calculations).

One small things is that you seem to fetch host names of your visitors. That slows the page down too. You could turn Use DNS? off in Advanced Preferences. Also you could empty (if any) Spam blacklists (comma-separated) preference. Checking spamlists from the blacklist providers slows individual article pages.

Last edited by Gocom (2011-11-30 06:12:25)

Offline

#14 2011-11-30 08:25:14

merz1
Member
From: Hamburg
Registered: 2006-05-04
Posts: 994
Website

Re: SQL Performance of Textpattern queries

From tru_tags instructions #tru_tags_from_article

Be careful, however, before turning this (useoverallcounts=“1”) on. This attribute causes tru_tags to do an extra database query for each article displayed on a page. That extra query is equivalent to the query used to generate the overall cloud. This can cause a significant load increase on your server.

Last edited by merz1 (2011-11-30 08:26:09)


Get all online mentions of Textpattern via OPML subscription: TXP Info Sources: Textpattern RSS feeds as dynamic OPML

Offline

#15 2011-12-02 00:31:20

Vienuolis
Member
From: Vilnius, Lithuania
Registered: 2009-06-14
Posts: 328
Website GitHub GitLab Mastodon Twitter

Re: SQL Performance of Textpattern queries

I am sorry, I am late. Jukka, Markus, thanks a lot for your advices. I am trying to adopt TxP for quite complex, large-scale scientific publishing, and I see the great potential of Textpattern.

Instead of calculating tag weights for each article, you could just list the keywords from the current article.

Hm, I have set <txp:tru_tags_from_article /> without useoverallcounts, with no attributes at all. I expect the weight calculation in a tag cloud only — by generating a subject index, on a separate webpage for <txp:tru_tags_cloud />. What is the reason of keywords (tags), if not linking them to its indexes, without subject indexing? I just do not understand the need for SQL queering despite of context — I expect for query only by clicking the tag, since keywords are already specified by an author of every article. BTW, an overall tag cloud loads quite quickly, although it is relatively big.

You could turn Use DNS? off in Advanced Preferences. Also you could empty (if any) Spam blacklists

I have triggered DNS, with no significant effect. And comments are switched off completely in favour of Disq.us.

If I did not miss something important, the main improvement of Textpattern for large-scale publishing would flat-file disk cache implementing, IMHO.

Offline

Board footer

Powered by FluxBB