Making plugins first-class citizens

Bloke · 2015-05-09 11:46:44

etc wrote #290597:

<abc::function />

Gets my vote.

ruud · 2015-05-09 12:11:47

etc wrote #290597:

splat() is called by processTags() on each iteration, so I meant caching the preg_match_all result

I don’t think that mixes well with the ability to have parsed attribute values, because you’d cache the result of the first time you parse it, while in the current implementation, the parsed value can vary each time.

etc · 2015-05-09 12:22:15

ruud wrote #290600:

I don’t think that mixes well with the ability to have parsed attribute values, because you’d cache the result of the first time you parse it, while in the current implementation, the parsed value can vary each time.

No, not values: we’d cache only arrays like ('section'=>'<txp:section />', ... ) and still process <txp:section /> on each iteration. It only spares the repeated preg_match_all calls.

ruud · 2015-05-09 13:03:59

etc wrote #290601:

No, not values: we’d cache only arrays like ('section'=>'<txp:section />', ... ) and still process <txp:section /> on each iteration. It only spares the repeated preg_match_all calls.

Parsing depends on the quoting type used, so you’d have to store somewhere which attributes in that array have to be parsed. Alternatively, you could choose not to cache when one of the attributes requires parsing.

Some benchmarks (these don’t include calls to parse()). Times shown are average times in microseconds.

Leftmost column: how many times the same attribute set recurs.
First timing column: original splat()
Second timing column: splat() caching attributes, except when one or more need parsing
Third timing column: splat() caching attributes, including when parsing is needed

$attribs = '   ';
1	1.5732	2.8905	2.9502
2	1.4850	2.1611	2.2236
5	1.4505	1.6318	1.7352
10	1.4230	1.5409	1.6043
50	1.4353	1.3568	1.4323
100	1.4302	1.3264	1.4120
1000	1.4197	1.3085	1.3962

Since empty attribute sets are very common, I think it’s safe to say that the difference here is close to zero.

$attribs = 'key="value"';
1	5.1557	6.6670	6.8195
2	5.1697	4.1464	4.2235
5	5.0831	2.4120	2.5232
10	5.0107	1.8785	1.9464
50	5.0365	1.4596	1.5477
100	4.9929	1.3663	1.4497
1000	4.9951	1.3785	1.4525

$attribs = 'some="thing" and="other" thing="here" bool="1"';
1	13.6018	15.0676	15.1140
2	13.3633	8.3727	8.4256
5	13.4126	4.1424	4.1995
10	13.3064	2.7207	2.7986
50	13.1960	1.5853	1.6812
100	13.2398	1.4421	1.5210
1000	13.1873	1.3191	1.4341

$attribs = 'some="thing"  and=\'<txp:body>\'';
1	9.3086	10.3602	11.5734
2	9.1794	10.2371	6.4980
5	9.3709	10.4906	3.4871
10	9.0563	10.2729	2.4346
50	9.0690	10.0472	1.6399
100	9.0445	10.1068	1.5272
1000	9.0088	10.1408	1.4700

As you can see, you lose time if attributes are unique (mainly due to the sha1() call, which costs around 1 microsecond). But you definitely gain time if attribute sets occur multiple times. The question now is: does the reduction in time for non-unique attribute sets outweigh the loss of time for unique sets and the added code complexity?

Ideally, the first timing column (original splat()) would show exactly the same value in every row of a given attribute set. The fluctuations you see come from other programs claiming CPU time, and the slow decrease in the lower rows comes from the resetting of variables, which happens less often there.
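
For anyone who wants to reproduce per-call figures like the sha1() cost mentioned above, here is a minimal timing sketch. It is not the harness used for the tables in this post: the helper name average_microseconds() and the simplified regex are assumptions made for illustration, and the closure call adds a small overhead of its own to every measurement.

```php
<?php
// Hypothetical helper (not part of Textpattern): average wall-clock cost
// of one call to $fn, in microseconds, measured over $iterations calls.
function average_microseconds(callable $fn, $iterations)
{
	$start = microtime(true);

	for ($i = 0; $i < $iterations; $i++) {
		$fn();
	}

	// microtime(true) returns seconds as a float; convert to microseconds.
	return (microtime(true) - $start) / $iterations * 1e6;
}

$attribs = 'some="thing" and="other" thing="here" bool="1"';

// Cost of hashing the attribute string (the cache key computation).
$cost_sha1 = average_microseconds(function () use ($attribs) {
	sha1($attribs);
}, 100000);

// Cost of tokenising it with a simplified, double-quotes-only pattern.
$cost_regex = average_microseconds(function () use ($attribs) {
	preg_match_all('@(\w+)\s*=\s*"([^"]*)"@', $attribs, $m, PREG_SET_ORDER);
}, 100000);

printf("sha1: %.4f us, preg_match_all: %.4f us\n", $cost_sha1, $cost_regex);
```

Absolute numbers will differ per machine and PHP version, so only the ratio between the two measurements is meaningful.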

etc · 2015-05-09 17:13:05

ruud wrote #290603:

Parsing depends on the quoting type used, so you’d have to store somewhere which attributes in that array have to be parsed. Alternatively, you could choose not to cache when one of the attributes requires parsing.

Ah yes, we should set some do_parse flag in the array instead of the strpos check.
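
A minimal sketch of that idea, for illustration only. This is not Textpattern's actual splat(): the function name splat_cached(), the simplified regex (quoted values only) and the $parser callback standing in for parse() are all assumptions. It also assumes that only single-quoted values containing '<txp:' need parsing. The tokenised attribute list is cached per attribute string, while flagged values are still parsed on every call, so their result can vary between iterations.

```php
<?php
// Hypothetical sketch: cache the tokenised attributes, never the parsed values.
function splat_cached($text, callable $parser)
{
	// sha1 of the attribute string => list of array(name, raw value, do_parse).
	static $cache = array();

	$key = sha1($text);

	if (!isset($cache[$key])) {
		$tokens = array();
		// Simplified pattern: name="value" or name='value' only.
		$pattern = '@(\w+)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')@s';

		if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
			foreach ($matches as $m) {
				// Group 3 is only present for single-quoted values.
				$single = isset($m[3]);
				$value = $single ? $m[3] : (isset($m[2]) ? $m[2] : '');
				// Decide once, at cache time, whether this value needs parsing.
				$do_parse = $single && strpos($value, '<txp:') !== false;
				$tokens[] = array(strtolower($m[1]), $value, $do_parse);
			}
		}

		$cache[$key] = $tokens;
	}

	// Rebuild the attribute array, parsing flagged values on every call.
	$atts = array();

	foreach ($cache[$key] as $token) {
		list($name, $value, $do_parse) = $token;
		$atts[$name] = $do_parse ? $parser($value) : $value;
	}

	return $atts;
}
```

With this shape, preg_match_all runs once per distinct attribute string, and the strpos check moves from every call to the first call only.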

Some benchmarks (these don’t include calls to parse()). Times shown are average times in microseconds.

…

As you can see, you lose time if attributes are unique (mainly due to the sha1() call, which costs around 1 microsecond). But you definitely gain time if attribute sets occur multiple times. The question now is: does the reduction in time for non-unique attribute sets outweigh the loss of time for unique sets and the added code complexity?

Speaking of no-attribute tags, we could even do the check in processTags() to spare the splat() call:

	if (maybe_tag($tag))
	{
		$out = $tag(ltrim($atts) === '' ? array() : splat($atts), $thing);
	}

Now, parsing a few dozen one-off tags takes no time anyway; I wouldn’t care about them even if we lose 10-15%. The main targets of these optimizations, imo, are constructions like

<txp:article_custom limit="999">
	<txp:if_variable name="flag" value="">...</txp:if_variable>
</txp:article_custom>

And here you get 400% acceleration without real effort.

colak · 2015-05-09 17:22:28

etc wrote #290607:

Now, parsing a few dozen one-off tags takes no time anyway; I wouldn’t care about them even if we lose 10-15%. The main targets of these optimizations, imo, are constructions like

<txp:article_custom limit="999">...

I use that one:)

ruud · 2015-05-09 19:58:57

To get some real world data, what I would love to see is some examples of which tags (and attributes) are used within such monstrous limit=999 constructs and what the runtime and query times are (as shown in the HTML source while in testing mode). I’ll give you one of mine:

<txp:if_different><txp:posted format="%b %d, %Y" /></txp:if_different>
<txp:permlink><txp:title /></txp:permlink>

Runtime: 0.3308
Query time: 0.056466

Using the splat() optimisations discussed here would save 5ms, which is only 18% faster (not 400%). However, this affects only a single page on the website involved, which has 2500+ other pages which do not contain any looping tags and would therefore load slightly slower.

etc · 2015-05-09 22:09:38

Nobody says your site will be 4x faster, only splat(). But, -15% of 0.33s (and another -15% from parse() optimization) on heavy pages is more perceptible than +2% of 0.1s. And it costs almost nothing, just a little extra memory.

Edit: here was my real life example. And who cares about the real life, anyway :)

Last edited by etc (2015-05-09 22:13:41)

ruud · 2015-05-09 22:40:36

Um… something is not right. “-dev” is the current development branch, “-parser” is my parser branch, “-splat” adds the splat optimisations to my parser branch.

http://undented.com/
TXP 4.5.7: Runtime: 0.1231 / Query time: 0.101193 / Queries: 46 / Memory: 881Kb
TXP 4.6-dev: Runtime: 0.1137 / Query time: 0.089166 / Queries: 46 / Memory: 919Kb
TXP 4.6-parser: Runtime: 0.1136 / Query time: 0.090757 / Queries: 46 / Memory: 954Kb
TXP 4.6-splat: Runtime: 0.1147 / Query time: 0.093998 / Queries: 46 / Memory: 966Kb

Not much difference here. In all cases, roughly 0.023 seconds are required excluding query time. Note the increase in memory usage.

http://undented.com/us?author=woj (hitting the 1000 articles limit)
TXP 4.5.7: Runtime: 0.3582 / Query time: 0.061736 / Queries: 29 / Memory: 1409Kb
TXP 4.6-dev: Runtime: 0.4825 / Query time: 0.070246 / Queries: 29 / Memory: 1451Kb
TXP 4.6-parser: Runtime: 0.4411 / Query time: 0.068155 / Queries: 29 / Memory: 1487Kb
TXP 4.6-splat: Runtime: 0.3846 / Query time: 0.061738 / Queries: 29 / Memory: 1495Kb

Looks like 4.6-dev somehow got a lot slower compared to 4.5.7. The optimisations I added help a little bit, but it’s still a lot worse than 4.5.7. WTF happened in -dev?!

etc · 2015-05-10 09:17:28

Strange figures, indeed, if run on the same server. Still, there is a clear 25% gain in (runtime minus query time) in the second group, at the price of only 45Kb. It’s even surprising that regex parsing takes so much time; we should check whether it’s used in some other loops.
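
To make the "runtime minus query time" comparison easy to redo, here is a small arithmetic sketch using the figures ruud posted for the 1000-article page. The array layout and variable names are assumptions for illustration. On these numbers, the gain from 4.6-dev to 4.6-splat comes out at about 22%.

```php
<?php
// Figures copied from the undented.com/us?author=woj measurements above:
// build => array(runtime, query time), both in seconds.
$runs = array(
	'4.5.7'      => array(0.3582, 0.061736),
	'4.6-dev'    => array(0.4825, 0.070246),
	'4.6-parser' => array(0.4411, 0.068155),
	'4.6-splat'  => array(0.3846, 0.061738),
);

// Time spent outside of database queries, per build.
$non_query = array();

foreach ($runs as $build => $times) {
	$non_query[$build] = $times[0] - $times[1];
	printf("%-11s %.4f s outside queries\n", $build, $non_query[$build]);
}

// Relative gain of the splat branch over the development branch, in percent.
$gain = ($non_query['4.6-dev'] - $non_query['4.6-splat']) / $non_query['4.6-dev'] * 100;
printf("4.6-dev -> 4.6-splat: %.1f%% less non-query time\n", $gain);
```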

Edit: if you use the default 4.6 article form, it might be heavier than your 4.5.7 form and explain the difference.

Last edited by etc (2015-05-10 09:29:02)

ruud · 2015-05-10 11:35:34

I ran this on the exact same server with an identical database. Only difference is the TXP version.

I found part of it.
322ms with the registry code in processTags
284ms without that registry code.
So that’s about 40ms. But the difference between 4.5.7 and 4.6-dev was 116ms (excluding query time). So we’re losing more elsewhere.

etc · 2015-05-10 11:55:28

ruud wrote #290622:

registry code in processTags

Something to cache, in my view.

Another 25% <txp:article /> acceleration: reduce the number of calls to user-defined functions. If, in doArticles() (v. 4.5.7), I replace

while($a = nextRows($rs))

with

while($a = mysql_fetch_assoc($rs)) // and mysql_free_result($rs); after the loop

and cache column maps in populateArticleData() (not sure it will work for 4.6 custom fields):

function populateArticleData($rs)
{
	global $thisarticle, $production_status;
	// Build the column map once per request instead of once per article.
	static $column_map;
	if (empty($column_map)) {
		$column_map = article_column_map();
	}

	if ($production_status === 'debug') {
		trace_add("[".gTxt('Article')." {$rs['ID']}]");
	}
	// Copy each mapped database column into the current article context.
	foreach ($column_map as $key => $column) {
		$thisarticle[$key] = $rs[$column];
	}
}

I gain 25% on ~300 articles. Edit: probably the result of MySQL caching, though.

Last edited by etc (2015-05-10 12:13:21)

Textpattern CMS

Textpattern CMS support forum
