Go to main content

Textpattern CMS support forum

You are not logged in. Register | Login | Help

#1 2020-10-08 10:56:22

etc
Developer
Registered: 2010-11-11
Posts: 3,934
Website

RFC: extracting distinct values

Many popular questions on the forum can be formulated as ‘how to extract distinct values from articles lists’. For example, you might want to output all authors having published in a given section during the last year.

The typical solution is using if_different tag (I omit break etc for simplicity):

<txp:article_custom section="articles" time="-1 year" sort="AuthorID">
    <txp:if_different>
        <txp:author />
    </txp:if_different>
</txp:article_custom>

This works, but is not optimal, because

  • even if only one author was active, the parser must loop through all matched articles, consuming time;
  • while we need only AuthorID here, all article fields will be retrieved from db, unnecessarily consuming memory and time;
  • one can not control the number of authors via limit and offset attributes, which will act on articles instead.

Whence my proposal (actually already implemented in 4.8.4, but can yet be revoked): add an attribute that will serve to extract only desired fields from the db, ensuring that the extracted rows are distinct. Currently this attribute is called fields and works like this:

<txp:article_custom section="articles" time="-1 year" fields="authorid">
    <txp:author />
</txp:article_custom>

You can test it on demo site. If judged useful, I will explain the feature more in depth.

Offline

#2 2020-10-08 11:07:34

gaekwad
Admin
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 3,169

Re: RFC: extracting distinct values

+1 for useful. I won’t be able to use it in anger right now, but I can see a need for a future project I’m spec’ing out.

Offline

#3 2020-10-08 13:44:45

Pat64
Plugin Author
From: France
Registered: 2005-12-12
Posts: 1,423
Website

Re: RFC: extracting distinct values

+1


Patrick.

Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.

Offline

#4 2020-10-08 15:06:41

Myusername
Member
Registered: 2019-12-12
Posts: 73

Re: RFC: extracting distinct values

+1

Offline

#5 2020-10-09 11:18:42

etc
Developer
Registered: 2010-11-11
Posts: 3,934
Website

Re: RFC: extracting distinct values

As by-product, including thisid in fields list allows for processing all matching articles (since thisid is unique), but only with given fields populated from db. This makes menus constructions less time/memory expensive, since we don’t need to extract articles body etc:

<txp:article_custom fields="thisid, title, section, url_title">
    <txp:permlink><txp:title /></txp:permlink>
</txp:article_custom>

This can become even more important in the custom-fields branch, where the custom fields are stored in separate tables.

Offline

#6 2020-10-09 15:59:04

Pat64
Plugin Author
From: France
Registered: 2005-12-12
Posts: 1,423
Website

Re: RFC: extracting distinct values

Gorgeous!

Is that affect only article (& article custom) tags?


Patrick.

Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.

Offline

#7 2020-10-09 19:34:30

etc
Developer
Registered: 2010-11-11
Posts: 3,934
Website

Re: RFC: extracting distinct values

Pat64 wrote #326283:

Is that affect only article (& article custom) tags?

Yes, at the moment, but it should be straightforward to extend it to other content types.

Offline

#8 2020-10-09 20:44:00

etc
Developer
Registered: 2010-11-11
Posts: 3,934
Website

Re: RFC: extracting distinct values

Here is a little syntactic question to answer.

Suppose we want to output recently active authors with their latest article publish dates. The straightforward

<txp:article_custom time="-1 year" fields="authorid" sort="Posted DESC">
    <txp:author /> (<txp:posted />)
</txp:article_custom>

will not work, since posted field is not populated. If we add it

<txp:article_custom time="-1 year" fields="authorid, posted" sort="Posted DESC">
    <txp:author /> (<txp:posted />)
</txp:article_custom>

all distinct (authorid, posted) pairs will be output, not only the most recent one. Currently the problem is solved via

<txp:article_custom time="-1 year" fields="authorid, posted=max" sort="Posted DESC">
    <txp:author /> (<txp:posted />)
</txp:article_custom>

The question is: is posted=max syntax clear enough? What would you suggest otherwise?

Offline

#9 2020-10-10 09:59:26

zero
Member
From: Lancashire
Registered: 2004-04-19
Posts: 1,370
Website

Re: RFC: extracting distinct values

max isn’t so clear to me. Perhaps late and early?


Wondrous Healing Wondrous ways to a healthy old age
Safe Reiki Harmless natural healing.
Gud One Blog

Offline

#10 2020-10-10 10:03:25

etc
Developer
Registered: 2010-11-11
Posts: 3,934
Website

Re: RFC: extracting distinct values

zero wrote #326310:

max isn’t so clear to me. Perhaps late and early?

No :-) because it should be applicable to any field, not only dates, e.g. max(comments_count).

Offline

Board footer

Powered by FluxBB