Go to main content

Textpattern CMS support forum

You are not logged in. Register | Login | Help

#1 2020-10-08 10:56:22

etc
Developer
Registered: 2010-11-11
Posts: 5,523
Website GitHub

RFC: extracting distinct values

Many popular questions on the forum can be formulated as ‘how to extract distinct values from articles lists’. For example, you might want to output all authors having published in a given section during the last year.

The typical solution is using if_different tag (I omit break etc for simplicity):

<txp:article_custom section="articles" time="-1 year" sort="AuthorID">
    <txp:if_different>
        <txp:author />
    </txp:if_different>
</txp:article_custom>

This works, but is not optimal, because

  • even if only one author was active, the parser must loop through all matched articles, consuming time;
  • while we need only AuthorID here, all article fields will be retrieved from db, unnecessarily consuming memory and time;
  • one can not control the number of authors via limit and offset attributes, which will act on articles instead.

Whence my proposal (actually already implemented in 4.8.4, but can yet be revoked): add an attribute that will serve to extract only desired fields from the db, ensuring that the extracted rows are distinct. Currently this attribute is called fields and works like this:

<txp:article_custom section="articles" time="-1 year" fields="authorid">
    <txp:author />
</txp:article_custom>

You can test it on demo site. If judged useful, I will explain the feature more in depth.

Offline

#2 2020-10-08 11:07:34

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,535
Bitbucket GitHub

Re: RFC: extracting distinct values

+1 for useful. I won’t be able to use it in anger right now, but I can see a need for a future project I’m spec’ing out.

Offline

#3 2020-10-08 13:44:45

Pat64
Plugin Author
From: France
Registered: 2005-12-12
Posts: 1,664
GitHub Twitter

Re: RFC: extracting distinct values

+1


Patrick.

Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.

Offline

#4 2020-10-08 15:06:41

Myusername
Member
Registered: 2019-12-12
Posts: 165

Re: RFC: extracting distinct values

+1

Offline

#5 2020-10-09 11:18:42

etc
Developer
Registered: 2010-11-11
Posts: 5,523
Website GitHub

Re: RFC: extracting distinct values

As by-product, including thisid in fields list allows for processing all matching articles (since thisid is unique), but only with given fields populated from db. This makes menus constructions less time/memory expensive, since we don’t need to extract articles body etc:

<txp:article_custom fields="thisid, title, section, url_title">
    <txp:permlink><txp:title /></txp:permlink>
</txp:article_custom>

This can become even more important in the custom-fields branch, where the custom fields are stored in separate tables.

Offline

#6 2020-10-09 15:59:04

Pat64
Plugin Author
From: France
Registered: 2005-12-12
Posts: 1,664
GitHub Twitter

Re: RFC: extracting distinct values

Gorgeous!

Is that affect only article (& article custom) tags?


Patrick.

Github | CodePen | Codier | Simplr theme | Wait Me: a maintenance theme | [\a mi.ni.ma]: a “Low Tech” simple Blog theme.

Offline

#7 2020-10-09 19:34:30

etc
Developer
Registered: 2010-11-11
Posts: 5,523
Website GitHub

Re: RFC: extracting distinct values

Pat64 wrote #326283:

Is that affect only article (& article custom) tags?

Yes, at the moment, but it should be straightforward to extend it to other content types.

Offline

#8 2020-10-09 20:44:00

etc
Developer
Registered: 2010-11-11
Posts: 5,523
Website GitHub

Re: RFC: extracting distinct values

Here is a little syntactic question to answer.

Suppose we want to output recently active authors with their latest article publish dates. The straightforward

<txp:article_custom time="-1 year" fields="authorid" sort="Posted DESC">
    <txp:author /> (<txp:posted />)
</txp:article_custom>

will not work, since posted field is not populated. If we add it

<txp:article_custom time="-1 year" fields="authorid, posted" sort="Posted DESC">
    <txp:author /> (<txp:posted />)
</txp:article_custom>

all distinct (authorid, posted) pairs will be output, not only the most recent one. Currently the problem is solved via

<txp:article_custom time="-1 year" fields="authorid, posted=max" sort="Posted DESC">
    <txp:author /> (<txp:posted />)
</txp:article_custom>

The question is: is posted=max syntax clear enough? What would you suggest otherwise?

Offline

#9 2020-10-10 09:59:26

zero
Member
From: Lancashire
Registered: 2004-04-19
Posts: 1,470
Website

Re: RFC: extracting distinct values

max isn’t so clear to me. Perhaps late and early?


Dozy P My music
Gud One My blog

Offline

#10 2020-10-10 10:03:25

etc
Developer
Registered: 2010-11-11
Posts: 5,523
Website GitHub

Re: RFC: extracting distinct values

zero wrote #326310:

max isn’t so clear to me. Perhaps late and early?

No :-) because it should be applicable to any field, not only dates, e.g. max(comments_count).

Offline

#11 2020-10-10 10:06:59

zero
Member
From: Lancashire
Registered: 2004-04-19
Posts: 1,470
Website

Re: RFC: extracting distinct values

etc wrote #326311:

No :-) because it should be applicable to any field, not only dates, e.g. max(comments_count).

The only other one I can think of is fresh and stale


Dozy P My music
Gud One My blog

Offline

#12 2020-10-10 18:07:05

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,535
Bitbucket GitHub

Re: RFC: extracting distinct values

etc wrote #326298:

all distinct (authorid, posted) pairs will be output, not only the most recent one. Currently the problem is solved via

<txp:article_custom time="-1 year" fields="authorid, posted=max" sort="Posted DESC">...

The question is: is posted=max syntax clear enough? What would you suggest otherwise?

If only the most recent value is desired, could limit be deployed in article_custom? The fields pair is defined, the sort gets the most recent and limit does its thing…or am I missing something?

Offline

Board footer

Powered by FluxBB