Textpattern CMS support forum

#1 2010-01-02 13:14:29

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,433

smd_xml : extract data from XML feeds

The first plugin of the new decade arrives. This one is a kind of generic XML processor. Give it a URL (e.g. a feed) that returns well-formed XML and then use the plugin to filter out stuff from the various records.

  • Include or exclude any node information
  • Automatically extract XML attribute data and manipulate it
  • Use a Form or the plugin container to restructure / output data you have extracted
  • Add custom pagination to your document to allow visitors to step through your data

It uses the by-now-familiar {replacement tag} syntax that wet pioneered and I stole, so you can reformat the information from the XML document for your own purposes. Although untested, an interesting experiment might be to grab feed data from somewhere on the web and use smd_query in the container to INSERT parts of the data into your TXP database. As far as I know there are no restrictions on what you can and can’t grab from the XML document, so if you can see it, you can muck about with it.
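
For anyone wondering what the {replacement tag} idea boils down to, here is a minimal sketch in Python. It is not the plugin's code (the plugin is PHP and does far more); the function name `render_records`, its parameters and the sample feed are all made up for illustration. It finds every record node, pulls out the named fields and drops them into a template:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for whatever a real URL would return.
FEED = """<rss><channel>
  <item><title>First post</title><link>http://example.com/1</link></item>
  <item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""


def render_records(xml_text, record, fields, template):
    """Fill {field} placeholders in `template` once per `record` node."""
    root = ET.fromstring(xml_text)
    output = []
    for node in root.iter(record):
        row = template
        for field in fields:
            # Missing fields become empty strings rather than raising.
            value = node.findtext(field, default="")
            row = row.replace("{" + field + "}", value)
        output.append(row)
    return output


rendered = render_records(
    FEED, "item", ["title", "link"], '<a href="{link}">{title}</a>'
)
print("\n".join(rendered))
```

The template string plays the role of the Form or container content: anything in curly braces is swapped for the matching field of the current record.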

Download the plugin and get XMLing.

As ever, post any thoughts, improvements, bugs, praise, or flaming pitchforks here and I’ll tend to the village.

Happy New Year!

Revision history
————————

All available versions and changes are listed here. Each entry points to the relevant post(s) in the thread, where you can learn about the features.

  • 08 Oct 2019 | 0.4.3 | Register missed conditional tags with parser.
  • 08 Oct 2019 | 0.4.2 | Maintenance update for PHP5+ and Textpattern 4.6+
  • 06 Oct 2014 | 0.4.1 | Add support for customisable headers (thanks johnno)
  • 03 Apr 2012 | 0.4.0 | Improved feed support and tag detection for more varied / complicated feeds ; added XML-over-FTP support (thanks aslsw66) ; added SOAP transport facility, transport_opts and transport_config attributes ; added XSL and regex transform support ; allowed sub->field support and added match, ontagstart, ontagend and load_atts for finer control over field extraction ; added datawrap, var_prefix and timeout attributes ; added record attribute support (thanks Mats) ; fixed mangled date field bug ; fixed attributes-in-record-entry limit bug and undesired ontag output (both thanks tye) ; changed format’s escape attribute to fordb (escape is now for htmlspecialchars()) ; added kill_spaces so inter-tag whitespace removal is optional (but highly recommended) ; added tag_delim (thanks MattD)
  • 17 Jan 2010 | 0.3.0 | Enabled URL params to be passed in the data attribute ; added format (thanks photonomad) ; deprecated linkify ; param_delim default is now pipe
  • 13 Jan 2010 | 0.2.2 | Added line_length (thanks nardo)
  • 05 Jan 2010 | 0.2.1 | Added defaults, set_empty and transport ; fixed https support (thanks photonomad)
  • 03 Jan 2010 | 0.2.0 | Added cached data (thanks variaas) ; added pagination and limit/offset ; added linkify (thanks Jaro)
  • 02 Jan 2010 | 0.1.0 | Initial release

Last edited by Bloke (2019-10-08 15:08:14)


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

#2 2010-01-02 14:58:32

variaas
Plugin Author
From: Chicago
Registered: 2005-01-16
Posts: 402

Re: smd_xml : extract data from XML feeds

Looks awesome – any plans for caching capabilities? Pinging Twitter every page load can become excessive.

#3 2010-01-02 19:09:57

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,433

Re: smd_xml : extract data from XML feeds

variaas wrote:

any plans for caching capabilities?

Hadn’t thought about it, but it makes sense. Leave it with me.

#4 2010-01-02 21:11:23

LetterHoofd
Member
From: Kortrijk, BE
Registered: 2006-01-20
Posts: 40

Re: smd_xml : extract data from XML feeds

This plugin looks very promising. Might be a true timesaver.

#5 2010-01-03 01:56:14

jan
Member
From: Utrecht, The Netherlands
Registered: 2006-08-31
Posts: 71

Re: smd_xml : extract data from XML feeds

I was looking for exactly this when I stumbled upon it as the newest submission, awesome :)
Tried it out really quickly, and it seems to work very well!

One feature that may be useful: a limiter on the number of entries you want to import.
That way you could, for example, show the last 3 posts from some blog on your site.

If you agree, then perhaps an offset parameter is possible too? (Since that could be added without much extra effort).

If the things I’m suggesting are already possible, slap me, it’s already 3 AM.. :-)

Anyway, good job!

Last edited by jan (2010-01-03 01:56:54)


Kensington TXP powered rock

#6 2010-01-03 02:04:22

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,433

Re: smd_xml : extract data from XML feeds

jan wrote:

I was looking for exactly this when I stumbled upon it as the newest submission, awesome :)

Excellent, glad it’s potentially useful.

… a limiter on the number of entries you want to import.

Hehe, are you a mind reader? I’m adding limit, offset and paging features to the plugin right now so you can step through the records if you want :-) It’s a bit of a cheat because you can’t very easily grab part of an XML document, but it seems to work.
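
To show what I mean by "a bit of a cheat", here is a rough Python sketch of the general approach: parse the whole document, then slice the list of records. This is only an illustration under my own assumptions, not the plugin's PHP code; `page_of_records` and its parameter names are hypothetical.

```python
import xml.etree.ElementTree as ET


def page_of_records(xml_text, record, limit, offset=0, page=1):
    """Return the `record` nodes for one page of results.

    The whole document is parsed first; limit/offset/page are then
    applied to the resulting list, not to the XML itself.
    """
    root = ET.fromstring(xml_text)
    nodes = list(root.iter(record))
    start = offset + (page - 1) * limit
    return nodes[start:start + limit]


# Hypothetical feed with seven entries, ids 1 to 7.
FEED = "<feed>" + "".join(
    f"<entry><id>{n}</id></entry>" for n in range(1, 8)
) + "</feed>"

# Skip the first entry, show three per page, ask for page 2.
ids = [
    entry.findtext("id")
    for entry in page_of_records(FEED, "entry", limit=3, offset=1, page=2)
]
print(ids)  # ['5', '6', '7']
```

The cost is that every page view still reads the full feed, which is one more reason caching is worth having.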

Variaas’ cache capability is coded and working already, so watch this space…
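
For the curious, the general idea behind time-based caching looks something like the Python sketch below. It is an assumption-laden illustration rather than the plugin's actual code: the `TimedCache` class, its `max_age_seconds` parameter and the fake fetch function are all invented for the example.

```python
import time


class TimedCache:
    """Keep one fetched value and only re-fetch once it is too old."""

    def __init__(self, fetch, max_age_seconds, clock=time.time):
        self._fetch = fetch            # callable that retrieves fresh data
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the demo needs no sleeping
        self._value = None
        self._stamp = None             # time of the last successful fetch

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._max_age:
            self._value = self._fetch()
            self._stamp = now
        return self._value


# Demo with a fake clock and a fetch function that counts its calls.
calls = []


def fake_fetch():
    calls.append(1)
    return "<feed/>"


now = [0.0]
cache = TimedCache(fake_fetch, max_age_seconds=300, clock=lambda: now[0])
cache.get()
cache.get()        # served from the cache, no second fetch
now[0] = 301.0
cache.get()        # older than 300 seconds, so fetch again
print(len(calls))  # 2
```

Three page loads cause only two trips to the remote feed here; in practice the cached copy would live somewhere persistent rather than in memory.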

it’s already 3 AM

Only 2am here: the night is young… :-)

#7 2010-01-03 02:13:57

jan
Member
From: Utrecht, The Netherlands
Registered: 2006-08-31
Posts: 71

Re: smd_xml : extract data from XML feeds

Haha, maybe I imported an xml feed of your thoughts?
Anyway, I “fell with my nose in the butter”, as the Dutch say.
Curious for 0.2! :D

Last edited by jan (2010-01-03 02:14:41)

#8 2010-01-03 11:30:14

Jaro
Member
From: S/F
Registered: 2004-11-18
Posts: 89

Re: smd_xml : extract data from XML feeds

Great plugin!

Would there be a way to define a date/time format in the output? Also it would be great if all links in the output would be automatically clickable (e.g. bit.ly links used in Twitter).

#9 2010-01-03 11:36:43

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,433

Re: smd_xml : extract data from XML feeds

Jaro wrote:

Great plugin!

Ta. Documenting the next version today…

Would there be a way to define a date/time format in the output?

Not yet. I did think it would be useful, but couldn’t think of a decent way of doing it. Any ideas how to best specify it without making it too messy or clumsy?

it would be great if all links in the output would be automatically clickable (e.g. bit.ly links used in Twitter).

Saw your other post and thought it was a good idea, so I’m workin’ on it!
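
Roughly, making links clickable means finding anything that looks like a URL and wrapping it in an anchor. A minimal Python sketch of that idea follows; it is not how the plugin does it, and the `linkify` function and `URL_PATTERN` regex here are simplified assumptions (the pattern will, for instance, swallow a full stop that directly follows a URL).

```python
import html
import re

# Deliberately simple: http(s) followed by anything that is not
# whitespace, an angle bracket or a quote character.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def linkify(text):
    """Escape text for HTML, then wrap each http(s) URL in an anchor."""
    escaped = html.escape(text, quote=False)
    return URL_PATTERN.sub(
        lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', escaped
    )


linked = linkify("New plugin out: http://bit.ly/abc123 & more")
print(linked)
```

Escaping before substituting means stray angle brackets or ampersands in the feed text cannot break the surrounding markup.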

#10 2010-01-03 11:49:42

Jaro
Member
From: S/F
Registered: 2004-11-18
Posts: 89

Re: smd_xml : extract data from XML feeds

Bloke wrote:

Not yet. I did think it would be useful, but couldn’t think of a decent way of doing it. Any ideas how to best specify it without making it too messy or clumsy?

No, not really, sorry. I’m not really good with php.

I noticed one thing. When I activate this plugin I get a warning:

Warning: Call-time pass-by-reference has been deprecated in C:\wamp\www\mysite\textpattern\lib\txplib_misc.php(594) : eval()'d code on line 182

I’m running 4.2.0 (r3275) on localhost.

#11 2010-01-03 11:55:58

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,433

Re: smd_xml : extract data from XML feeds

Jaro wrote:

Warning: Call-time pass-by-reference has been deprecated

What version of PHP are you running? Probably something I should look into, thanks for the heads up.

#12 2010-01-03 12:10:57

Jaro
Member
From: S/F
Registered: 2004-11-18
Posts: 89

Re: smd_xml : extract data from XML feeds

Bloke wrote:

What version of PHP are you running? Probably something I should look into, thanks for the heads up.

I’m running PHP 5.2.9-1. Let me know if I can help you any further to debug this.

Last edited by Jaro (2010-01-03 12:11:08)
