Textpattern CMS support forum

#16 2009-05-28 05:25:26

wet
Developer Emeritus
From: Vöcklabruck, Austria
Registered: 2005-06-06
Posts: 3,416
Website GitHub Mastodon

Re: Full page caching in TXP core

merz1 wrote:

[…]

  • Better cache control for plug-ins via core hooks.
  • Cache integrity: A single cache control center via hooks. Example: Every plug-in using external sources (or embedded PHP applications) eg SimplePie, the RSS parser, could read a ‘purge signal’ to delete the RSS cache.
  • A forced cache option for pages and single articles: Save and write to cache, save and don’t cache.
  • More granularity: Cache this section, don’t cache that section.
  • User cache: Own cache content/directory for logged-in users.
  • Fast preview in edit mode: Save, cache & view draft versions much faster. (The draft preview mode is awesome but the time it takes to save & view the article sometimes sucks.)
  • Collision handling for plug-ins. Example: A plug-in reading referrers and rendering parts of the page depending on a referrer could throw a warning (admin side).
  • Different cache lifetime cycles. Example: Cached CSS files only need to be updated by a forced refresh or by saving a new version.

With just one user’s (well thought-out) specs, we already have a massive feature set with a multitude of configuration options. OTOH, it’s a niche requirement so I’d rather see this as an optional component than as an extra mandatory burden for small applications.

From my POV, implementing signals and hooks into the core to send and receive abstract caching hints to/from plugins would be the way to go. For the sending part, all instances of update_lastmod() might serve as a starting point. Any concrete implementation should be added by plugins.
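
A minimal sketch of the signal/hook idea, written in Python purely for illustration (Textpattern itself is PHP). The names `CacheHints`, `register`, `emit` and the `content_changed` event are hypothetical and not an existing Textpattern API. The assumption is that core would emit an abstract hint wherever it currently calls update_lastmod(), and that plugins decide what to do with it.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class CacheHints:
    """Tiny publish/subscribe registry for abstract caching hints (hypothetical)."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def register(self, event: str, callback: Callable[[dict], None]) -> None:
        # A plugin subscribes to an event name it cares about.
        self._listeners[event].append(callback)

    def emit(self, event: str, **details) -> int:
        # Core sends the hint; returns how many listeners were notified.
        for callback in self._listeners[event]:
            callback(details)
        return len(self._listeners[event])


hints = CacheHints()
purged = []

# A hypothetical cache plugin reacts to the hint by purging its own store.
hints.register("content_changed", lambda d: purged.append(d.get("section")))

# Core would emit this wherever it touches the last-modified timestamp.
notified = hints.emit("content_changed", section="articles")
```

Core stays ignorant of any concrete cache here: it only announces that something changed, and with no plugin registered the hint simply goes nowhere.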

Offline

#17 2009-05-28 11:37:12

merz1
Member
From: Hamburg
Registered: 2006-05-04
Posts: 994
Website

Re: Full page caching in TXP core

Thanks Robert! I never said “mandatory” :)

I would like to get feedback on the individual points on the list. I think that there are not too many additional feature requests.

From my POV, implementing signals and hooks into the core to send and receive abstract caching hints to/from plugins would be the way to go. For the sending part, all instances of update_lastmod() might serve as a starting point. Any concrete implementation should be added by plugins.

I agree that TXP core should have ‘signals and hooks’.

Q: I have no idea if my main full page cache feature request ‘look for HTML files via .htaccess rule first’ can be accomplished by a plug-in?
Q: Same question for ‘cache this, don’t cache that’?

(If plug-in solution:) It would be great if the TXP core changes implement a check for a present (active cache) plug-in including the option to control the on/off state.

  1. Cache is active and can be switched off
  2. Cache is active and can be purged
  3. Cache is deactivated and can be switched on
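
The three cases above can be read as a small state table. The following is only a sketch in Python, with `CacheState`, `ALLOWED_ACTIONS` and `apply_action` as made-up names; it is not how Textpattern or asy_jpcache handle this.

```python
from enum import Enum


class CacheState(Enum):
    ACTIVE = "active"
    DEACTIVATED = "deactivated"


# Allowed admin actions per state, mirroring the three cases listed above.
ALLOWED_ACTIONS = {
    CacheState.ACTIVE: {"switch_off", "purge"},
    CacheState.DEACTIVATED: {"switch_on"},
}


def apply_action(state: CacheState, action: str) -> CacheState:
    """Return the new state, or raise if the action is not offered in this state."""
    if action not in ALLOWED_ACTIONS[state]:
        raise ValueError(f"{action!r} is not available while cache is {state.value}")
    if action == "switch_off":
        return CacheState.DEACTIVATED
    if action == "switch_on":
        return CacheState.ACTIVE
    return state  # "purge" empties the cache but leaves it active
```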

A new full page cache (or partial page cache) solution, whether implemented as a plug-in or not, should at least ensure that users don’t have to edit core files like the main index.php. This seems to be the biggest burden with asy_jpcache on new TXP installations or after TXP updates. A second benefit would be that diagnostics no longer throw a ‘changed file’ warning.


Get all online mentions of Textpattern via OPML subscription: TXP Info Sources: Textpattern RSS feeds as dynamic OPML

Offline

#18 2009-05-28 14:52:45

hcgtv
Archived Plugin Author
From: Key Largo, Florida
Registered: 2005-11-29
Posts: 2,722
Website

Re: Full page caching in TXP core

artagesw wrote:

Bert, what switches would you pass to wget to tell it to skip any pages with interactive content (such as a contact form)?

You can use the --reject switch to avoid certain patterns in file names – Types of Files.
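
As a rough illustration, here is a small Python helper that assembles such a wget call. The function name, the example URL, the cache directory and the reject patterns are all assumptions for the sketch; only the wget options themselves (`--mirror`, `--no-host-directories`, `--directory-prefix`, `--reject`) are real.

```python
from typing import List, Sequence


def build_wget_command(site_url: str, cache_dir: str, reject: Sequence[str]) -> List[str]:
    """Assemble a wget argument list for mirroring a site into cache_dir."""
    command = [
        "wget",
        "--mirror",                      # recursive download with time-stamping
        "--no-host-directories",         # do not create a directory named after the host
        f"--directory-prefix={cache_dir}",
    ]
    if reject:
        # --reject takes a comma-separated list of file name suffixes or patterns.
        command.append("--reject=" + ",".join(reject))
    command.append(site_url)
    return command
```

The resulting list could be handed to `subprocess.run(command, check=True)`. Whether a pattern such as `contact*` really catches every interactive page depends on how the site names its URLs, so treat the patterns as something to test per site.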

Offline

#19 2009-05-28 17:20:04

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: Full page caching in TXP core

hcgtv wrote:

You can use the --reject switch to avoid certain patterns in file names – Types of Files.

Cool. That looks fairly powerful, actually.

Offline

#20 2009-05-28 20:05:33

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: Full page caching in TXP core

So let’s say I wanted to set up an automated self-updating static cache using wget. How would I set that up?

The main issue I see is how to keep the static cache fresh.

Let’s say you set up a cron job to wget the entire site every 10 minutes.

1. If wget is just hitting the main URL for the site in order to build the static cache, then wget would itself retrieve files from the static cache once it has been created. (It would end up getting files served from the cache just as any browser would.) So, it would never update existing files. In other words, how does the cache become invalidated?

2. Let’s say we address (1) by simply blowing away the entire static cache prior to each wget (a crude but effective solution). Now, how often should the cron job run? A 10 minute interval would mean on average, a stale version of a changed page would be served for 5 minutes before being refreshed. Running it more frequently could put undesirable extra load on the web server.

3. This is clearly a lot less efficient than an intelligent caching system that can invalidate/update just a single page that has changed, and do so instantly.

4. Also, it’s a lot less user-friendly (harder to manage, etc.) than a built-in caching solution would be.
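
The trade-off in point 2 can be put into numbers. This is a back-of-the-envelope sketch in Python; it assumes a change lands at a uniformly random moment within the interval and ignores how long the rebuild itself takes. The function name is made up.

```python
def staleness_minutes(interval_minutes: float) -> dict:
    """Staleness of a changed page when the whole cache is rebuilt every interval.

    Assumes changes arrive at a uniformly random moment within the interval
    and ignores the time the rebuild itself takes.
    """
    return {
        "average": interval_minutes / 2,
        "worst_case": interval_minutes,
        "rebuilds_per_day": 24 * 60 / interval_minutes,
    }
```

For a 10 minute interval this gives the 5 minute average mentioned above, at the price of 144 full site crawls per day; halving the staleness doubles the crawl count.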

So, although I agree you could put together a workable “poor man’s cache” with the above techniques, and it might work fine for some (smallish) sites, I don’t think it approaches the utility of a fully baked-in solution.

Just my 2 cents.

Offline

#21 2009-05-28 23:38:02

hcgtv
Archived Plugin Author
From: Key Largo, Florida
Registered: 2005-11-29
Posts: 2,722
Website

Re: Full page caching in TXP core

artagesw wrote:

So let’s say I wanted to set up an automated self-updating static cache using wget. How would I set that up?

Write a script and run it from a plugin in the admin area. The publisher of the site would trigger the wget script when the site is changed by either adding or changing an article or by modifying the look and feel.

You could also create a plugin that would monitor certain changes in the backend and would itself be the trigger, but what if you need to create an article and also tweak the CSS, how would the plugin know when you’re done with your changes?

Another option is to run the script via a cron job every 10 minutes and have it check for the existence of a file in a directory. If the file exists, the site was changed, so run wget and wipe out the file when done.
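
A minimal sketch of that flag-file check in Python. The function name and the idea of passing the rebuild step in as a callable are assumptions made for the example; the rebuild itself would be whatever wget script you end up with.

```python
import os
from typing import Callable


def rebuild_if_flagged(flag_path: str, rebuild: Callable[[], None]) -> bool:
    """Run rebuild() only when the flag file exists; return True if it ran."""
    if not os.path.exists(flag_path):
        return False  # nothing changed since the last run
    rebuild()             # e.g. invoke the wget script here
    os.remove(flag_path)  # clear the flag once the rebuild has finished
    return True
```

One caveat with removing the flag only at the end: a change saved while the rebuild is still running would not trigger a further rebuild until the next change comes along.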

1. In other words, how does the cache become invalidated?

Check out how wget uses time-stamping, it checks the timestamp of the html file in cache against what your site returns via the Last-Modified header. See Send “Last-Modified” header? in the advanced preferences.
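
The timestamp comparison boils down to something like the following Python sketch. It is a simplification, not wget's actual code: wget also looks at file sizes, and `needs_refresh` is a made-up name.

```python
import os
from email.utils import parsedate_to_datetime


def needs_refresh(cached_file: str, last_modified_header: str) -> bool:
    """Return True when the server copy is newer than the cached file."""
    if not os.path.exists(cached_file):
        return True  # nothing cached yet
    remote_time = parsedate_to_datetime(last_modified_header).timestamp()
    local_time = os.path.getmtime(cached_file)
    return remote_time > local_time
```

This only works if the site actually sends a Last-Modified header, which is why the preference mentioned above has to be switched on.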

So, although I agree you could put together a workable “poor man’s cache” with the above techniques, and it might work fine for some (smallish) sites, I don’t think it approaches the utility of a fully baked-in solution.

Yes, it doesn’t come close to a full core solution but it could help a site sustain a Digg effect.

Offline

#22 2009-05-28 23:51:10

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: Full page caching in TXP core

hcgtv wrote:

Check out how wget uses time-stamping, it checks the timestamp of the html file in cache against what your site returns via the Last-Modified header. See Send “Last-Modified” header? in the advanced preferences.

OK, so you are assuming that the “public facing” domain for the site is being served directly from the cache directory, and wget is fetching the underlying Textpattern site via a private IP? Otherwise, wget is not going to see any Last-Modified headers. If Apache is serving pages from the cache (via mod_rewrite rules), then wget is also going to be fed from the cache – not from Textpattern…

Offline

#23 2009-05-29 00:15:22

hcgtv
Archived Plugin Author
From: Key Largo, Florida
Registered: 2005-11-29
Posts: 2,722
Website

Re: Full page caching in TXP core

artagesw wrote:

OK, so you are assuming that the “public facing” domain for the site is being served directly from the cache directory, and wget is fetching the underlying Textpattern site via a private IP?

Yes, you have to point wget at your dynamic Textpattern site so it can build the static pages. Whatever means you may employ, private IP, subdomain for live site or using .htaccess to detect wget and feed it the live site while serving static site to every other user agent.

We can also make a special user agent for wget, should you want to be sure to feed the live site to your own wget process as opposed to someone using wget to leech your site.
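
The decision itself is trivial; here it is as a Python sketch. The agent string `txp-cache-builder/1.0` and the function name are invented for the example. On the wget side the string would be set with its `--user-agent` option, and in practice the check would live in a rewrite rule rather than in Python.

```python
def choose_backend(user_agent: str, builder_agent: str = "txp-cache-builder/1.0") -> str:
    """Return "dynamic" for the cache builder's own requests, else "static"."""
    if user_agent == builder_agent:
        return "dynamic"  # let our own wget run see freshly rendered pages
    return "static"       # every other client, including stock wget, gets the cache
```

Keep in mind that a user agent string is easy to fake, so this keeps casual leechers on the static copy but is not an access control.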

Offline

#24 2009-05-29 00:30:34

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: Full page caching in TXP core

hcgtv wrote:

Yes, you have to point wget at your dynamic Textpattern site so it can build the static pages. Whatever means you may employ, private IP, subdomain for live site or using .htaccess to detect wget and feed it the live site while serving static site to every other user agent.

OK, so now the issue of interactive pages creeps back in. If the public domain is pointing at the static site, and we have configured wget to ignore certain interactive pages (forms and such), then how does the user reach those pages? Mod-rewrite on a per-URL basis?

Offline

#25 2009-05-29 06:31:34

hcgtv
Archived Plugin Author
From: Key Largo, Florida
Registered: 2005-11-29
Posts: 2,722
Website

Re: Full page caching in TXP core

artagesw wrote:

OK, so now the issue of interactive pages creeps back in. If the public domain is pointing at the static site, and we have configured wget to ignore certain interactive pages (forms and such), then how does the user reach those pages? Mod-rewrite on a per-URL basis?

Mod-rewrite on a per-URL basis sounds about right, if a user tries to access mysite.com/contact, then the dynamic site is used, wget would be denied access to the contact form.
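
As a sketch of the per-URL decision, in Python for readability (the real thing would be rewrite conditions in .htaccess). The function name and the default prefix list are assumptions; `/contact` is taken from the example above.

```python
from typing import Iterable


def is_dynamic_path(path: str, dynamic_prefixes: Iterable[str] = ("/contact",)) -> bool:
    """True when the request path should bypass the static cache."""
    # Drop the query string and any trailing slash before comparing.
    clean = path.split("?", 1)[0].rstrip("/") or "/"
    return any(clean == p or clean.startswith(p + "/") for p in dynamic_prefixes)
```

Matching on whole path segments keeps `/contacts` on the static side while `/contact` and anything below it goes to the dynamic site.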

We keep throwing ideas back and forth, and I’d like to apologize to Markus for keeping this hypothetical wget-based full page caching idea afloat.

Offline

#26 2009-05-29 07:12:33

artagesw
Member
From: Seattle, WA
Registered: 2007-04-29
Posts: 227
Website

Re: Full page caching in TXP core

Yeah, didn’t mean to hijack his thread. But I think we’re developing some useful ideas here that he or someone else may be able to put to good use. :)

Offline

#27 2009-05-29 14:34:00

hcgtv
Archived Plugin Author
From: Key Largo, Florida
Registered: 2005-11-29
Posts: 2,722
Website

Re: Full page caching in TXP core

artagesw wrote:

Yeah, didn’t mean to hijack his thread. But I think we’re developing some useful ideas here that he or someone else may be able to put to good use. :)

Yeah, I was thinking along the same lines. Whatever full page mechanism is put in place, it will have to deal with similar issues. I haven’t had the time but when I get a break, I’m going to play around with wget against one of my Textpattern sites.

Offline

#28 2009-05-29 22:43:41

merz1
Member
From: Hamburg
Registered: 2006-05-04
Posts: 994
Website

Re: Full page caching in TXP core

Guys, as much as I adore your creative superpowers regarding the creation of an FTP-driven website hijacker, could you please (PLEASE) create your own playground?

PS: Pssssst, there are many existing solutions out there … online and offline software … but may I suggest you throw a discreet look at some web 0.1 directory like Yahoo?

Edit: The dilemma of disconnection

I admit you made a valid point: If interactive pages are cached as static HTML files to a cache directory and some insane soul would like to use those static files during a maintenance break, then we have the dilemma of disconnected pages, forms, AJAX, etc.

How can TXP solve this situation?

Some little ‘code injection’ which checks for connectivity and throws a message if the site is in ‘static maintenance mode’?

Last edited by merz1 (2009-05-29 22:52:04)


Offline

#29 2009-07-23 12:54:26

merz1
Member
From: Hamburg
Registered: 2006-05-04
Posts: 994
Website

Re: Full page caching in TXP core

Alternative / Inspiration / Workaround

As the web-optimizator offers auto-updates, wouldn’t an optional integration into Textpattern core be good? What do the devs say?


Offline

#30 2009-07-23 15:24:11

wet
Developer Emeritus
From: Vöcklabruck, Austria
Registered: 2005-06-06
Posts: 3,416
Website GitHub Mastodon

Re: Full page caching in TXP core

No.

Offline
