Textpattern CMS support forum
Sites outage, January 17th 2020
Overview
Textpattern sites were offline from 0630 to 1015 UTC today. They are now back online.
I am investigating when they went down, and why. This thread will be updated as I find out more.
Last edited by gaekwad (2020-01-17 10:40:22)
Re: Sites outage, January 17th 2020
When I checked the PHP 7.3 error log, it was a zero-byte file. The php.ini file, which is reconfigured to use a custom error log, was set incorrectly, and as a result logging was turned off. This has been rectified.
On some other servers I maintain, I have implemented a scheduled Nginx + PHP service restart if something is not working correctly (i.e. if you’re seeing a 502 error, bounce the services and alert me that it’s down). This works very well elsewhere, and before our recent outages I was intending to implement this in spring as part of the 2020 server build-out.
Given the two most recent outages in a week, I’ll bring this forward and start work on it tonight after work, so we should have much more automated resiliency.
Edits: words and numbers.
Last edited by gaekwad (2020-01-17 18:24:54)
Re: Sites outage, January 17th 2020
Hi pete,
I did experience the outage this morning. I was meaning to write but I had one meeting after the next. I’m glad that you are on top of it, and thankful for all the work you are doing.
Yiannis
Re: Sites outage, January 17th 2020
gaekwad wrote #321167:
On some other servers I maintain, I have implemented a scheduled Nginx + PHP service restart if something is not working correctly (i.e if you’re seeing the 503 error, bounce the services and alert me that it’s down).
Just FYI, the error I was getting was 502: Bad Gateway.
colak wrote #321172:
I’m glad that you are on top of it, and thankful to all the work you are doing.
I second that 👍
Re: Sites outage, January 17th 2020
jakob wrote #321176:
Just FYI, the error I was getting was 502: Bad Gateway.
I meant 502, typo on my part – I was rushing to get it all back online.
Anything in the 50* region is usually PHP misbehaving with Nginx, or vice versa. Now that PHP logging is working properly (ahem) I should have more idea of what’s happening, so I can diagnose properly next time.
Re: Sites outage, January 17th 2020
OK, I’ve installed the restart-if-down scripts. This should reduce unexpected downtime to almost nothing.
For anyone interested: there’s a cron task that runs every 5 minutes. It uses curl to check the headers of a loopback-only website for a status report. If the site is up (200 OK), it does nothing. If it’s anything else, it gracefully restarts the web server and PHP processes.
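For readers who want to try something similar, here is a rough sketch of the check-then-restart logic described above. It is not the script running on the Textpattern servers: that one uses curl from cron, whereas this illustration uses Python's standard library. The health-check URL, the service names and the use of `systemctl reload` are all assumptions made for the example.

```python
import subprocess
import urllib.error
import urllib.request

# Hypothetical loopback-only status URL; the real one is not given in the thread.
HEALTH_URL = "http://127.0.0.1:8080/status"

# Assumed service names and commands; adjust for your own distribution.
RESTART_COMMANDS = [
    ["systemctl", "reload", "nginx"],
    ["systemctl", "reload", "php7.3-fpm"],
]


def fetch_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (e.g. 502 Bad Gateway) arrive as exceptions.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return None


def run_restarts(commands=RESTART_COMMANDS):
    """Run each restart command in order, carrying on even if one fails."""
    for command in commands:
        subprocess.run(command, check=False)


def check_and_restart(get_status=fetch_status, restart=run_restarts, url=HEALTH_URL):
    """Restart the services unless the health check returns 200. Returns True if restarted."""
    status = get_status(url)
    if status == 200:
        return False
    restart()
    return True
```

To run it on a schedule, a script that ends with a call to `check_and_restart()` could be invoked from a crontab entry using the `*/5 * * * *` pattern. The status check and the restart are passed in as parameters so the decision logic can be tried out without touching real services.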
Re: Sites outage, January 17th 2020
Clever.
Re: Sites outage, January 17th 2020
Bloke wrote #321202:
Clever.
It’s one of a few things I thought up that I’m really quite pleased with. Took me a while to iron out the kinks, and I’m sure I can make it more efficient, but it seems to work.
Re: Sites outage, January 17th 2020
Aside: my username at gmail.com is open for reports if things are broken. It’s a different email address to my proper forum email, but it’s easier to remember. Assume I don’t know about broken things; a notification/alert email is always welcome.
Re: Sites outage, January 17th 2020
I’ve boosted the resources allocated to PHP on our sites, which should further reduce the possibility of outages.
Please let me know if things are exploding in flames and screaming, or otherwise not working as you expect.