By Lambert Strether of Corrente.
Yves and I have spent an inordinate amount of time over the last year trying to find a satisfactory web host. Yves estimates the time she’s spent on this pleasant duty would be equivalent to at least half, perhaps as much as 2/3, of the time it would take to write a book, so the opportunity cost has been significant. And that leaves aside the stress, not just of the search but of the reason we have to make the search: the site keeps going down at unpredictable times for unpredictable intervals. (Actually, that’s not quite correct: I can recall at least two outages right before Links were about to go live! Perhaps the site senses the stress of the operator…)
So, first, I’m going to go over our experiences so you understand what we’re going through; then I’m going to look at the bundle of services involved in web hosting, and solicit your views on the bundling approach, and for possible vendors.
We examined a large number of hosts seriously and got into extended due diligence and negotiations with four, to the point of scheduling data transfers that were aborted with two, and completed with one (see this post on WP Engine to give you an idea of how time-consuming and surreal — in a word, devolved — our experiences with web hosting companies have been).
Our current tech dude, who both administers WordPress (WP) and runs the server, initially seemed to be a step in the right direction, since he tuned the site to improve performance, which had been an issue, and was generally quick to act when we had spambot attacks, by updating the server’s blacklist (though it’s not clear why CloudFlare, which we use, needed the assistance). In fact, we’d been hopeful that any problems could be worked through methodically, at least until the last, very painful, multi-hour outage.
Our previous tech dude bundled web hosting functionality in a different way: He not only administered WP but handled plug-ins and design, and also set us up with a self-managed server at a perfectly reliable big iron shop. Then, however, when the site went down, the big iron guys would say they saw no activity spike and no reason for the site to have crashed on their end. They’d restart the server, and it would fall over again. So Yves would have to get the tech dude, who did know WP, to flush out whatever the cause of the crash was, and fix it. The problem: He was not 24/7, and so if the site went down when he was unavailable, NC could be out for a few hours, which would make Yves crazy.
But the root of all the problems seems to be WordPress. If Microsoft built a blogging platform, it would be WordPress: feature-bloated, breakage prone, and demanding constant updates. (The weakest part of WordPress, where it notoriously scales badly, is its database design. And because Naked Capitalism updates frequently, thanks to our active comments section and our refusal to turn that functionality over to the justly hated Disqus, we run the database very hard.)
Now, readers might logically suggest a hosting service that markets itself to the WordPress community as a solution, particularly since the big ones, like WP Engine (tried them), offer 24/7 service. However, based on our experience, we are now increasingly of the view that the WordPress hosting model might be a ruse to generate large upcharges for service that is not substantively very different.
For instance, WP Engine makes great claims about being optimized for performance, as in speed, and while that is nice to have, our big concern is uptime. But if the site goes down due to some weird seize-up in WordPress, our assumption is that 80% of the time it is probably something that isn’t complicated (as in there are 3-5 obvious things to check) and that a not-hugely-skilled WordPress person could bring http back up, or reboot the database, and clear out all the caches, run database integrity checks, etc. Our expectation was that someone who marketed themselves as a WP host would run those basic checks if a site fell over and implement those simple correctives.
But that isn’t what WP Engine offers. They told us that (basically) they guaranteed 100% uptime from the LAMP stack*, but if WP seized up, that was our problem. The most they would do is walk us through some troubleshooting. So WP Engine’s value-add was where, exactly? Anyhow, the site can fall over when Yves and I are not on duty, and we’d expect a host that markets itself as a WP host to be willing to do basic correctives if we gave them permission to intervene. But n-o-o-o-o-o!
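The line WP Engine drew, between "the LAMP stack is up" and "WordPress has seized," is easy enough to sketch in code. What follows is a minimal illustration and not anything a host actually runs: the layer names, the `classify_outage` function, and the idea of testing each layer with a simple port or page check are all my assumptions.

```python
import socket
import urllib.request
from typing import Callable, Dict

# Hypothetical layer names, checked from the bottom of the stack upward.
LAYERS = ("web_server", "database", "wordpress")


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def page_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def classify_outage(checks: Dict[str, Callable[[], bool]]) -> str:
    """Name the lowest layer whose check fails, or 'healthy' if none do."""
    for layer in LAYERS:
        if not checks[layer]():
            return layer
    return "healthy"
```

One might wire it up with `port_open(host, 80)` for the web server, `port_open("127.0.0.1", 3306)` for the database, and `page_ok` on the front page for WordPress (the hosts and ports are placeholders). A result of `"wordpress"` with everything beneath it healthy is exactly the case the host told us was our problem.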
So now the solicitation part:
As you can see, we’ve tried bundling — unbundling, debundling, rebundling — the following tasks in several ways across several vendors over time:
1. WordPress Plugins, Design, and Maintenance. Here I think we’ve come out of the stormy seas into the safe harbor [touch wood]. The site looks classy, we have a ticketing system, and so on.
2. Server Hosting (the big iron). In essence, the LAMP stack. Building it, keeping it current, restarting it on demand. This is a 24/7 task in that a warm body must be available to handle a restart.
3. WordPress Tuning. By tuning, I mean making sure the various aspects of the LAMP stack are tuned to our load, speed, and reliability requirements, in addition to making sure that settings on the WordPress dashboard are as we would wish. Our current tech dude seems to have addressed this effectively, but it’s hard to know how much is WordPress tuning specifically, and how much is due to CloudFlare or other server administration factors.
4. Server Administration. Administering the entire LAMP stack for load, speed, and reliability requirements beyond those of WordPress; for example, security, preventing spam attacks, and so on. This is a 24/7 task that requires a level-headed and skilled technician, who must also understand the interactions between generic LAMP stack optimizations and the WordPress Tuning at #3. It would be possible to unbundle this task as follows:
[4a] Routine Service Call: 24/7 crash solution: Bring http back up, and/or reboot the database, and/or clear out all the caches, run database integrity checks. I (Lambert) do this sort of thing at my own blog, but I’m not 24/7 and in any case I should probably be blogging.
[4b] Technical Service Call: Given a crash, why did it happen? What, in tasks #2, #3, or #4 must be adjusted to prevent a crash from happening again?
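To show how [4a] and [4b] could fit together, here is a rough sketch of a runbook that tries the routine correctives in order, stops as soon as the site is back, and keeps a log that whoever handles [4b] can read afterward. The commands are assumptions: service names like `apache2` and `mysql` vary by distribution, and `wp cache flush` presumes WP-CLI is installed. The function names and the log format are hypothetical too.

```python
import subprocess
from datetime import datetime, timezone
from typing import Callable, Dict, List, Sequence, Tuple

Step = Tuple[str, Sequence[str]]

# Assumed commands; adjust service names and tools to the actual server.
RUNBOOK: List[Step] = [
    ("restart web server", ["systemctl", "restart", "apache2"]),
    ("restart database", ["systemctl", "restart", "mysql"]),
    ("flush WordPress object cache", ["wp", "cache", "flush"]),
    ("check database tables", ["mysqlcheck", "--check", "--all-databases"]),
]


def run_command(cmd: Sequence[str]) -> bool:
    """Run one command; True only if it exits with status 0."""
    try:
        result = subprocess.run(list(cmd), capture_output=True, timeout=120)
        return result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False


def run_runbook(
    steps: Sequence[Step],
    site_is_up: Callable[[], bool],
    runner: Callable[[Sequence[str]], bool] = run_command,
) -> List[Dict[str, object]]:
    """Apply steps in order until the site recovers; return a log for [4b]."""
    log: List[Dict[str, object]] = []
    for name, cmd in steps:
        command_ok = runner(cmd)
        recovered = site_is_up()
        log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "step": name,
            "command_ok": command_ok,
            "site_up_after": recovered,
        })
        if recovered:
            break
    return log
```

The log matters as much as the restart: if the site always comes back after the database restart and never after the web server restart, that is a clue for the technical service call.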
We could, I suppose, find one vendor that bundled tasks #2 through #4 (since #1 is covered, for now): A managed server with 24/7 support whose tech people had decent WordPress tuning chops. However, the word “saga” wouldn’t be in the headline if we’d managed to find that one vendor, and we’ve tried, as you see.
Or we could create our own bundle from several vendors. For example, we could find our own Big Iron at #2, call in a WordPress Tuner, rather as one would call in a piano tuner, for a consulting fee at #3, and then find a 24/7 Administration solution at #4 (which would have to be at least two people, maybe one as the tech lead to handle [4b], and then two for the evening and night shifts at [4a], depending on time zones). The downside, of course, is that we have more co-ordination to do.
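Whether a given set of people really adds up to 24/7 is simple arithmetic once the shifts are written down in one time zone. A small sketch, assuming each shift is a hypothetical (start hour, end hour) pair in UTC, with the end hour excluded and shifts allowed to run past midnight:

```python
from typing import List, Sequence, Tuple


def uncovered_hours(shifts: Sequence[Tuple[int, int]]) -> List[int]:
    """Return the UTC hours (0-23) that no shift covers.

    A shift whose start equals its end is treated as covering nothing.
    """
    covered = set()
    for start, end in shifts:
        hour = start % 24
        while hour != end % 24:
            covered.add(hour)
            hour = (hour + 1) % 24
    return sorted(set(range(24)) - covered)
```

For instance, a day shift of (9, 17) plus an evening shift of (17, 1) leaves hours 1 through 8 UTC with nobody on call; adding a night shift of (1, 9) closes the gap.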
Readers, does this breakdown of task bundles seem reasonable? What would you add? Or subtract? And crucially, can you suggest any vendors for the tasks, bundled whichever way?
NOTE * LAMP Stack (adapted from): It’s called a stack because each layer builds on the layer beneath it. Your operating system, Linux, is the base layer. Then Apache, your web server, sits on top of Linux. Then your database, MySQL (or MariaDB or MongoDB…), stores all the data served by Apache, and PHP (or Perl…) is used to drive and display all the data as web pages, and handle user interactions that build new pages.
NOTE ** The trade-off with CloudFlare is speed vs. timeliness: Pages are cached up there in the Cloud somewhere, which makes delivering them to you faster, but comments may take a few minutes to make it into the cache.
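For the curious, the comment lag falls out of how any time-limited cache behaves. This is a toy model only, not a description of how CloudFlare works internally; the `EdgeCache` class and the 300-second lifetime are assumptions chosen to match "a few minutes."

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy cache: serve a stored copy until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, fetch_origin: Callable[[str], str]):
        self.ttl = ttl_seconds
        self.fetch_origin = fetch_origin
        self.store: Dict[str, Tuple[float, str]] = {}  # url -> (fetched_at, body)

    def get(self, url: str, now: float) -> str:
        entry = self.store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # fast path: possibly stale copy
        body = self.fetch_origin(url)  # slow path: ask the real server
        self.store[url] = (now, body)
        return body
```

A page fetched at time 0 keeps being served unchanged until 300 seconds have passed, so a comment posted at second 10 will not show up for readers served from the cache until the copy expires and the page is fetched again.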
NOTE This is not the post to raise concerns about site issues other than those already raised. It’s most definitely not the post to comment on typefaces, the way the site looks on this or that machine or platform, or other look and feel issues.
Also, I’m deliberately not including site specs. The “You give me the specs, I give you the quote” dance has been a massive #FAIL throughout this saga. Reliable 24/7 support, as in tasks #2 and #4, is the key here. Probably anybody whose business is adequately resourced to handle them can handle our throughput.