By Lambert Strether. Originally published at Corrente.
Via O’Reilly — the highly literate and excellent tech publishing company — we read the following. Note that the grey-haired tech guy has pronounced judgment in the headline:
What Developers Can Learn from healthcare.gov
… Remember, even a failure can serve as an example of what not to do
That’s why I’m wondering if backend failures are the cause of site outage experiences like Rainbow Girl’s, or mine. I’m guessing yes. [Others agree. “As few as 1 in 100 applications on the federal exchange contains enough information to enroll the applicant in a plan, several insurance industry sources told CNBC on Friday. Some of the problems involve how the exchange’s software collects and verifies an applicant’s data.”] So let’s look at intermittent failures. I should stress that although I’ve done work of this kind, I’ve never done it on this scale; to someone at the top of the systems building power curve, I’m an intelligent layman with good research and critical thinking skills. That said, let’s try to dope out what’s going on. The Obama administration has been extraordinarily secretive about this project, so we’re peering into a black box.
First, intermittent failures are the hardest to track down, since intermittent failures are just that: intermittent. (Here’s a real life story where the guy basically twiddled stuff in his vehicle until the #FAIL went away, rather like an ant pushing a crumb randomly until it falls into a hole in the nest. And this post makes the point that intermittent failures can be, institutionally, a cancer.) Here’s what a testing consultant to software firms has to say about heuristics for finding intermittent failures. I’m going to highlight two important characteristics of their heuristics: institutional issues and critical thinking skills. (There are many items that have to do with technical skills and observational skills, but in a multi-institutional project like ObamaCare, I suggest those come under the heading of institutional skills. ObamaCare is not a corporation of systems with one owner, but a federation of systems with many owners, so getting to a place to perform observations and exercise technical skills isn’t a given; it will take at least one meeting. But heck. They’ve got 88 days. Counting holidays.)
Some General Suggestions for Investigating Intermittent Problems:
- Recheck your most basic assumptions: are you using the computer you think you are using? are you testing what you think you are testing? are you observing what you think you are observing?
- Eyewitness reports leave out a lot of potentially vital information. So listen, but DO NOT BECOME ATTACHED to the claims people make.
- Invite more observers and minds into the investigation.
- Create incentives for people to report intermittent problems.
- If someone tells you what the problem can’t possibly be, consider putting extra attention into those possibilities.
- Check tech support websites for each third party component you use. Maybe the problem is listed.
- Seek tools that could help you observe and control the system.
- Improve communication among observers (especially with observers who are users in the field).
- Establish a central clearinghouse for mystery bugs, so that patterns among them might be easier to spot.
- Look through the bug list for any other bug that seems like the intermittent problem.
- Make more precise observations (consider using measuring instruments).
- Improve testability: Add more logging and scriptable interfaces.
- Control inputs more precisely (including sequences, timing, types, sizes, sources, iterations, combinations).
- Control state more precisely (find ways to return to known states).
- Systematically cover the input and state spaces.
- Save all log files. Someday you’ll want to compare patterns in old logs to patterns in new ones.
- If the problem happens more often in some situations than in others, consider doing a statistical analysis of the variance between input patterns in those situations.
- Consider controlling things that you think probably don’t matter.
- Simplify. Try changing only one variable at a time; try subdividing the system. (helps you understand and isolate problem when it occurs)
- Complexify. Try changing more variables at once; let the state get “dirty”. (helps you make a lottery-type problem happen)
- Inject randomness into states and inputs (possibly by loosening controls) in order to reach states that may not fit your typical usage profile.
- Create background stress (high loads; large data).
- Set a trap for the problem, so that the next time it happens, you’ll learn much more about it.
- Consider reviewing the code.
- Look for interference among components created by different organizations.
- Celebrate and preserve stories about intermittent problems and how they were resolved.
- Systematically consider the conceivable causes of the problem (see below).
- Beware of burning huge time on a small problem. Keep asking, is this problem worth it?
- When all else fails, let the problem sit a while, do something else, and see if it spontaneously recurs.
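Two of the suggestions in the list above, "add more logging" and "set a trap for the problem," can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `trap`, `flaky_lookup`, and `run_trials` are hypothetical names invented here, and the 20% failure rate is made up so that the trap has something to catch. Nothing in it describes how the Exchanges were actually instrumented.

```python
import functools
import logging
import random

logger = logging.getLogger("trap")


def trap(func):
    """Record the inputs and the error whenever func fails, then re-raise."""
    captured = []  # kept in memory so the failures can be compared later

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            record = {"args": args, "kwargs": kwargs, "error": repr(exc)}
            captured.append(record)
            logger.error("trapped failure in %s: %s", func.__name__, record["error"])
            raise

    wrapper.captured = captured
    return wrapper


@trap
def flaky_lookup(applicant_id, rng):
    """Hypothetical stand-in for a remote check that fails now and then."""
    if rng.random() < 0.2:  # assumed failure rate, for illustration only
        raise TimeoutError("no reply for applicant %d" % applicant_id)
    return {"applicant_id": applicant_id, "status": "ok"}


def run_trials(n, seed=0):
    """Call the flaky function n times and count how often it failed."""
    rng = random.Random(seed)
    failures = 0
    for applicant_id in range(n):
        try:
            flaky_lookup(applicant_id, rng)
        except TimeoutError:
            failures += 1
    return failures


failures = run_trials(50)
print("failures seen:", failures, "failures trapped:", len(flaky_lookup.captured))
```

The point of the trap is that the next failure leaves evidence behind (which inputs, which error) instead of just a user's recollection that "it hung."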
That’s a lot of critical thinking required. Do you know a lot of institutions that love and value critical thinking and critical thinkers? No? And that’s a lot of institutional savvy required. Do you think that CMS and HHS, even with Obama backing them with all his lame-duck political clout, are going to be able to get institutional cooperation from all these players on a crash basis? I doubt it, especially after the Obama public relations machine told everybody “Relax! It’s just a soft launch!” and “The only date that really matters is January 1!”
So what’s the difference? The system architecture. Medicare has a simple and robust single payer architecture: You determine eligibility in one (1) jurisdiction (the United States) with one eligibility criterion for citizens: Their age. ObamaCare, by contrast, needs to determine eligibility in 50 (fifty) jurisdictions [for ObamaCare as a whole; 36 for the Federal Exchanges], with a complex eligibility formula that’s primarily income-based, but involves systems integration from the IRS, DHS, HHS, and private credit reporting companies (at least), to throw people into the right subsidy bucket. That’s called a combinatorial explosion, and even the best program and project managers (which ObamaCare’s managers clearly either are not, or have not been given the opportunity to be) have a hard time dealing with one. Let me know how it all works out….
Every single one of those institutions owns at least one system that needs to be integrated into the Exchanges, and each is a potential source of intermittent #FAIL. And remember, if one component fails, the system fails. Intermittently. Suppose every applicant on the Exchange has to be checked to see if they’re a Native American; in fact, they do. It doesn’t matter if DHS gives back its results slick as a whistle if the Bureau of Indian Affairs has intermittent failures; or vice versa. So technical issues aren’t the main story here; the truly challenging issues are never technical. Rather, we have management issues:
- Political: Wrangling IRS, DHS, HHS, private credit reporting companies, and IIRC the Bureau of Indian Affairs, and the Peace Corps;
- Personnel: Finding critical thinkers and deploying them, in whatever institutional settings they are to be found.
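To put a number on “if one component fails, the system fails,” here is a back-of-the-envelope sketch. It assumes every outside system must answer for an application to go through and that the systems fail independently of each other; the 99% figure and the count of five systems are made-up illustrations, not measurements of anything.

```python
import math


def chain_success_probability(component_availabilities):
    """Chance that every component answers, assuming independent failures."""
    return math.prod(component_availabilities)


# Hypothetical figures: five outside systems, each answering 99% of the time.
availabilities = [0.99] * 5
p_all = chain_success_probability(availabilities)
print(f"chance one application clears all five checks: {p_all:.3f}")  # 0.951
```

Under those assumptions, five systems that each look fine on their own still drop roughly one application in twenty, and the rate gets worse with every system added to the chain.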
Now, in a perfect world, there are just a few bugs, and by Monday, the Federal Exchange will be up and running, and buying health insurance online (that is, sticking out your arm for health insurance parasites to sink their tiny, rent-extracting mandibles into) will be as easy as buying a flat-screen teebee or a plane ticket. I’m guessing no, because that greybeard’s phrase, “intermittent failure,” should give you the heebie jeebies, as it did me. You don’t fix that stuff in a weekend.
So, Kremlinology: If Obama appoints a Czar to fix the Federal Exchanges, we’re in “buff the turd” mode. That will mean the White House has determined that political and personnel problems with the Federal Exchanges are insoluble.
NOTE A word on “the government can’t write software.” The government can write brilliant software: How do people think the Mars Landers are run? Tin cans and string? The question that should be asked is: Why was the government tasked with writing this software, when the rugged, robust, simple and proven single payer architecture was available to them? Jon Bentley: “The cheapest, fastest, and most reliable components are those that aren’t there.” Throw the requirement to preserve rent extraction by health insurance agencies out of the equation, and everything becomes clean. And implementable. Could have been done in a year, as LBJ rolled out Medicare in a year. Whoever wrote the requirements for this thing should be publicly shamed.
NOTE I don’t have the time to add this part, but check out this link for the cultural differences between frontend and backend developers. Note that the famous Reddit thread is frontend developers. More management challenges! And here’s a piece of what turns out retrospectively to have been frontend-centric hagiography back in June.
NOTE  Back in the day when I was a fancy pants consultant, I would hear people say “I don’t want my software built by people who don’t have kids.” Unfair perhaps (Facebook, Google), but those projects often had lives at stake. As does ObamaCare. I guess, to avoid offensive (justly offensive) ageism, I’d say better to have old heads on young shoulders. Developers like that do exist, fortunately. Oh, there’s a picture of the author at the link. Yes, he has a grey beard. He knows the Great Runes!
NOTE  Frontend vs. backend from Stack Overflow:
Generally speaking, the “front end” is the user interface, and the “back end” is the code supporting that front end (responsible for database access, business logic etc). …
Frontend is what you do that the user can see. Like designing a user interface. Backend programming is the code that the user doesn’t ‘see’. This is what works with the data behind the scenes. For example fetching/inserting/deleting/updating a database. …
The Federal Exchanges work using exactly the same system architecture, although on a much bigger scale. Simplifying horribly, take the process of setting up an account: You enter data on the front end in a form. You press submit. At #1, your data goes over the Intertubes, and the server grabs it and stores it in a database on the backend; at #2, the backend sends back a confirmation to the browser, again through the server over the Intertubes, that your data has been stored and that it’s OK for the browser to go ahead to the next screen. Imagine a world where the front-end developers had created a frontend of a Jony Ive-like slickness. The backend still has at least two points of failure, at #1 and #2. Suppose at #1 the database is down. That would look like a failure of the front end to respond (“It’s hung up!”) but it’s really a backend thing. Suppose the server at #2 died, as servers will do. Again, the frontend would look like it died, but it’s just helplessly waiting for the confirmation code, which hasn’t arrived. The intermittent failures could happen either at #1 or #2.
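Here is a toy version of that account-setup flow, in the same horribly simplified spirit. Everything in it is an assumption made for illustration: the `Backend` class, the `frontend_submit` function, and the two on/off switches are hypothetical and say nothing about how the real Exchanges are built. It only shows why failures at #1 and #2 look identical from the browser.

```python
class DatabaseDown(Exception):
    """Raised when the backend cannot store the submitted form (point #1)."""


class Backend:
    def __init__(self, database_up=True, server_up=True):
        self.database_up = database_up
        self.server_up = server_up
        self.rows = []  # stands in for the database table

    def handle_submit(self, form):
        # Point #1: store the submitted data.
        if not self.database_up:
            raise DatabaseDown("cannot store form")
        self.rows.append(dict(form))
        # Point #2: send the confirmation back to the browser.
        if not self.server_up:
            raise ConnectionError("server died before confirming")
        return "confirmed"


def frontend_submit(backend, form):
    """What the user sees: the next screen, or a page that appears hung."""
    try:
        backend.handle_submit(form)
    except (DatabaseDown, ConnectionError):
        return "page appears hung"
    return "next screen"


form = {"name": "Rainbow Girl"}
for label, backend in [
    ("healthy", Backend()),
    ("database down at #1", Backend(database_up=False)),
    ("server died at #2", Backend(server_up=False)),
]:
    print(label, "->", frontend_submit(backend, form), "| rows stored:", len(backend.rows))
```

In this sketch the user gets the same hung page in both failure cases, but the backend ends up in different states: at #1 nothing was stored, while at #2 the data was stored and only the confirmation went missing. That difference is invisible from the front end.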
Somebody really technically competent who does this for a living would laugh at the paragraph above, since the blog you are reading is basically a Wright Brothers Flyer compared to the Federal Exchanges, which are more like a Boeing 747 (or perhaps a Spruce Goose). Nevertheless, both sites and both airplanes share a fundamental architecture and fundamental engineering principles, which you must understand to cut through the bullshit and the bafflegab of what’s going on with the Exchanges, technically.
NOTE  Remember, it’s highly likely that the White House totally borked the entire project by changing the forms (that is, the stuff you fill in at the Exchange website on the front end) at the last minute for political and public relations reasons. Speculating freely: If the ObamaCare Exchange’s front end, where the form is filled out, was tightly coupled to the back end (that is, if the backend expected to find some chunk of data in exactly one place in the form, from where it would grab it and slam it directly into the database on the backend), then a last-minute form change breaking the whole thing is quite likely. That’s the quick and dirty way a body shop under pressure would do the job, and we all know the Obama administration would never cut corners on something that didn’t directly impact public relations, right? Ha ha. Anyhow, we do know that changing the forms broke the Connecticut state-based Exchange, so it’s reasonable to think that the Feds had the same problem. Except for 36 states, not one. And that means that Obama is going to people he screwed over to ask them for a shovel to dig himself out of a jam. To technical people, this is standard operating procedure, but the managers and department heads may not be so compassionate.
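The tight coupling speculated about in the note above is easy to show in miniature. This is a sketch under assumptions, not a reconstruction: the field names, the sample values, and both lookup functions are invented here, and nobody outside the project knows whether the Exchange code looked anything like this.

```python
# Hypothetical form layouts: the revised form adds one field near the top.
OLD_FORM = [("first_name", "Ada"), ("last_name", "Lovelace"), ("income", "42000")]
NEW_FORM = [
    ("first_name", "Ada"),
    ("middle_name", ""),
    ("last_name", "Lovelace"),
    ("income", "42000"),
]


def income_by_position(form):
    """Tightly coupled: assumes income is always the third field on the form."""
    return form[2][1]


def income_by_name(form):
    """Loosely coupled: looks the field up by name, wherever it sits."""
    return dict(form)["income"]


for label, form in [("old form", OLD_FORM), ("new form", NEW_FORM)]:
    print(label, "| by position:", income_by_position(form), "| by name:", income_by_name(form))
```

With the old layout both functions return the income. After the form change, the position-based version quietly returns the last name instead, with no error raised, which is exactly the kind of bad data that would then get slammed into the database.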
NOTE  “It’s all gone political, sir.” Terry Pratchett, Thud!
NOTE  I missed the obvious one: Privatize it all. The market can never fail. It can only be failed. If Obama actually managed this process so that privatization of the Federal Exchanges is the outcome, my hat is off to him; he’s either lucky as only the truly evil can be, or he really can play 11-dimensional chess.
UPDATE To be fair, there is one (1) ObamaCare website that launched on Day One with zero problems, zero defects. Here it is.