The aftermath of two crashes of Boeing 737 Max jets shortly after takeoff has led to the global grounding of the airplane. Boeing has been forced to cut production, and even so, undelivered planes are piling up. Big buyers like Southwest and American Airlines have been forced to cancel flights during their peak time of year as a result of taking their 737 Max jets offline. American lengthened its 737 Max grounding to June 5 and Southwest, to August 5 [Update: American sent a notice to American AAdvantage members that the grounding would last through August 19].
Even though Boeing is scrambling to fix the software meant to counter the 737 Max’s increased propensity to stall, a result of placing larger, more fuel-efficient engines in a way that reduced the stability of the plane in flight, it’s not clear that this will be adequate in terms of flight safety or the public perception of the plane. And even though the FAA is almost certain to sign off on Boeing’s patch, foreign regulators may not be so forgiving. The divergence we’ve seen between the FAA and other national authorities is likely to intensify. Recall that China grounded the 737 Max before the FAA did. In another vote of no confidence, even as Boeing was touting its changes to its now infamous MCAS software, which was designed to compensate for safety risks introduced by the placement of the engines on the 737 Max, the Canadian air regulator said he wanted 737 Max pilots to have flight simulator training, contrary to the manufacturer’s assertion that it isn’t necessary. Last week, the Wall Street Journal reported that American Airlines is developing 737 Max flight simulator training.
But a fundamental question remains: can improved software compensate for hardware shortcomings? Some experts harbor doubts. For instance, from the Spokane Spokesman-Review:
“One of the problems we have with the system is, why put a system like that on an airplane in the first place?” said Slack, who doesn’t represent any survivors of either the Lion Air or Ethiopia Airlines crashes. “I think what we’re going to find is that because of changes from the (Boeing 737) 800 series to the MAX series, there are dramatic changes in which they put in controls without native pitch stability. It goes to the basic DNA of the airplane. It may not be fixable.”
“It is within the realm of possibility that, if much of the basic pitch stability performance of the plane cannot be addressed by a software fix, a redesign may be required and the MAX might not ever fly,” [aviation attorney and former NASA aerospace engineer Mike] Slack said.
An even more damning take comes in How the Boeing 737 Max Disaster Looks to a Software Developer in IEEE Spectrum (hat tip Marshall Auerback). Author Greg Travis has been a software developer for 40 years and is also a pilot. He does a terrific job of explaining the engineering and business considerations that drove the 737 Max design. He describes why the plane’s design is unsound, why the software patch in the form of MCAS was inadequate, and why an improved version is unlikely to be able to compensate for the plane’s deficiencies.
Even for those who have been following the 737 Max story, this article has background that is likely to be new. For instance, to a large degree, pilots do not fly commercial aircraft; pilots send instructions to computer systems that fly these planes. As Travis explains:
In the 737 Max, like most modern airliners and most modern cars, everything is monitored by computer, if not directly controlled by computer. In many cases, there are no actual mechanical connections (cables, push tubes, hydraulic lines) between the pilot’s controls and the things on the wings, rudder, and so forth that actually make the plane move….
But it’s also important that the pilots get physical feedback about what is going on. In the old days, when cables connected the pilot’s controls to the flying surfaces, you had to pull up, hard, if the airplane was trimmed to descend. You had to push, hard, if the airplane was trimmed to ascend. With computer oversight there is a loss of natural sense in the controls….There is only an artificial feel, a feeling that the computer wants the pilots to feel. And sometimes, it doesn’t feel so great.
Travis also explains why the 737 Max’s engine location made the plane dangerously unstable:
Pitch changes with power changes are common in aircraft. Even my little Cessna pitches up a bit when power is applied. Pilots train for this problem and are used to it. Nevertheless, there are limits to what safety regulators will allow and to what pilots will put up with.
Pitch changes with increasing angle of attack, however, are quite another thing. An airplane approaching an aerodynamic stall cannot, under any circumstances, have a tendency to go further into the stall. This is called “dynamic instability,” and the only airplanes that exhibit that characteristic—fighter jets—are also fitted with ejection seats.
Everyone in the aviation community wants an airplane that flies as simply and as naturally as possible. That means that conditions should not change markedly, there should be no significant roll, no significant pitch change, no nothing when the pilot is adding power, lowering the flaps, or extending the landing gear.
The airframe, the hardware, should get it right the first time and not need a lot of added bells and whistles to fly predictably. This has been an aviation canon from the day the Wright brothers first flew at Kitty Hawk.
Travis explains in detail why the MCAS approach to monitoring the angle of attack was greatly inferior to older methods….including having the pilots look out the window. And here’s what happens when MCAS goes wrong:
When the flight computer trims the airplane to descend, because the MCAS system thinks it’s about to stall, a set of motors and jacks push the pilot’s control columns forward. It turns out that the flight management computer can put a lot of force into that column—indeed, so much force that a human pilot can quickly become exhausted trying to pull the column back, trying to tell the computer that this really, really should not be happening.
Indeed, not letting the pilot regain control by pulling back on the column was an explicit design decision. Because if the pilots could pull up the nose when MCAS said it should go down, why have MCAS at all?
MCAS is implemented in the flight management computer, even at times when the autopilot is turned off, when the pilots think they are flying the plane. In a fight between the flight management computer and human pilots over who is in charge, the computer will bite humans until they give up and (literally) die…
Like someone with narcissistic personality disorder, MCAS gaslights the pilots. And it turns out badly for everyone. “Raise the nose, HAL.” “I’m sorry, Dave, I’m afraid I can’t do that.”
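The failure mode Travis describes can be sketched in a few lines: a control loop that trusts a single angle-of-attack reading and keeps trimming the nose down no matter what the pilot does. This is purely an illustrative sketch, not Boeing’s actual implementation; all names, units, and thresholds here are made up.

```python
# Illustrative-only sketch of the behavior Travis describes: MCAS acts on
# one angle-of-attack (AoA) sensor and ignores pilot back-pressure on the
# column. Thresholds and step sizes are invented for the example.

STALL_AOA_DEG = 15.0   # hypothetical AoA above which the system "sees" a stall
TRIM_STEP = 2.5        # hypothetical nose-down trim applied per activation

def mcas_step(aoa_sensor_deg, pilot_pulling_back, trim):
    """One control-loop step: return the updated stabilizer trim.

    Note that pilot_pulling_back is never consulted -- per Travis, not
    letting the pilot override by pulling back was a deliberate design
    decision, and the system runs even with the autopilot off.
    """
    if aoa_sensor_deg > STALL_AOA_DEG:
        trim -= TRIM_STEP  # push the nose down
    return trim

# A failed sensor stuck at a high reading ratchets the trim ever more
# nose-down, regardless of what the pilot does:
trim = 0.0
for _ in range(4):
    trim = mcas_step(aoa_sensor_deg=22.0, pilot_pulling_back=True, trim=trim)
# after four activations, trim has been driven to -10.0
```

The point of the sketch is the absence of any check on the pilot’s input or on the sensor itself: the loop’s only exit is the sensor reading dropping, which a stuck vane never does.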
Travis also describes the bad business incentives that led Boeing to conceptualize and present the 737 Max as just a tweak of an existing design, as opposed to a plane so aerodynamically different as to be a new one….which would have required time-consuming and costly recertification. To succeed in that obfuscation, Boeing had to underplay the existence and role of the MCAS system:
The necessity to insist that the 737 Max was no different in flying characteristics, no different in systems, from any other 737 was the key to the 737 Max’s fleet fungibility. That’s probably also the reason why the documentation about the MCAS system was kept on the down-low.
Put in a change with too much visibility, particularly a change to the aircraft’s operating handbook or to pilot training, and someone—probably a pilot—would have piped up and said, “Hey. This doesn’t look like a 737 anymore.”
To drive the point home, Travis contrasts the documentation related to MCAS with documentation Cessna provided with an upgrade to its digital autopilot, particularly warnings. The difference is dramatic and it shouldn’t be. He concludes:
In my Cessna, humans still win a battle of the wills every time. That used to be a design philosophy of every Boeing aircraft, as well, and one they used against their archrival Airbus, which had a different philosophy. But it seems that with the 737 Max, Boeing has changed philosophies about human/machine interaction as quietly as they’ve changed their aircraft operating manuals.
Travis also explains why the FAA allows for what amounts to self-certification. This practice didn’t result from the usual deregulation pressures, but from the FAA being unable to keep technical experts from being bid away by private sector players. Moreover, the industry has such a strong safety culture (airplanes falling out of the sky are bad for business) that the accommodation didn’t seem risky. But it is now:
So Boeing produced a dynamically unstable airframe, the 737 Max. That is big strike No. 1. Boeing then tried to mask the 737’s dynamic instability with a software system. Big strike No. 2. Finally, the software relied on systems known for their propensity to fail (angle-of-attack indicators) and did not appear to include even rudimentary provisions to cross-check the outputs of the angle-of-attack sensor against other sensors, or even the other angle-of-attack sensor. Big strike No. 3.
None of the above should have passed muster. None of the above should have passed the “OK” pencil of the most junior engineering staff, much less a DER [FAA Designated Engineering Representative].
That’s not a big strike. That’s a political, social, economic, and technical sin….
The 737 Max saga teaches us not only about the limits of technology and the risks of complexity, it teaches us about our real priorities. Today, safety doesn’t come first—money comes first, and safety’s only utility in that regard is in helping to keep the money coming. The problem is getting worse because our devices are increasingly dominated by something that’s all too easy to manipulate: software…
I believe the relative ease—not to mention the lack of tangible cost—of software updates has created a cultural laziness within the software engineering community. Moreover, because more and more of the hardware that we create is monitored and controlled by software, that cultural laziness is now creeping into hardware engineering—like building airliners. Less thought is now given to getting a design correct and simple up front because it’s so easy to fix what you didn’t get right later….
It is likely that MCAS, originally added in the spirit of increasing safety, has now killed more people than it could have ever saved. It doesn’t need to be “fixed” with more complexity, more software. It needs to be removed altogether.
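The cross-check Travis says was missing is not exotic. A minimal version, with hypothetical names and an illustrative disagreement tolerance, might look like the following; the idea is simply that when two sensors disagree, the system should stand down rather than act on either reading.

```python
# Illustrative sketch of a basic two-sensor cross-check. The function name
# and tolerance are assumptions for the example, not Boeing's design.

def aoa_cross_check(aoa_left_deg, aoa_right_deg, max_disagreement_deg=5.5):
    """Return a trusted AoA value, or None if the system should disengage.

    If the two angle-of-attack vanes disagree by more than the tolerance,
    neither reading can be trusted, so an augmentation system should stand
    down and hand control back to the pilots rather than act on bad data.
    """
    if abs(aoa_left_deg - aoa_right_deg) > max_disagreement_deg:
        return None  # sensors disagree: disengage rather than guess
    return (aoa_left_deg + aoa_right_deg) / 2.0

# A stuck vane reading 22 degrees against a healthy one reading 4 degrees
# would trigger a disengage instead of repeated nose-down trim:
assert aoa_cross_check(22.0, 4.0) is None
assert aoa_cross_check(6.0, 5.0) == 5.5
```

Travis’s third strike is precisely that nothing like this appeared to be present: the software acted on a single sensor’s output without comparing it against anything else.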
There’s a lot more in this meaty piece. Be sure to read it in full.
And if crapification by software has undermined the once-vaunted airline safety culture, why should we hold out hope for any better with self-driving cars?