Premature optimization is the root of all evil. –Donald Knuth
By Lambert Strether of Corrente.
Far be it from me to take issue with technical publishing guru and venture capitalist Tim O’Reilly; not only do I have a shelf or two of his great books, he puts cute animals on the covers! That said, I think O’Reilly’s “algorithmic regulation” is a horrible idea — absolutely Silicon Valley nutball triumphalism at its cringularity-inducing worst — and I want to explain why, as best I can with only an evening available. First, I’m going to take a 30,000-foot view of “algorithmic regulation” and code and data; then I’m going to drill down and look at specifics: St. Louis County, as it happens.
Algorithmic Regulation, Data, and Code
Here’s how O’Reilly (skipping the sharing economy uber-triumphalism) describes the relation between law and “algorithmic regulation” (apparently his coinage):
Regulation is the bugaboo of today’s politics.
Laws should specify goals, rights, outcomes, authorities, and limits. If specified broadly, those laws can stand the test of time.
Regulations, which specify how to execute those laws in much more detail, should be regarded in much the same way that programmers regard their code and algorithms, that is, as a constantly updated toolset to achieve the outcomes specified in the laws….
Increasingly, our interactions with businesses, government, and the built environment are becoming digital, and thus amenable to creative forms of measurement [one hopes not “creative” in the way creative accounting is creative], and ultimately algorithmic regulation….
There are many good examples of data collection, measurement, analysis, and decision-making taking hold in government. In New York City, data mining was used to identify correlations between illegal apartment conversions and increased risk of fires, leading to a unique cooperation between building and fire inspectors.
It’s important to understand that these manual interventions are only an essential first step. A successful algorithmic regulation system has the following characteristics:
1. A deep understanding of the desired outcome
2. Real-time measurement to determine if that outcome is being achieved
3. Algorithms (i.e. a set of rules) that make adjustments based on new data
4. Periodic, deeper analysis of whether the algorithms themselves are correct and performing as expected.
Open data plays a key role in both steps 2 and 4. Open data, either provided by the government itself, or required by government of the private sector, is a key enabler of the measurement revolution. Open data also helps us to understand whether we are achieving our desired objectives, and potentially allows for competition in better ways to achieve those objectives.
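Taken at face value, O’Reilly’s four characteristics describe a classic closed feedback loop. Here is a purely illustrative sketch of that loop — every name in it (`measure`, `adjust`, `audit`) is hypothetical, not from any real system:

```python
# Hypothetical sketch of O'Reilly's four-step "algorithmic regulation" loop.
# All function names are illustrative assumptions, not a real API.

def regulate(desired_outcome, measure, adjust, audit, cycles=100):
    """Run the loop: (1) a desired outcome, (2) real-time measurement,
    (3) rule-based adjustment, (4) periodic deeper analysis."""
    for cycle in range(cycles):
        observed = measure()                # step 2: real-time measurement
        error = desired_outcome - observed  # compare to step 1: the goal
        adjust(error)                       # step 3: algorithmic adjustment
        if cycle % 10 == 0:
            audit(cycle, error)             # step 4: periodic deeper analysis
```

Note what the sketch takes on faith: `measure()` must return honest data, and `adjust()` must be bug-free — exactly the two assumptions the rest of this post questions.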
Before looking at O’Reilly’s four points, I’d suggest that “automating those interventions” translates into “code is law,” where law is the subject matter of the legal system, both statutes and rules and regulations. Naked Capitalism has been covering the metastasis of “code is law” for some time, and for those who came in late, I’d like to review the work:
Yves, inimitably, defined “code is law” as follows (June 2014):
[C]ode is law, the notion that underlying, well-established procedures and practices are revised to conform to the dictates of computer systems, with result being crapification of the activity.
In May 2012 (O’Reilly’s piece was published in October 2013), I’d introduced the concept (“Code is Law: Literally”) after looking at Judge Elizabeth W. Magner’s decision in a MERS case, In Re Jones, and said what Yves just said in longer words:
In summary, I’ve suggested there are two ways to look at the foreclosure crisis:
- As a law enforcement problem, where banksters have committed illegal acts;
- and as a jurisprudence problem, where code is law, not metaphorically, but literally.
In the latter case, statutes, rules, and regulations become vestigial. Code is the driver. (I can’t think of another word for this than “revolutionary,” even if a revolution is “an overthrow or repudiation and the thorough replacement of an established government or political system by the people governed,” and not the people doing the governing.)
In September 2012 (“‘Code is law’ Once More”) I’d suggested that this “revolution” was coming to pass. The Tjosaas family were foreclosed on not once but twice because of software problems. Their home and all their possessions were stolen, literally; Wells Fargo contractors drove up, trucked everything away, and “secured” the house. In each case, Wells Fargo’s IT system had given the contractors the wrong address. There were no criminal penalties or even charges. I concluded:
if we accept that “code is law,” then false positives from corporate computer systems are legal, just as they are in the copyright case above. Again, I’d argue that we don’t have a law enforcement issue, but a jurisprudential issue; the nature of law, and hence rule, has changed, as part of a larger change in the constitutional order that is still underway.
So now that we’ve seen actual examples of “algorithmic regulation” during the foreclosure crisis and the seizure of the Tjosaas family’s home, let’s go back to O’Reilly’s four-point description of a “successful algorithmic regulation system” (to review). First, let’s take notice of a crucial omission. O’Reilly writes:
Open data plays a key role in both steps 2 and 4.
Remarkably, O’Reilly does not require “open source” code (that is, the software or program that works on the data); only data. False positives like the Tjosaas family experienced are inevitable in any system of “algorithmic regulation,” since “testing can show the presence of bugs but never their absence.” But if “algorithmic regulation” code, unlike data, can be proprietary and secret — rather in the way that the chemical composition of fracking fluid can be kept secret — how are coders and their employers to be held accountable for bugs and induced to fix them? Will bugs even be tracked? Or is the occasional random loss of a home an acceptable price to pay for an algocratic utopia? (See The Economist on how open source makes medical software less buggy.)

Second, remember that O’Reilly is advocating “automating those interventions,” and by “interventions” he most definitely means what we used to call “law enforcement,” because he writes about enforcing speed limits. So you could be, to all intents and purposes, “arrested” by software; but if you believe — as the Tjosaas family believed, correctly — that the software is wrong — buggy, as all software is buggy — then how do you appeal? In the case of proprietary, closed-source software, there is no appeal (at least in O’Reilly’s “successful” system), because you can’t see the software, so you can’t find the bugs.

Finally, O’Reilly’s work appears in an anthology called “Beyond Transparency,” and it’s worth underlining heavily how untransparent — how opaque, even obfuscatory — software programs are. Harold Abelson and Gerald Jay Sussman, with Julie Sussman, Structure and Interpretation of Computer Programs, p. 29:
Computational processes are abstract beings that inhabit computers. As they evolve, processes manipulate other abstract things called data. The evolution of a process is directed by a pattern of rules called a program. People create programs to direct processes. In effect, we conjure the spirits of the computer with our spells.
A computational process is indeed much like a sorcerer’s idea of a spirit. It is not composed of matter at all. However, it is very real. It can perform intellectual work. It can answer questions. It can affect the world by disbursing money at a bank or by controlling a robot arm in a factory. The programs we use to conjure processes are like a sorcerer’s spells. They are carefully composed from symbolic expressions in arcane and esoteric programming languages that prescribe the tasks we want our processes to perform.
Above, we focused on the code itself; the “symbolic expressions in arcane and esoteric programming languages.” In passing, we’d note that programming languages continually evolve, so O’Reilly’s “successful” system, whatever else it may be, is a permanent jobs program for programmers, either to maintain programs in old (arcane and esoteric) languages — like when all the grey-hairs were called in off the golf courses to fix ’70s and ’80s COBOL programs in the Y2K effort — or to rewrite them in new ones. Not that there’s anything wrong with that, but let’s be clear that’s what’s going on.
More importantly, “algorithmic regulations” are no longer accessible to citizens. The rules and regulations of the law as we understood it in Civics 101 might be complicated, but at least one could read the text, and lawyers and advocates could cite to it. But with “algorithmic regulation,” not only is the code that drives the automated intervention “arcane and esoteric” — if you think the Code of Federal Regulations is bad, try obfuscated Perl — but the very operation of the regulation is “much like a sorcerer’s idea of a spirit.” How in good conscience can we ask a free people to obey regulations that they cannot read, when they cannot verify how the regulation was applied in their case, and when they have no way to appeal the “automatic intervention”?
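To make the opacity point concrete — and since it is language-independent, here is a toy illustration in Python rather than Perl. Both functions below are hypothetical and behave identically; only the first could plausibly be read, cited, or appealed by a citizen:

```python
# Readable version: a citizen, a lawyer, or a judge can at least follow this.
def over_limit(speed_mph, limit_mph=65):
    """Return True if the measured speed exceeds the posted limit."""
    return speed_mph > limit_mph

# Obfuscated but functionally identical. Real production code, after
# years of patches, often reads closer to this than to the version above.
o = lambda s, l=65: (lambda a, b: a - b > 0)(s, l)

assert over_limit(70) == o(70) == True
assert over_limit(60) == o(60) == False
```

And this is mild obfuscation; a real “regulation” would be thousands of lines spread across systems, with the behavior emerging from their interaction.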
From code in points 1 and 3, we now move to data in points 2 and 4 (and this will be simpler, I promise). First, “data scientists” are finally discovering that good data is hard to find; there’s a lot of labor in it. The Times:
“Data wrangling is a huge — and surprisingly so — part of the job,” said Monica Rogati, vice president for data science at Jawbone, whose sensor-filled wristband and software track activity, sleep and food consumption, and suggest dietary and health tips based on the numbers. “It’s something that is not appreciated by data civilians. At times, it feels like everything we do.”
Several start-ups are trying to break through these big data bottlenecks by developing software to automate the gathering, cleaning and organizing of disparate data, which is plentiful but messy. The modern Wild West of data needs to be tamed somewhat so it can be recognized and exploited by a computer program.
“It’s an absolute myth that you can send an algorithm over raw data and have insights pop up,” said Jeffrey Heer, a professor of computer science at the University of Washington and a co-founder of Trifacta, a start-up based in San Francisco.
In other words, we might look at O’Reilly’s piece more as a sample of “Investor Storytime” for data-wrangling startups in Silicon Valley than as a serious public policy proposal, assuming the word “public” has any meaning anymore; open or not, the data doesn’t support what the techno-visionaries want to do with it. Further, given that law enforcement (“automated intervention”) is now data-driven, malefactors of great wealth have every incentive to game the data they make available:
A new report charges that several oil and gas companies have been illegally using diesel fuel in their hydraulic fracturing operations, and then doctoring records to hide violations of the federal Safe Drinking Water Act.
The report, published this week by the Environmental Integrity Project, found that between 2010 and July 2014 at least 351 wells were fracked by 33 different companies using diesel fuels without a permit. The Integrity Project, an environmental organization based in Washington, D.C., said it used the industry-backed database, FracFocus, to identify violations and to determine the records had been retroactively amended by the companies to erase the evidence.
“Open” or not, how exactly does “algorithmic regulation” detect data that’s been doctored at source? Particularly when, as the Times points out, we’ve got great difficulty just cleaning the data to begin with?
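For what it’s worth, the retroactive amendments the Integrity Project found are detectable only if the disclosure system keeps a tamper-evident history — which, evidently, FracFocus did not. A minimal sketch of one standard idea, a hash chain (all names here are hypothetical, and note the caveat after the code):

```python
import hashlib
import json

def record_hash(record, prev_hash):
    """Chain each disclosure record to the hash of the one before it."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(records):
    """Return the list of chained hashes for a sequence of records."""
    h, hashes = "genesis", []
    for r in records:
        h = record_hash(r, h)
        hashes.append(h)
    return hashes

def verify(records, published_hashes):
    """Recompute the chain; any retroactive edit to an early record
    changes every hash after it, so doctoring is detectable."""
    return published_hashes == build_chain(records)
```

The caveat: this only shifts the problem. A hash chain catches after-the-fact edits, provided the hashes were published before the edit; it does nothing about a company that files a doctored record in the first place.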
So those are some of the problems with O’Reilly’s four points. But now let’s look at the real problem. It’s buried right in the first sentence:
[A] successful algorithmic regulation …
But how does O’Reilly know what the boundaries of the system really are?
Premature Optimization and St. Louis County
As everybody and their grandmother knows, the engineers at Google have been developing a driverless car. BBC:
Google’s self-driving cars are programmed to exceed speed limits by up to 10mph (16km/h), according to the project’s lead software engineer.
Dmitri Dolgov told Reuters that when surrounding vehicles were breaking the speed limit, going more slowly could actually present a danger, and the Google car would accelerate to keep up.
Note that Dmitri is squarely in the mainstream of O’Reilly’s revolutionary constitutional order; it just isn’t a problem that he “breaks” the speed limit; his “automatic intervention” doesn’t override the law; it is the law. Dmitri’s problem is that he doesn’t understand the boundaries of his “system,” and so he’s prematurely optimizing it. We can see this very clearly by asking a simple question:
What about speed traps?
Let’s take St. Louis County — home of Ferguson — as an example:
Residents in the St. Louis suburbs didn’t believe the initial police account because many of them have had experiences with cops that were not happy ones. The area is riddled with speed-trap towns that collect half of their general revenue from traffic tickets. Instead of friendly lawmen, they see revenue agents with attitude and big guns. Police weaponry is increasingly powerful.
They see policemen as “revenue agents” because that’s exactly what they are. Here’s another example from the de-industrialized Midwest, in Cleveland:
More often than not, a tiny community—a village, an impoverished town, a designated special services district endowed with a certain degree of autonomy—will harness whatever police power it has and turn it into a source of revenue. I’ve explored this trend in East Cleveland, an impoverished inner-ring suburb of Ohio’s largest metro, which has struggled to raise revenue after decades of watching its middle class tax base dwindle to nothing.
Most law enforcement supports the cameras; so do private citizens who have lost family members to other people’s reckless driving. Meanwhile, the opponents of these cameras nearly always recalled the apparent history of moneymaking schemes; one Democratic representative evoked Elmwood Place as evidence of corruption, specifically referencing how 40% of the proceeds collected for tickets go directly to an out-of-state private company whose primary profit motive encourages it to issue as many tickets as possible.
So back to St. Louis. Why the speed traps? Well, they’re the outcome of a complicated history that has left St. Louis County very fragmented:
The map of St. Louis County, the home of Ferguson, looks like a shattered pot. It’s broken into 91 municipalities that range from small to tiny, along with clots of population in unincorporated areas. Dating as far back as the 19th century, communities set themselves up as municipalities to capture control of tax revenue from local businesses, to avoid paying taxes to support poorer neighbors, or to exclude blacks. …
The result of fragmentation today is a county whose small towns are highly stratified by both race and income. As blacks move into a town, whites move out. The tax base shrinks, and blacks feel cheated that the amenities they came for quickly disappear, says Clarence Lang, a University of Kansas historian who has studied St. Louis. Ferguson flipped from majority white to majority black so quickly that the complexion of the government and police force doesn’t match that of the population. That mismatch was a key factor in the tense race relations that contributed to the riots and, perhaps, the shooting itself.
That’s not all. Businesses choosing where to locate can play the tiny municipalities off against one another for tax incentives, prompting a race to the bottom that robs them all of revenue.
There’s widespread recognition that fragmentation is holding back the economic development of greater St. Louis, but once a municipality is formed, however small, it’s exceedingly difficult to merge out of existence. Ferguson is comparatively populous at about 21,000 people. Many of St. Louis County’s postage-stamp municipalities have fewer than 1,000 people. Champ may be the smallness champ, with a 2010 population of 13, all white.
Now, the technical answer is to find the locations of the speed traps (they’re crowd-sourced), take the driverless car’s GPS location as input, and have the software slow it down when approaching known traps. Problem solved! But not so fast. Of course, the cops will game the speed trap locations, and speeds, but the real issue is that, like it or not, the fines from speed traps are important sources of revenue for poor and black communities like Ferguson. Does Google really want to be responsible for further impoverishing (“cheating”) them — and for more shootings? Suddenly the clean boundaries of Dmitri’s system have expanded beyond the car, beyond the road, beyond the municipality, and to an entire urban area; in this case, St. Louis County, and a lot more people are involved than the drivers. Things have gotten messy. Things are no longer like Mountain View. The scope of the project is no longer clear. And since scope is the key to project success, that’s a big, big problem.
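The “technical answer” above can be sketched in a few lines, assuming a crowd-sourced list of trap coordinates (the coordinates and radius below are made-up figures, purely for illustration):

```python
import math

# Hypothetical crowd-sourced speed-trap list: (lat, lon, posted_mph).
KNOWN_TRAPS = [(38.744, -90.305, 30)]  # an invented point in St. Louis County

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS points, in miles."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def target_speed(lat, lon, cruising_mph, radius_miles=0.5):
    """Slow to the posted limit when within radius of a known trap;
    otherwise keep the (limit-exceeding) cruising speed."""
    for t_lat, t_lon, posted in KNOWN_TRAPS:
        if haversine_miles(lat, lon, t_lat, t_lon) <= radius_miles:
            return min(cruising_mph, posted)
    return cruising_mph
```

Twenty lines, problem “solved” — which is precisely the premature optimization: it treats the traps as fixed features of the road rather than moving parts of a revenue system with incentives of its own.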
Problems with no clear boundaries are often called “wicked problems”. There’s a whole literature on this topic, but this description gives the flavor:
“Wicked” problems, for which there is no definitive formulation (Rittel and Webber 1973), are unfortunately widespread in the management of social-ecological systems (Chapin et al. 2008, Jentoft and Chuenpagdee 2009, Peterson 2009)…. In the face of wicked problems, when […], discourse structured by coalitions, ideology, and social practices can take on a central role in defining policy choices …. As a result, power relations, emerging from formal institutional arrangements and informal network structures, can have an important influence on the way wicked problems of sustainability are framed and responses are defined (Chatterton and Style 2001).
In short form, we have politics to handle wicked problems; they cannot be handled algorithmically. This example is from fisheries (a common pool resource) but can be generalized:
Inspired by Rittel and Webber [Dilemmas in a general theory of planning. Policy Sciences 1973;4:155-69], it is argued that fisheries and coastal governance is confronted with problems that are inherently “wicked.” Problems are wicked (as opposed to “tame”) when they are difficult to define and delineate from other and bigger problems and when they are not solved once and for all but tend to reappear. Instead, for wicked problems governance must rely on the collective judgment of stakeholders [for example, the people of Ferguson] involved in a process that is experiential, interactive and deliberative.
Here’s a list of wicked problems that cannot be solved via “algorithmic regulation,” including common pool resources and related systems:
In recent years, Ostrom and her colleagues have wrestled with how the IAD framework and the eight design principles can be “scaled-up” and applied perhaps in modified form to global common-pool resources and related “super wicked problems” such as [climate change] and other global commons dilemmas.
That’s a lot of problems! (And in this regard, it’s very telling that O’Reilly, when questioned on how “algorithmic regulation” would handle wicked problems, evaded answering.) And since I have succeeded in moving what O’Reilly and Google seem to regard as a tractable problem (driverless cars) into the wicked problems space, I would like very much to know how many other seemingly tractable problems meant to be solved by “algorithmic regulation” and “automated intervention” are in fact wicked. My guess is 80% and rising, although that’s not at all helpful from an “Investor Storytime” standpoint.
Since it’s late, I’ll summarize briefly: I believe that “algorithmic regulation” is subject to bad and gamed data, accepts closed source code, and isn’t suitable for “automated intervention,” let alone the government of a free people. I also believe it presents problems as tractable that are in fact wicked, leading to project failure. Algorithmic regulation — from the standpoint of the social systems in which it is embedded — is prematurely optimized. It is, as I said, a horrible idea.
 No it’s not. Class warfare is the bugaboo haunting today’s politics. See this essay from Kareem Abdul-Jabbar. Regulation is a bugaboo haunting venture capitalists like O’Reilly. I know O’Reilly knows who Kareem is, but I sure hope kids these days do….
 Any time you hear “deep understanding” from a Silicon Valley venture capitalist like O’Reilly, check your bogometer; if it’s not pinned, recalibrate. Ditto for “innovative” and “disruptive.”
 Lambert blushes modestly.
 I say “remarkably” because many if not most of O’Reilly’s books support open source programs, and the languages typically used to write such programs.
 I suppose one could sue the software developer over an “automated intervention” in the form of a traffic violation, so theoretically there’s a judicial back door to O’Reilly’s “successful” system. But it’s hard enough to open that back door with human cops. How hard will it be with algorithms? I’d argue it’s so hard that it’s simpler to swallow hard and pay up, much as you would do with a cop in a corrupt, third-world country. Note, of course, that this dynamic actually incentivizes false positives, in cases where there are fines for violations of “algorithmic regulations”; “bug traps,” rather than “speed traps,” you might say.
 A permanent jobs program for programmers might be preferable to a permanent jobs program for lawyers. I’ve dealt with both; it’s a tough call.
 “But” — I hear you cry — “we will document the algorithm carefully!” But one category of bugs is a mismatch between code and documentation; it happens all the time, even in O’Reilly books. So which is controlling? The algorithm as implemented in software, or the documentation that describes the algorithm in human terms? If the documentation, aren’t we right back in the world of “manual intervention” that O’Reilly is so anxious to escape? And if not, haven’t we simply replaced “pointy headed bureaucrats” with “pointy haired bosses”?
 Here we are reminded of the $200 billion in title fees stolen from municipalities through MERS, when “automated intervention” nuked a land title system that had been “successful” for centuries.
 That would be why Google, unlike the people of Ferguson, maintains a presence on K Street, in Washington, DC.
UPDATE I just gave the post a light copy edit; “It was hard to write, so it should be hard to read.” For example, I meant to write “Silicon Valley nutball triumphalism,” not “Silicon Valley nutgall triumphalism.” Just goes to show that spell-checking is a flawed concept. Sorry.