Showing posts with label Dubious statistics. Show all posts

Thursday, August 28, 2008

GDP Release Signals Further Decline into Banana Republic Status

Last year, we put America on Banana Republic watch, and sadly, things appear to be playing out as we feared:
I'm certain you're familiar with the expression "death wish." I am beginning to wonder whether America has a banana republic wish. The country has been taking steps towards being a small-minded, elite-dominated, sham democracy.

Mind you, I am pointing to a tendency, not an established fact. The US isn't Haiti, or even Argentina. But we are moving in that direction on a variety of fronts, and the devolution seems so concerted that I wonder if there is some unconscious mass desire to give up on the messiness and ambiguity of an open society and surrender to the certainty of one with institutionalized inequality, more authoritarianism, but more predictability, and perhaps an illusion of greater security.

What triggered this line of thought? Something surprisingly minor: the April employment report,...But even this disappointing figure may have been the product of manipulation, as we will discuss in due course. And we've now had so many instances of what charitably may be called artful reporting that it's beginning to undermine my faith in government statistics. Unreliable government statistics are a Banana Republic Indicator..... the integrity of that data is becoming compromised on enough fronts so as to render them suspect. And inaccurate data leads to bad business and bad policy decisions. Bad policy decisions are particularly likely since the information is massaged so as to minimize unpleasant news.

What is remarkable is today's 2Q GDP revision. A 1.9% figure that most observers regarded as likely to be revised downward (and initial releases are often revised by significant increments) has now been revised to a simply not credible 3.3%. We'll discuss in a bit how this artwork was achieved.

Yet what is more remarkable is that a quick read of the MSM (Bloomberg, Financial Times, the Wall Street Journal, and the New York Times) reveals that no source seems willing to challenge this practice and call it for what it is, manipulation for political purposes. Some economists quoted by the MSM instead politely chose to ignore the dead body in the room and argue, essentially, that this supposed data point was irrelevant as far as the outlook was concerned. Here we see some tiptoeing around the tulips quotes:
Bloomberg: "Outside of trade, the economy is considerably weaker," said Carl Riccadonna, an economist at Deutsche Bank Securities Inc. in New York. "When you look at the spending, it looks terrible for the second half of the year."

Reuters: "This number seems to overstate the underlying strength even though exports are obviously strong," said James O'Sullivan, an economist at UBS Securities in Stamford, Connecticut.

Now of course, there are good reasons for less than a full-bore assault. First, by the time someone had made all the Freedom of Information Act filings to get enough of the supporting work to prove this number was massaged, we'd be not just into the next Administration, but into the recovery. Second, economists are supposed to be sober and analytical. Stirring controversy is not part of their job description.

Nevertheless, there were quarters in which doubts were expressed more strongly. Zubin Jelveh at Portfolio provided this quote:
RDQ Economics: "The strength of the economy in the second quarter suggested by the expenditure estimate of real GDP growth seems truly bizarre and is a product of a declining real trade gap."

Bloggers, needless to say, were less inhibited, with Barry Ritholtz, long on the bogus statistics beat, leading the charge:
GDP is out, ticking higher to 3.3% rather than 2.7%

And if you believe that data, I also have a bridge for sale in Brooklyn.

Why the beat on the headline figure? Aside from the usual inflation nonsense, there were two other factors: Exports, which rose to 13.2% (versus earlier reported 9.2%) and Inventories, which also played a part in the apparent strength.

My fishing buddy John Silvia of Wachovia put it into context:

"The overwhelming story is that the export numbers have offset this domestic weakness in consumer spending and business investment. We have a domestic recession."

Also worth noting: larger than earlier reported gains in every single government expenditure category. If you are wondering why the government does not know what it is actually spending in near real time, welcome to the club.

That boldface was mine. If that isn't suspicious, I don't know what is.

Barry, in a later post, with the help of a chart provided by Michael Panzner, found the real smoking gun: a laughable assumption for inflation. The lower the inflation assumption, the higher the GDP figure. Not only was the 1.2% chosen lower than CPI, which has itself been adjusted over time to underreport inflation so as to reduce payouts on CPI-indexed programs, most notably Social Security, but, as a commenter on Econompics noted, it constituted the biggest gap between the GDP deflator and CPI since 1980 (squinting at the chart, that seems to be accurate):



Mind you, this massaging is taking place on top of long-running adjustments that make both GDP and inflation stats questionable. Is it time to revive the 1960s expression "credibility gap"?
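
To see why the inflation assumption matters so much, here is a minimal sketch of the arithmetic. It assumes real growth is simply nominal growth deflated by the price index. The 3.3% and 1.2% figures come from the release discussed above; the alternative deflator values are purely hypothetical, picked for illustration, and are not actual CPI readings.

```python
def real_growth(nominal: float, deflator: float) -> float:
    """Real growth rate implied by nominal growth and inflation (decimals)."""
    return (1 + nominal) / (1 + deflator) - 1


def implied_nominal(real: float, deflator: float) -> float:
    """Nominal growth rate consistent with a real growth rate and a deflator."""
    return (1 + real) * (1 + deflator) - 1


reported_real = 0.033      # revised 2Q GDP growth, from the post
reported_deflator = 0.012  # inflation assumption, from the post

# Back out the nominal growth that the two reported figures imply.
nominal = implied_nominal(reported_real, reported_deflator)

# Re-deflate the same nominal growth with hypothetical inflation assumptions.
for deflator in (0.012, 0.02, 0.03, 0.04):
    print(f"deflator {deflator:.1%} -> real growth {real_growth(nominal, deflator):.2%}")
```

Holding nominal growth fixed, every extra point of assumed inflation takes roughly a point off the headline figure, which is why the choice of deflator deserves more scrutiny than it got.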

Refreshingly, some in the MSM are coming close to doing so. This story in Bloomberg, "Lagging Incomes Signal U.S. Economy Weaker Than GDP Suggests," which came out within hours of the release, discusses the disparity between incomes data and GDP without taking on the GDP report frontally. That's a step in the right direction.

From Bloomberg:
The meager gains in earnings over the last year signal the U.S. economy is in much deeper trouble than the growth estimates indicate, economists said.

Gross domestic income, or the money earned by the people, businesses and government agencies whose purchases go into calculating gross domestic product, rose 0.3 percent in the 12 months ended in June after adjusting for inflation, according to Bloomberg calculations based on today's Commerce Department growth report. GDP expanded 2.2 percent.

"The income side of the economy, with profits down for four straight quarters and employment falling, looks like a recession," said John Ryding, chief economist at RDQ Economics in New York.

Incomes last quarter grew 1.9 percent at an annual rate after adjusting for inflation, a little more than half the 3.3 percent gain posted by GDP, according to Bloomberg calculations. The figures showed incomes dropped in each of the prior two quarters.

"What you are seeing is more legitimate economic weakness in the income numbers," said James O'Sullivan, a senior economist at UBS Securities LLC in Stamford, Connecticut. "The GDI numbers raise the potential that GDP is overstating growth."

The 1.9 percentage-point difference between the GDI and GDP over the last 12 months is the biggest in the post World War II era.....

The income numbers are more in line with other figures that indicate the economy struggled from April through June. The jobless rate was 5.5 percent in June, up from 5.1 percent at the end of the first quarter, and employers cut 165,000 workers from payrolls, according to the Labor Department.

"I'm looking at the labor market, and the GDP income numbers make more sense," said Ryding. "It certainly did not feel like 3.3 percent growth."

The earnings data may more accurately predict the start of economic contractions, according to researchers at the Federal Reserve.

Income adjusted for inflation "has done a better job recognizing the start of recessions than has the growth rate of real GDP," Jeremy J. Nalewaik, a Fed economist wrote in a December 2006 report. "Placing an increased focus on GDI may be useful in assessing the current state of the economy."

While the income and growth figures should theoretically match, the different methods used in calculating the numbers prevent them from converging fully.
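
Note that "1.9" shows up twice in the Bloomberg excerpt with two different meanings: once as second-quarter income growth and once as the 12-month gap between GDP and GDI. A small sketch, using only the figures quoted above, keeps the two apart (the variable names are ours):

```python
# Figures quoted by Bloomberg above, in percent.
gdp_12m, gdi_12m = 2.2, 0.3  # growth over the 12 months ended in June
gdp_q2, gdi_q2 = 3.3, 1.9    # second-quarter growth at an annual rate

# Gaps in percentage points, rounded to the precision of the inputs.
gap_12m = round(gdp_12m - gdi_12m, 1)
gap_q2 = round(gdp_q2 - gdi_q2, 1)

# Share of the reported GDP gain that shows up on the income side.
ratio_q2 = gdi_q2 / gdp_q2

print(f"12-month gap: {gap_12m} points")
print(f"2Q gap: {gap_q2} points, GDI/GDP ratio {ratio_q2:.2f}")
```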

Wednesday, July 23, 2008

CBO Comes Close to Saying It Made Up That $25 Billion Freddie, Fannie Rescue Cost Estimate

Readers may have seen that we cast aspersions on the CBO's estimate that the Fannie and Freddie rescue program would "probably" cost taxpayers $25 billion. We had noted that the estimate was only through 2009 because that's how far the authorization extends, but there is no way that Fannie and Freddie will ever be cut loose. Thus an estimate that looked at the liability that was really being taken on, which is open-ended, would come up with considerably higher numbers. A couple of readers stressed that that is how the game is played and the CBO can only opine on bills as written. Hence, legislation is drafted with sunset provisions that everyone knows are a fiction.

Nevertheless, our original view, that the CBO lacks the expertise and resources to estimate what the downside for Fannie and Freddie might be, was confirmed by the New York Times:
The proposed government rescue of the nation’s two mortgage finance giants should appear on the federal budget as a $25 billion expense, the independent Congressional Budget Office said on Tuesday, but officials conceded that there was no way to really know what, if anything, a bailout might cost taxpayers.

The budget office said the chances were better than even that a rescue would not be needed before the end of 2009 and would not cost any money. But the office also said there was a 5 percent chance that the mortgage giants, Fannie Mae and Freddie Mac, could lose $100 billion...

The budget office, while acknowledging that the $25 billion was, at best, a rough estimate, did not explain fully how it came up with the figure. The office said it analyzed the companies’ financial statements and consulted with regulators, analysts, market participants and the companies themselves to estimate possible future losses and the amount of any cash injection that might be needed from the Treasury....

Senator Jim DeMint, Republican of South Carolina, said lawmakers were generally supportive of the overall rescue plan, but he added that he had doubts about the $25 billion estimate. “Everyone knows it’s just a wild guess,” Mr. DeMint said. “We are either going to spend zero or we’re going to spend a whole lot more than they are talking about.”
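
Senator DeMint's "zero or a whole lot more" can be made concrete with a back-of-the-envelope sketch. It rests on assumptions of ours, not the CBO's stated method: that the $25 billion is a probability-weighted average, that "better than even" means exactly 50% or a hypothetical 60%, and that the 5% tail costs exactly $100 billion. Under those assumptions, we can back out what the remaining scenarios must cost on average.

```python
def implied_middle_cost(expected: float, p_zero: float,
                        p_tail: float, tail_cost: float) -> float:
    """Average cost ($bn) of the scenarios that are neither zero-cost nor the tail.

    Solves expected = p_zero * 0 + p_tail * tail_cost + p_middle * middle_cost.
    """
    p_middle = 1 - p_zero - p_tail
    if p_middle <= 0:
        raise ValueError("zero and tail probabilities leave no middle scenarios")
    return (expected - p_tail * tail_cost) / p_middle


EXPECTED = 25.0    # $bn, headline estimate from the article
P_TAIL = 0.05      # chance of the large-loss scenario, from the article
TAIL_COST = 100.0  # $bn, assumed to be exact rather than a floor

# "Better than even" odds of no cost: 50% is the floor, 60% is hypothetical.
for p_zero in (0.50, 0.60):
    middle = implied_middle_cost(EXPECTED, p_zero, P_TAIL, TAIL_COST)
    print(f"P(no cost) = {p_zero:.0%}: other scenarios average ${middle:.1f}bn")
```

On these assumptions, the scenarios in which anything is spent at all average well above the $25 billion headline, which is consistent with the senator's reading.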

Thursday, July 17, 2008

"The End of the World as We Know It?"

In a TheStreet.com($) story with the above headline (hat tip reader Michael), Doug Kass noted that the price increase in the Financial Select Sector SPDR was 13%, an 11 standard deviation event.

According to Kass, the odds of an eleven standard deviation event are equivalent to those of the world ending – between three and four times.

We noted in an earlier post:
Reader Juan provided this quote from a presentation by Frank Veneroso last year to the World Bank:
[O]ne may ask, is it a bubble? Two years ago the noted money manager Jeremy Grantham posed this question in an interesting way. He presented a chart of the real inflation adjusted oil price going back to 1875.

He then noted: “Over the years we have asked over 2000 professionals for an exception to our claim that every asset class move of 2 sigmas away from trend had broken, and not one of the 2000 has ever offered an exception! This should be scarier than the fact that GMO has tried so hard to find one and failed. But we always have said that intellectually you can imagine a paradigm shift in an asset class price, even if we have been unable to document one yet in history. ...

Financial stocks are not an asset class, so they may tolerate larger moves before exhibiting mean reversion. But no matter how you look at it, 11 standard deviations is a pretty big move.
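
For a sense of scale, here is a minimal sketch of how improbable an 11 standard deviation day is if, and only if, daily moves follow a normal distribution. That is an assumption of the sketch, and it is precisely the assumption that fat-tail critics reject. The 252-trading-day year is a common convention, not something taken from Kass's article.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # common convention, assumed here


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (2, 5, 11):
    p = upper_tail(z)
    # Expected wait, in years, for one daily move at least this large.
    years = 1 / p / TRADING_DAYS_PER_YEAR
    print(f"{z:>2} sigma: p = {p:.3e}, roughly one day every {years:.3e} years")
```

Under the bell curve, the 11 sigma probability is on the order of 10^-28, so the expected wait is far longer than the age of the universe. Since the move did happen, the more sensible conclusion is that the normal model, not the market, is what broke.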

Sunday, June 1, 2008

Taleb's Harsh Assessment of Bankers, Economists, and the Fed

Reader Michael called to my attention a wide-ranging interview with Nassim Nicholas Taleb, author of The Black Swan and professional iconoclast, in the Times of London. The article is colorful and a bit long, so I've excerpted some of the most provocative bits. Needless to say, I am particularly taken by his dim view of academic economics as practiced in the US, which tends to place a premium on abstraction and models:
A noisy cafe in Newport Beach, California. Nassim Nicholas Taleb is eating three successive salads, carefully picking out anything with a high carbohydrate content.

He is telling me how to live. “The only way you can say ‘F*** you’ to fate is by saying it’s not going to affect how I live. So if somebody puts you to death, make sure you shave.”...

The world is random, intrinsically unknowable. “You will never,” he says, “be able to control randomness.”

To explain: black swans were discovered in Australia. Before that, any reasonable person could assume the all-swans-are-white theory was unassailable. But the sight of just one black swan detonated that theory. Every theory we have about the human world and about the future is vulnerable to the black swan, the unexpected event. We sail in fragile vessels across a raging sea of uncertainty. “The world we live in is vastly different from the world we think we live in.”

Last May, Taleb published The Black Swan: The Impact of the Highly Improbable. It said, among many other things, that most economists, and almost all bankers, are subhuman and very, very dangerous. They live in a fantasy world in which the future can be controlled by sophisticated mathematical models and elaborate risk-management systems. Bankers and economists scorned and raged at Taleb. He didn’t understand, they said. A few months later, the full global implications of the sub-prime-driven credit crunch became clear. The world banking system still teeters on the edge of meltdown. Taleb had been vindicated. “It was my greatest vindication. But to me that wasn’t a black swan; it was a white swan. I knew it would happen and I said so. It was a black swan to Ben Bernanke [the chairman of the Federal Reserve]. I wouldn’t use him to drive my car. These guys are dangerous. They’re not qualified in their own field.”

In December he lectured bankers at Société Générale, France’s second biggest bank. He told them they were sitting on a mountain of risks – a menagerie of black swans. They didn’t believe him. Six weeks later the rogue trader and black swan Jérôme Kerviel landed them with $7.2 billion of losses.

As a result, Taleb is now the hottest thinker in the world. He has a $4m advance on his next book. He gives about 30 presentations a year to bankers, economists, traders, even to Nasa, the US Fire Administration and the Department of Homeland Security. But he doesn’t tell them what to do – he doesn’t know. He just tells them how the world is. “I’m not a guru. I’m just describing a problem and saying, ‘You deal with it.’”...

He has rules. In California he hires bikes, not cars. He doesn’t usually carry his BlackBerry because he hates distraction and he really hates phone charges. But he does carry an Apple laptop everywhere and constantly uses it to illustrate complex points and seek out references. He says he answers every e-mail. He is sent thousands. He reads for 60 hours a week, but almost never a newspaper, and he never watches television.

“If something is going on, I hear about it. I like to talk to people, I socialise. Television is a waste of time. Human contact is what matters.”...

Startlingly, this great sceptic, this non-guru who believes in nothing, is still a practising Christian. He regards with some contempt the militant atheism movement led by Richard Dawkins.

“Scientists don’t know what they are talking about when they talk about religion. Religion has nothing to do with belief, and I don’t believe it has any negative impact on people’s lives outside of intolerance. Why do I go to church? It’s like asking, why did you marry that woman? You make up reasons, but it’s probably just smell. I love the smell of candles. It’s an aesthetic thing.”

Take away religion, he says, and people start believing in nationalism, which has killed far more people. Religion is also a good way of handling uncertainty. It lowers blood pressure. He’s convinced that religious people take fewer financial risks...

But, crucially, he also learnt from a very early age that grown-ups have a dodgy grasp of probability...For the non-mathematician, probability is an indecipherably complex field. But Taleb makes it easy by proving all the mathematics wrong. Let me introduce you to Brooklyn-born Fat Tony and academically inclined Dr John, two of Taleb’s creations. You toss a coin 40 times and it comes up heads every time. What is the chance of it coming up heads the 41st time? Dr John gives the answer drummed into the heads of every statistic student: 50/50. Fat Tony shakes his head and says the chances are no more than 1%. “You are either full of crap,” he says, “or a pure sucker to buy that 50% business. The coin gotta be loaded.”

The chances of a coin coming up heads 41 times are so small as to be effectively impossible in this universe. It is far, far more likely that somebody is cheating. Fat Tony wins. Dr John is the sucker. And the one thing that drives Taleb more than anything else is the determination not to be a sucker. Dr John is the economist or banker who thinks he can manage risk through mathematics. Fat Tony relies only on what happens in the real world.

In 1985, Taleb discovered how he could play Fat Tony in the markets. France, Germany, Japan, Britain and America signed an agreement to push down the value of the dollar. Taleb was working as an options trader at a French bank. He held options that had cost him almost nothing and that bet on the dollar’s decline. Suddenly they were worth a fortune. He became obsessed with buying “out of the money” options. He had realised that when markets rise they tend to rise by small amounts, but when they fall – usually hit by a black swan – they fall a long way.

The big payoff came on October 19, 1987 – Black Monday. It was the biggest market drop in modern history. “That had vastly more influence on my thought than any other event in history.”

It was a huge black swan – nobody had expected it, not even Taleb. But the point was, he was ready. He was sitting on a pile of out-of-the-money eurodollar options. So, while others were considering suicide, Taleb was sitting on profits of $35m to $40m. He had what he calls his “f***-off money”, money that would allow him to walk away from any job and support him in his long-term desire to be a writer and philosopher.

He stayed on Wall Street until he got bored and moved to Chicago to become a trader in the pit, the open-outcry market run by the world’s most sceptical people, all Fat Tonys. This he understood.....

In the midst of this came his purest vindication prior to sub-prime. Long-Term Capital Management was a hedge fund set up in 1994 by, among others, Myron Scholes and Robert C Merton, joint winners of the 1997 Nobel prize in economics. It had the grandest of all possible credentials and used the most sophisticated academic theories of portfolio management. It went bust in 1998 and, because it had positions worth $1.25 trillion outstanding, it almost took the financial system down with it. Modern portfolio theory had not accounted for the black swan, the Russian financial crisis of that year. Taleb regards the Nobel prize in economics as a disgrace, a laughable endorsement of the worst kind of Dr John economics. Fat Tony should get the Nobel, but he’s too smart. “People say to me, ‘If economists are so incompetent, why do people listen to them?’ I say, ‘They don’t listen, they’re just teaching birds how to fly.’ ”....

And what he knows does not sound good. The sub-prime crisis is not over and could get worse. Even if the US economy survives this one, it will remain a mountain of risk and delusion. “America is the greatest financial risk you can think of.”

Its primary problem is that both banks and government are staffed by academic economists running their deluded models. Britain and Europe have better prospects because our economists tend to be more pragmatic, adapting to conditions rather than following models. But still we are dependent on American folly.

The central point is that we have created a world we don’t understand. There’s a place he calls Mediocristan. This was where early humans lived. Most events happened within a narrow range of probabilities – within the bell-curve distribution still taught to statistics students. But we don’t live there any more. We live in Extremistan, where black swans proliferate, winners tend to take all and the rest get nothing – there’s Bill Gates, Steve Jobs and a lot of software writers living in a garage, there’s Domingo and a thousand opera singers working in Starbucks. Our systems are complex but over-efficient. They have no redundancy, so a black swan strikes everybody at once. The banking system is the worst of all.

“Complex systems don’t allow for slack and everybody protects that system. The banking system doesn’t have that slack. In a normal ecology, banks go bankrupt every day. But in a complex system there is a tendency to cluster around powerful units. Every bank becomes the same bank so they can all go bust together.”

He points out, chillingly, that banks make money from two sources. They take interest on our current accounts and charge us for services. This is easy, safe money. But they also take risks, big risks, with the whole panoply of loans, mortgages, derivatives and any other weird scam they can dream up. “Banks have never made a penny out of this, not a penny. They do well for a while and then lose it all in a big crash.”

On top of that, Taleb has shown that increased economic concentration has raised our vulnerability to natural disasters. The Kobe earthquake of 1995 cost a lot more than the Tokyo earthquake of 1923. And there are countless other ways in which we have built a world ruled by black swans – some good but mostly bad. So what do we do as individuals and the world? In the case of the world, Taleb doesn’t know. He doesn’t make predictions, he insults people paid to do so by telling them to get another job. All forecasts about the oil price, for example, are always wrong, though people keep doing it. But he knows how the world will end.

“Governments and policy makers don’t understand the world in which we live, so if somebody is going to destroy the world, it is the Bank of England saving Northern Rock. The biggest danger to human society comes from civil servants in an environment like this. In their attempt to control the ecology, they don’t understand that the link between action and consequences can be more vicious. Civil servants say they need to make forecasts, but it’s totally irresponsible to make people rely on you without telling them you’re incompetent.”

Bear Stearns – the US Northern Rock – was another vindication for Taleb. He’s always said that whatever deal you do, you always end up dealing with J P Morgan. It was JPM that picked up Bear at a bargain-basement price. Banks should be more like New York restaurants. They come and go but the restaurant business as a whole survives and thrives and the food gets better. Banks fail but bankers still get millions in bonuses for applying their useless models. Restaurants tinker, they work by trial and error and watch real results in the real world. Taleb believes in tinkering – it was to be the title of his next book. Trial and error will save us from ourselves because they capture benign black swans. Look at the three big inventions of our time: lasers, computers and the internet. They were all produced by tinkering and none of them ended up doing what their inventors intended them to do. All were black swans. The big hope for the world is that, as we tinker, we have a capacity for choosing the best outcomes.

“We have the ability to identify our mistakes eventually better than average; that’s what saves us.” We choose the iPod over the Walkman. Medicine improved exponentially when the tinkering barber surgeons took over from the high theorists. They just went with what worked, irrespective of why it worked. Our sense of the good tinker is not infallible, but it might be just enough to turn away from the apocalypse that now threatens Extremistan.

He also wants to see diplomats dying of cirrhosis of the liver. It means they’re talking and drinking and not going to war. Parties are among the great good things in Taleb’s world.

And you and me? Well, the good investment strategy is to put 90% of your money in the safest possible government securities and the remaining 10% in a large number of high-risk ventures. This insulates you from bad black swans and exposes you to the possibility of good ones. Your smallest investment could go “convex” – explode – and make you rich. High-tech companies are the best. The downside risk is low if you get in at the start and the upside very high. Banks are the worst – all the risk is downside. Don’t be tempted to play the stock market – “If people knew the risks they’d never invest.”

There’s much more to Taleb’s view of the world than that. He is reluctant to talk about matters of human nature, ethics or any of the traditional concerns of philosophy because he says he hasn’t read enough. But, when pressed, he comes alive.

“You have to worry about things you can do something about. I worry about people not being there and I want to make them aware.” We should be mistrustful of knowledge. It is bad for us. Give a bookie 10 pieces of information about a race and he’ll pick his horses. Give him 50 and his picks will be no better, but he will, fatally, be more confident.

We should be ecologically conservative – global warming may or may not be happening but why pollute the planet? – and probabilistically conservative. The latter, however, has its limits. Nobody, not even Taleb, can live the sceptical life all the time – “It’s an art, it’s hard work.” So he doesn’t worry about crossing the road and doesn’t lock his front door – “I can’t start getting paranoid about that stuff.” His wife locks it, however.

He believes in aristocratic – though not, he insists, elitist – values: elegance of manner and mind, grace under pressure, which is why you must shave before being executed. He believes in the Mediterranean way of talking and listening. One piece of advice he gives everybody is: go to lots of parties and listen, you might learn something by exposing yourself to black swans.

I ask him what he thinks are the primary human virtues, and eventually he comes up with magnanimity – punish your enemies but don’t bear grudges; compassion – fairness always trumps efficiency; courage – very few people have this; and tenacity – tinker until it works for you.

“Let’s be human the way we are human. Homo sum – I am a man. Don’t accept any Olympian view of man and you will do better in society.”...

Taleb's top life tips

1 Scepticism is effortful and costly. It is better to be sceptical about matters of large consequences, and be imperfect, foolish and human in the small and the aesthetic.

2 Go to parties. You can’t even start to know what you may find on the envelope of serendipity. If you suffer from agoraphobia, send colleagues.

3 It’s not a good idea to take a forecast from someone wearing a tie. If possible, tease people who take themselves and their knowledge too seriously.

4 Wear your best for your execution and stand dignified. Your last recourse against randomness is how you act — if you can’t control outcomes, you can control the elegance of your behaviour. You will always have the last word.

5 Don’t disturb complicated systems that have been around for a very long time. We don’t understand their logic. Don’t pollute the planet. Leave it the way we found it, regardless of scientific ‘evidence’.

6 Learn to fail with pride — and do so fast and cleanly. Maximise trial and error — by mastering the error part.

7 Avoid losers. If you hear someone use the words ‘impossible’, ‘never’, ‘too difficult’ too often, drop him or her from your social network. Never take ‘no’ for an answer (conversely, take most ‘yeses’ as ‘most probably’).

8 Don’t read newspapers for the news (just for the gossip and, of course, profiles of authors). The best filter to know if the news matters is if you hear it in cafes, restaurants... or (again) parties.

9 Hard work will get you a professorship or a BMW. You need both work and luck for a Booker, a Nobel or a private jet.

10 Answer e-mails from junior people before more senior ones. Junior people have further to go and tend to remember who slighted them.
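
The Fat Tony versus Dr John coin toss in the interview above is, at bottom, a Bayesian argument, and a minimal sketch shows how lopsided it is. The assumptions are ours, not Taleb's or the article's: the prior chance of a rigged coin is a hypothetical one in a million, and a rigged coin always lands heads.

```python
def posterior_loaded(prior_loaded: float, heads_in_a_row: int) -> float:
    """P(coin is loaded | observed run of heads), by Bayes' rule.

    Assumes a loaded coin always lands heads and a fair coin lands heads
    half the time.
    """
    like_loaded = 1.0
    like_fair = 0.5 ** heads_in_a_row
    numerator = prior_loaded * like_loaded
    return numerator / (numerator + (1 - prior_loaded) * like_fair)


def prob_next_heads(prior_loaded: float, heads_in_a_row: int) -> float:
    """Chance the next toss is heads, averaging over fair and loaded coins."""
    post = posterior_loaded(prior_loaded, heads_in_a_row)
    return post * 1.0 + (1 - post) * 0.5


PRIOR = 1e-6  # hypothetical: one coin in a million is rigged

print(f"P(40 heads | fair coin) = {0.5 ** 40:.3e}")
print(f"P(loaded | 40 heads)    = {posterior_loaded(PRIOR, 40):.7f}")
print(f"P(heads on toss 41)     = {prob_next_heads(PRIOR, 40):.7f}")
```

Even with a prior that heavily favors an honest coin, 40 straight heads leaves almost no room for the fair-coin hypothesis. Dr John's 50/50 is only right if the model is taken as given, which is the point of the story.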

Thursday, May 8, 2008

"Blame the Models"

One of our pet interests has been how the use of mathematics and models can unwittingly enable people to fool themselves. We see this regularly when working on deals. The model for the target business' performance somehow becomes more real than the company. When the numbers don't work, if you can come up with a good sounding rationale for tweaking them, presto! Suddenly everything is hunky-dory. No wonder over 60% (some studies say as high as 75%) of all deals fail.

Our colleague Susan Webber, in an article about the corporate obsession with metrics, made some pertinent observations:
Metrics presuppose that situations are orderly, predictable, and rational. When that tenet collides with situations that are chaotic, nonlinear, and subject to the force of personalities, that faith—the belief in the sanctity of numbers—often trumps seemingly irrefutable facts. At that point, the addiction begins to have real-world consequences. Business managers must recognize the limitations of metrics.

Mind you, I’m not arguing that metrics and measurement are inherently bad things. To note just one example, a well-structured performance measurement system is essential to the well-being of large enterprises. But quantitative measures can be and frequently are used naively. It’s all too easy to abdicate judgment to the output of a model or scorecard.

Jon Danielsson at VoxEU takes this viewpoint further in an article that discusses a pervasive cognitive dissonance among trading operations and their regulators. They know that statistical models have major shortcomings, particularly in underestimating the odds of catastrophic losses, which is precisely what they are supposed to help avoid. While the conventional response has been to try to devise better models, Danielsson argues that that line of thinking is wrongheaded.

Danielsson makes a fundamental point: what matters is management; the models are secondary. For reasons I cannot fathom (perhaps the rise of the PC and the ease of slicing and dicing data), qualitative assessments are seen as inferior to quantitative ones. But for a regulator to understand the robustness of a company's management practices requires more scrutiny than has been fashionable of late. And it also requires better regulators.

From VoxEU:
In response to financial turmoil, supervisors are demanding more risk calculations. But model-driven mispricing produced the crisis, and risk models don’t perform during crisis conditions. The belief that a really complicated statistical model must be right is merely foolish sophistication.


A well-known American economist, drafted during World War II to work in the US Army meteorological service in England, got a phone call from a general in May 1944 asking for the weather forecast for Normandy in early June. The economist replied that it was impossible to forecast weather that far into the future. The general wholeheartedly agreed but nevertheless needed the number now for planning purposes.

Similar logic lies at the heart of the current crisis

Statistical modelling increasingly drives decision-making in the financial system while at the same time significant questions remain about model reliability and whether market participants trust these models. If we ask practitioners, regulators, or academics what they think of the quality of the statistical models underpinning pricing and risk analysis, their response is frequently negative. At the same time, many of these same individuals have no qualms about an ever-increasing use of models, not only for internal risk control but especially for the assessment of systemic risk and therefore the regulation of financial institutions. To have numbers seems to be more important than whether the numbers are reliable. This is a paradox. How can we simultaneously mistrust models and advocate their use?.....

Underpinning this whole process is a view that sophistication implies quality: a really complicated statistical model must be right. That might be true if the laws of physics were akin to the statistical laws of finance. However finance is not physics, it is more complex, see e.g. Danielsson (2002).

In physics the phenomenon being measured does not generally change with measurement. In finance that is not true. Financial modelling changes the statistical laws governing the financial system in real time. The reason is that market participants react to measurements and therefore change the underlying statistical processes. The modellers are always playing catch-up with each other. This becomes especially pronounced when the financial system gets into a crisis.
This is a phenomenon we call endogenous risk, which emphasises the importance of interactions between institutions in determining market outcomes. Day-to-day, when everything is calm, we can ignore endogenous risk. In crisis, we cannot. And that is when the models fail.

This does not mean that models are without merits. On the contrary, they have a valuable use in the internal risk management processes of financial institutions, where the focus is on relatively frequent small events. The reliability of models designed for such purposes is readily assessed by a technique called backtesting, which is fundamental to the risk management process and is a key component in the Basel Accords.

Most models used to assess the probability of small frequent events can also be used to forecast the probability of large infrequent events. However, such extrapolation is inappropriate. Not only are the models calibrated and tested with particular events in mind, but it is impossible to tailor model quality to large infrequent events or to assess the quality of such forecasts.

Taken to the extreme, I have seen banks required to calculate the risk of annual losses once every thousand years, the so-called 99.9% annual losses. However, the fact that we can get such numbers does not mean the numbers mean anything. The problem is that we cannot backtest at such extreme frequencies. Similar arguments apply to many other calculations such as expected shortfall or tail value-at-risk. Fundamental to the scientific process is verification, in our case backtesting. Neither the 99.9% models, nor most tail value-at-risk models can be backtested and therefore cannot be considered scientific.

We do however see increasing demands from supervisors for exactly the calculation of such numbers as a response to the crisis. Of course the underlying motivation is the worthwhile goal of trying to quantify financial stability and systemic risk. However, exploiting the banks’ internal models for this purpose is not the right way to do it. The internal models were not designed with this in mind and to do this calculation is a drain on the banks’ risk management resources. It is the lazy way out. If we don't understand how the system works, generating numbers may give us comfort. But the numbers do not imply understanding.

Indeed, the current crisis took everybody by surprise in spite of all the sophisticated models, all the stress testing, and all the numbers. I think the primary lesson from the crisis is that the financial institutions that had a good handle on liquidity risk management came out best. It was management and internal processes that mattered – not model quality. Indeed, the problem created by the conduits cannot be solved by models, but the problem could have been prevented by better management and especially better regulations.

With these facts increasingly understood, it is incomprehensible to me why supervisors are increasingly advocating the use of models in assessing the risk of individual institutions and financial stability. If model-driven mispricing enabled the crisis to happen, what makes us believe that the future models will be any better?

Therefore one of the most important lessons from the crisis has been the exposure of the unreliability of models and the importance of management. The view frequently expressed by supervisors that the solution to a problem like the subprime crisis is Basel II is not really true. The reason is that Basel II is based on modelling. What is missing is for the supervisors and the central banks to understand the products being traded in the markets and have an idea of the magnitude, potential for systemic risk, and interactions between institutions and endogenous risk, coupled with a willingness to act when necessary. In this crisis the key problem lies with bank supervision and central banking, as well as the banks themselves.
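
To put rough numbers on Danielsson's backtesting point, here is a minimal sketch in Python. It assumes that breaches of a risk limit are independent and that the model is correct, so the expected number of breaches is simply the number of observations times the tail probability. The function names, the 250-day trading year, the 50 years of history, and the target of ten breaches are my own illustrative choices, not anything from the VoxEU article.

```python
def expected_breaches(n_obs, tail_prob):
    """Expected number of losses beyond the model's threshold in n_obs periods."""
    return n_obs * tail_prob


def periods_needed(target_breaches, tail_prob):
    """Periods of data needed before we expect to see target_breaches breaches."""
    return target_breaches / tail_prob


def prob_no_breach(n_obs, tail_prob):
    """Chance of seeing zero breaches in n_obs periods if the model is right."""
    return (1.0 - tail_prob) ** n_obs


# Daily 99% measure, one year of roughly 250 trading days (assumed).
daily = expected_breaches(250, 0.01)

# Annual 99.9% measure, 50 years of history (a generous assumption).
annual = expected_breaches(50, 0.001)

# Years of data needed to expect ten breaches of the annual 99.9% level.
# Ten is an arbitrary illustrative target.
years = periods_needed(10, 0.001)

print(f"daily 99%: {daily:.2f} expected breaches per year")
print(f"annual 99.9%: {annual:.2f} expected breaches in 50 years")
print(f"years needed for ten breaches: {years:,.0f}")
print(f"chance of no breach in 50 years: {prob_no_breach(50, 0.001):.3f}")
```

Under these assumptions a daily 99% model should be breached two or three times a year, which is enough to check it against experience. The annual 99.9% number would need on the order of ten thousand years of data for the same test, and a correct model and a badly wrong one would both most likely show no breach at all in 50 years.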

Sunday, April 27, 2008

Does Measuring Service Productivity Lead Us Astray?

In "Japan may be rigid but it is not inefficient," David Pilling takes issue with metrics that find Japan's service economy to be woefully inefficient. The commonly used yardstick is labor productivity, and Japan allegedly scores badly due to its tendency to have high staff ratios (for instance, those ladies in hotels who walk you to the elevator and bow).

Yet Pilling finds that the figures lead to patently nonsensical conclusions, for instance, that Japan's rail and medical systems are less efficient than those of the US. The article discusses some of the reasons that cross-country comparisons are unreliable. I wish the piece had gone into more detail here, because the discussion made it sound as if it was conflating several notions that needed to be picked apart: difficulty in standardizing for quality of output (the extraordinary reliability of Japan's trains is clearly worth more in a very tangible sense than erratic service elsewhere) and social benefit (having a store within a short walk is a wonderful convenience and saves on gas too).

While academics are likely to focus on trying to find better ways to normalize for the quality of results, the notion of broader benefits, or positive externalities, is largely ignored. And that is a much bigger deal than is commonly acknowledged.

As most businesses know, what gets measured gets attention. Even if management isn't sure yet whether it matters, if employees learn that the bosses are now clocking, say, the length of their average phone call, it will affect behavior.

Even though positive and negative externalities are well-accepted economic concepts, far less effort has gone into trying to measure them than, say, unemployment or poverty. Admittedly, some of these by nature are easier to capture, but how easy is it to measure GDP in the absence of a large data-gathering and statistical apparatus? As our experience with CO2 emissions (and historically, other pollutants) has shown us, negative externalities can literally have global impact, yet they have gotten short shrift since identifying and costing them would no doubt heighten and sharpen some simmering debates. Similarly, failing to capture the benefit of positive externalities can lead to misguided policies. But as in the old joke about the drunk looking for his keys under the street light, most people prefer to look only where it is convenient.

From the Financial Times:
It is the men with red-glowing Darth Vader nightsticks who provoke particular scorn. These are the people, employed by Japanese construction companies, who stand by roadworks or building sites, waving pedestrians and traffic out of harm’s way. So vital is their function that sometimes they are replaced by plastic cut-outs.

Then there are the elevator ladies, with their doll-like mannerisms, who press the lift buttons, the shop assistants who work in pairs, and the hotel attendants with so much time on their hands they physically walk guests to the lavatory or cigar bar.

These are the examples regularly invoked to illustrate Japan’s supposed service-sector failings. While Japanese manufacturing is held up as world class, its service sector, which accounts for 70 per cent of output, is regularly lampooned as being years behind the efficiencies achieved in the US and even sleepy Europe.

The latest Organisation for Economic Co-operation and Development report on Japan, released this month, treads familiar ground. It states that “boosting productivity in the service sector is a key priority for promoting long-term growth” as the workforce ages and shrinks. While manufacturing labour productivity per hour increased from 1999 to 2004 by 4 per cent annually, keeping pace with the US, it notes, service-sector productivity lagged behind badly, rising just 0.9 per cent.

There is a problem with such analysis. You need only to read, in a previous finding, that Japan’s transport system is 30 per cent less efficient than that of the US to smell a rat. Common sense tells you that passenger transport is vastly superior in Japan, where tens of millions of people are moved daily at reasonable cost. The Shinkansen bullet train, for example, with 300 daily services between Tokyo and Osaka, makes the 552km journey in 2½ hours with an average delay measured in seconds.

Japan’s health service is also regularly portrayed as inefficient. Patients languish in hospital beds far longer than they would in the US. Yet, according to government statistics, Japan spends 7.9 per cent of gross domestic product on healthcare against 15.2 per cent in the US. Life expectancy in Japan, admittedly a crude measure of healthcare quality, is 79 for men and 86 for women, respectively four and six years higher than in the US.

Clearly something is wrong with the methodology. Kyoji Fukao, professor at Hitotsubashi University’s Economic Research Institute, thinks so too. The team he heads provides much of the Japanese data that go into international comparisons. He argues that the usual measures of service-sector efficiency – value added per man hour and total factor productivity, which incorporates capital and labour inputs – are crude and hard to compare across borders.

He cites Japan’s retail sector, regularly branded as inefficient. The basic measure of retail-sector productivity is how much of a product an employee can shift in an hour. On this measure, Germany does well. That turns out to be because of restricted opening hours, which oblige customers to make hefty purchases in one go. Japan does badly. Cavernous US superstores do better than cramped noodle or tofu shops. Japan also has a dense network of convenience stores on almost every city block, open 24 hours, allowing people to shop whenever they want. This makes them inefficient, since purchases are less concentrated.

No allowance is made, either, for the fact that Japanese shops tend to be within walking or, at most, cycling distance. Figures do not capture the inconvenience of having to travel, or the externalities associated with long shopping expeditions: traffic accidents, pollution, road maintenance.

To complicate matters further, a European-funded study that compares productivity internationally, called EU KLEMS, throws up odd results. According to data released in March 2007 on productivity in Japan’s wholesale, retail and transport sector, labour productivity grew just 0.5 per cent annually from 1995 to 2004. The same study, released a year later, showed productivity clipping along at 2.1 per cent from 1995 to 2005, considerably better than Europe’s.

This is not to claim that Japan’s service sector is beyond reproach. Labour market rigidities make it harder to shift people from unproductive parts of the economy to more productive ones. Perhaps the men with nightsticks should be working as hospital nurses, who are thin on the ground. When it comes to information technology, US companies change their working practices to fit new computer programmes, often pruning staff in the process. Japanese companies have IT programmes designed to suit existing working practices and employee numbers.

Some Japanese services are too expensive. The tendency to favour producers over consumers damages everybody. High harbour charges have pushed transit freight to cheaper ports in China. Exorbitant airport landing fees make it virtually impossible for low-cost airlines to operate. The fourth most expensive electricity charges among OECD countries raise the cost of doing business.

Japan should tilt the balance back towards consumers by fostering competition, encouraging foreign investment and removing barriers to new entrants. Better, cheaper services could stimulate domestic demand, reducing Japan’s export dependence. All that is true. But it is not the same as making near-meaningless comparisons about productivity. When someone tells you Japan’s transport system is inefficient, think Amtrak.

Wednesday, April 16, 2008

Stressed Banks Underreporting Libor Rates

The Wall Street Journal reports on another sign of how bad the credit crunch has gotten: banks fudging on what they are reporting as their short-term cost of interbank borrowing, out of fear of revealing how stressed they are. So the Libor becomes less useful as a guide. That in turn means that the so-called TED spread (the difference between three month Libor and ninety-day Treasuries), which is one of the preferred measures of stress in the interbank markets, is understated by as much as 30 basis points. So just imagine what this chart looks like if you add that amount to, say, the 2008 data points (chart courtesy The Financial Ninja). It goes from ugly to uglier:



The Journal mentions another consequence, that Libor-indexed borrowers are getting a better rate than they deserve. But it misses an implication that is ultimately more serious: as more and more statistics and benchmarks come into doubt, it creates uncertainty and undermines planning, which in turn is a deterrent to investment.

I saw this when working briefly in Mexico in 1984. The local McKinsey office confirmed that there was no reliable data in the entire economy. As one colleague noted, "We do a lot based on feelings." It made the US premium for equity-related investing (10-15% over the local equity premium) seem entirely logical. Similarly, marked inflation also corrodes the usefulness of quantitative information.

From the Journal:
In a development that has implications for borrowers everywhere, from Russian oil producers to homeowners in Detroit, bankers and traders are expressing concerns that the London inter-bank offered rate, known as Libor, is becoming unreliable...

Some banks don't want to report the high rates they're paying for short-term loans because they don't want to tip off the market that they're desperate for cash. The Libor system depends on banks to tell the truth about their borrowing rates....

No specific evidence has emerged that banks have provided false information about borrowing rates, and it's possible that declines in lending volumes are making some Libor averages less reliable. But bankers and other market participants have quietly expressed concerns to the British Bankers' Association....

Questions about Libor were raised as far back as November... In a recent report, two economists at the Bank for International Settlements, a sort of central bank for central bankers, also expressed concerns that banks might report inaccurate rate quotes.....

In a recent research report on potential problems with Libor, Scott Peng, an interest-rate strategist at Citigroup Inc. in New York, wrote that "the long-term psychological and economic impacts this could have on the financial market are incalculable." Mr. Peng estimates that if banks provided accurate data about their borrowing costs, three-month Libor would be higher by as much as 0.3 percentage points....

Libor has become such a fixture in credit markets that many people trust it implicitly. Concerns about its reliability are "actually kind of frightening if you really sit and think about it," says Chris Freemott, a Naperville, Ill., mortgage banker who depends on Libor to tell him how much his firm, All America Mortgage Corp., owes First Tennessee bank for a credit line that he uses to make loans.....

Today, Libor rates are set for 15 different loan durations -- from overnight to one year -- and in 10 currencies, including the pound, the dollar, the euro and the Swedish krona. They serve as the basis for payments on trillions of dollars in corporate loans, mortgages and student loans. Libor rates are also used to set the terms of more than $500 trillion in "derivatives" contracts such as interest-rate swaps, which companies all over the world, including U.S. mortgage guarantors Fannie Mae and Freddie Mac, use to protect themselves against sudden shifts in the difference between long-term and short-term interest rates.....

.... jitters have made many banks unwilling to extend loans to each other for more than one week. As a result, the rates they quote for loans of three months or more are often speculative, because there's little to no actual lending for that time period, brokers say....

In one sign of increasing concern about Libor, traders and banks are considering using other benchmarks to calculate interest rates, according to several traders. Among the candidates: rates set by central banks for loans, and rates on so-called repurchase agreements, under which borrowers provide banks with securities as collateral for short-term loans.

In a report published in March by the Bank for International Settlements, economists Jacob Gyntelberg and Philip Wooldridge raised concerns that banks might report incorrect rate information. The report said that banks might have an incentive to provide false rates to profit from derivatives transactions. The report said that although the practice of throwing out the lowest and highest groups of quotes is likely to curb manipulation, Libor rates can still "be manipulated if contributor banks collude or if a sufficient number change their behaviour."
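
The BIS point about trimming is easy to see with a toy calculation. The sketch below is in Python; the eight-bank panel, the quotes, the Treasury yield, and the rule of discarding the two highest and two lowest quotes are all invented for illustration and are not the actual BBA panel or procedure. It compares one bank shading its quote with every bank shading by the 30 basis points Peng estimates.

```python
def trimmed_fixing(quotes_bp, trim):
    """Average the quotes left after dropping the `trim` lowest and highest."""
    if len(quotes_bp) <= 2 * trim:
        raise ValueError("too few quotes to trim")
    kept = sorted(quotes_bp)[trim:len(quotes_bp) - trim]
    return sum(kept) / len(kept)


# Hypothetical true three-month borrowing costs, in basis points.
true_costs = [270, 272, 274, 275, 276, 278, 280, 290]
TRIM = 2            # assumed: drop the two lowest and two highest quotes
TREASURY_BP = 130   # assumed three-month Treasury yield, in basis points

honest = trimmed_fixing(true_costs, TRIM)

# Case 1: only the most stressed bank lowballs, quoting 260 instead of 290.
one_liar = trimmed_fixing(true_costs[:-1] + [260], TRIM)

# Case 2: every bank shades its quote down by 30 basis points.
all_shade = trimmed_fixing([q - 30 for q in true_costs], TRIM)

for label, fixing in [("honest", honest), ("one bank lowballs", one_liar),
                      ("all banks shade", all_shade)]:
    print(f"{label}: fixing {fixing:.2f} bp, TED spread {fixing - TREASURY_BP:.2f} bp")
```

In this toy panel a single low quote barely moves the fixing, because it just swaps places with another quote inside the trimmed range. Uniform shading passes through one for one, to the fixing and to the TED spread alike, which is the scenario the trimming rule cannot catch.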

Saturday, April 12, 2008

Quelle Surprise! Unemployment Stats Don't Capture Joblessness

The New York Times has finally deigned to report on the fact that the Bureau of Labor Statistics' unemployment rate (aka "headline unemployment") does an incomplete job of capturing the proportion of the population out of work.

The article by Floyd Norris, "Many More Are Jobless Than Are Unemployed," is less than complete. Despite its professed objective of shedding light on how the official unemployment releases understate the extent of inability to find work, the article curiously gives short shrift to explaining how employment data is captured. Nevertheless, it does provide a very useful chart that compares what the Times calls the jobless rate, which is the proportion of the population without jobs, with the published unemployment rate:



In the top chart, it's not hard to see the gap between the two lines. Not only has it widened since 1982, but the unemployment rate trends broadly downward while the jobless rate rises. Yet the MSM has chosen to ignore how headline unemployment paints a flattering picture of the labor market until consumer disillusionment with the economy has become acute.

From the New York Times:
The unemployment rate is low. The jobless rate is high....

Men in the prime of their working lives are now less likely to have jobs than they were during all but one recession of the last 60 years. Most of them do not qualify as unemployed, but they are nonetheless without jobs.

The unemployment rate paints a less gloomy picture. Among men ages 25 to 54 — a range that starts after most people finish their education and ends well before most people retire — the unemployment rate is 4.1 percent. That is not especially low, but it is well below the peak rate in all but one post-World War II recession.

Only people without jobs who are actively looking for work qualify as unemployed in the computation of that rate. It does not count people who are not looking for work, whether or not they would like to have a job....

In the latest report, for March, the Labor Department reported the jobless rate — also called the “not employed rate” by some — at 13.1 percent for men in the prime age group. Only once during a post-World War II recession did the rate ever get that high. It hit 13.3 percent in June 1982, the 12th month of the brutal 1981-82 recession, and continued to rise from there...

As can be seen in the accompanying chart, there has been a long-term decline in the proportion of prime-age men with jobs. That decline has been masked by rises in the number of older people with jobs and by a steady rise in the proportion of women working outside the home. But even among women there has been some slippage. The proportion of women ages 25 to 54 without jobs was 27.4 percent in March, a figure that is higher than it was during all but one month of the 2001 recession.

The negative trend can also be seen in the other chart, which shows the annual change in the number of working men in the 25 to 54 age range, using a three-month moving average to smooth the figures.

In the last half-century, that figure has turned negative only after recessions have been going on for at least a few months, although it has often stayed negative well after the recession officially ended. The lags have ranged from four months after the start of the 1960-61 and 2001 recessions, to 15 months after the beginning of the 1973-75 downturn, with an average lag of eight months. This year, the figure turned negative in January.
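
The difference between the two rates is purely a matter of denominators and of who counts, which a few lines of Python make plain. The head counts below are hypothetical, chosen only so that a group of 1,000 prime-age men reproduces the 4.1 percent and 13.1 percent figures quoted by the Times; they are not Labor Department data.

```python
def unemployment_rate(employed, looking):
    """Jobless people actively seeking work, as a share of the labor force."""
    return looking / (employed + looking)


def jobless_rate(employed, population):
    """Everyone without a job, as a share of the whole group."""
    return (population - employed) / population


# Hypothetical group of 1,000 men ages 25 to 54.
employed = 869
looking = 37        # without a job and actively looking
not_looking = 94    # without a job and not looking, for whatever reason
population = employed + looking + not_looking

print(f"unemployment rate: {unemployment_rate(employed, looking):.1%}")
print(f"jobless rate: {jobless_rate(employed, population):.1%}")
```

The 94 men who are not looking drop out of both the numerator and the denominator of the headline rate, which is why it can stay low while the share of men without jobs climbs.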

Thursday, March 13, 2008

"Is US Inflation at 8%?"

Wolfgang Munchau, who writes for the Financial Times as well as the blog Eurointelligence, ruminates about inflation statistics and argues that economists and statisticians may be going down the wrong path in dismissing consumers' subjective perceptions. He also has considerable doubts about hedonic adjustments (basically, the methodology for adjusting for the fact that computers and other devices have become cheaper for the same performance, and that even seemingly mature products, such as cars, have features they lacked a decade ago; think GPS and more self-diagnostics).

While his post is helpful, he misses some of the other adjustments that, at least in the US, lead to the official understatement of inflation. The Boskin Commission in late 1996, which was chartered to adjust the CPI calculation (remember, at this point Social Security payments were indexed to CPI) rather conveniently concluded that CPI overstated inflation by 1.1 percentage points in 1996 and roughly 1.3 percentage points per year in prior years. And of course, the CPI methodology was then adjusted to produce lower numbers, thereby reducing Social Security payment increases.

What did the Boskin Commission think was out of line? According to Wikipedia:
The report highlighted four sources of possible bias:
Substitution bias occurs because a fixed market basket fails to reflect the fact that consumers substitute relatively less expensive goods for more expensive ones when relative prices change.

Outlet substitution bias occurs when shifts to lower price outlets are not properly handled.

Quality change bias occurs when improvements in the quality of products, such as greater energy efficiency or less need for repair, are measured inaccurately or not at all.

New product bias occurs when new products are not introduced in the market basket, or included only with a long lag.

So the Boskin report would have us believe that if I switch from steak to hamburger because beef prices are up, we should only capture the change in how I consume (i.e., inflation is new hamburger/old steak price, not new steak/old steak). That is patently bogus. Similarly, the outlet substitution seems ripe for abuse ("Ooh, the number is going to be really bad this month! Can we find anywhere selling X cheaper so we can put that in the model instead?").
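
For readers who want to see what is at stake in the substitution argument, here is a minimal Python sketch using the two textbook index formulas: Laspeyres, which prices the old basket, and Paasche, which prices the new one. The steak and hamburger prices and quantities are invented, and this is not the BLS procedure, only the standard definitions applied to a two-item basket.

```python
def laspeyres(p_old, p_new, q_old):
    """Cost of the OLD basket at new prices relative to old prices."""
    return (sum(p_new[g] * q_old[g] for g in q_old)
            / sum(p_old[g] * q_old[g] for g in q_old))


def paasche(p_old, p_new, q_new):
    """Cost of the NEW basket at new prices relative to old prices."""
    return (sum(p_new[g] * q_new[g] for g in q_new)
            / sum(p_old[g] * q_new[g] for g in q_new))


# Hypothetical prices per pound and pounds bought per month.
p_old = {"steak": 10.0, "hamburger": 4.0}
p_new = {"steak": 15.0, "hamburger": 4.0}   # steak is up 50%
q_old = {"steak": 10, "hamburger": 10}
q_new = {"steak": 4, "hamburger": 16}       # the shopper trades down

fixed_basket = laspeyres(p_old, p_new, q_old)
new_basket = paasche(p_old, p_new, q_new)

print(f"fixed old basket: {fixed_basket - 1:.1%} inflation")
print(f"substituted basket: {new_basket - 1:.1%} inflation")
```

With these made-up numbers the fixed basket shows prices up about 36 percent and the substituted basket about 19 percent. The gap between the two is the substitution effect that the Boskin report wanted reflected in the index.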

From Munchau:
There is a debate, usually in blogs, and usually with a whiff of conspiracy, about whether our inflation numbers are real, forged, or statistically so skewed as to underrepresent the true rate of inflation by quite a wide margin. I want to pick on this debate in this entry, not so much in support of one of those conspiracy theories, but in support of a more wideranging debate about how we measure inflation....

In the last five years, we have observed a phenomenon that we were not familiar with before, the phenomenon that people "feel" inflation to be higher than officially measured indices tell us. We hear a frequently used explanation: Cognitive science tells us that we give a higher weight to prices we actually see in supermarkets, or at petrol stations, than to prices with no explicit tags on them, such as rents, or telephony. The explanation is that the felt inflation is a purely psychological phenomenon.

You probably all remember that we heard exactly the same argument when we switched from national currencies to the euro. We felt there was substantial inflation, and as it turned out some shopkeepers used the confusion to raise prices, so there was a modest amount of real inflation. But if this had been a change-over phenomenon - we are talking 2002 - it would have gone away. It did not.

I myself thought at the time that I was facing a significant rise in costs.... Now you can say: Well this is just you. You are not average. This does not apply to Mr and Mrs Average, and, in fact, I did believe this too.

But Mr and Mrs Average kept on complaining. The rise in euro prices was a factor during the 2005 No Vote in the Dutch referendum on the European constitution. It was almost certainly not the decisive factor, but people mentioned it when asked. In France, in particular, the biggest economic debate today is not the subprime crisis, but the apparent loss of purchasing power, which is economic illiteracy for a "rise in inflation".....

Experts, and this applies to economists just as much as any engineer or scientist, often dismiss public comments about their subject area with varying degrees of snobbish arrogance. This is particularly true about the debate about inflation. It is all in our heads, they say. The numbers don't lie. The statistics are correct. We are unstable, not the index.

Well, I have my doubts - and this is not a psychological argument, but a statistical one. The first thing to notice is that inflation is not an observable real world variable, such as the number of widgets produced by a factory. Inflation is a statistic - technically a mapping from a probability space of random events into the positive real numbers. To arrive at a statistic, i.e. a number, we have to take multiple decisions, such as which sample of goods to include in our basket, since we cannot measure the universe of prices. We also have to choose a method for weighting the results mathematically. You might remember the Paasche or Laspeyres price indices taught in Economics 101. In particular, we have to choose what to put into the basket, and what not.

In the 1950s, this exercise was easy. In the UK, I was told by someone who was actually involved in this exercise that they had chosen a typical working class family, and looked at their consumption basket, which was relatively uniform by today's standards. They would pay rent, consume a certain amount of energy, obviously much of the spending went into foods, household goods, and some durables. The RPI, the retail price index, is still used today by ordinary people as their favourite measure of inflation (and also by wage negotiators). It has been significantly higher than the CPI, the index targeted by the Bank of England.

The reason for this discrepancy is, of course, related to what we put into the basket and to the adjustments we choose to make. We make lots of adjustments. If a family computer at your local hardware store costs €1000 today, and €1000 in one year's time, we calculate this as a fall in prices, because the quality of the computer has presumably increased. I have problems with this now ubiquitous concept of hedonic pricing because we are double-counting. The improvement in quality is the result of a rise in productivity - which is a real variable. So the improvement in quality raises nominal growth in the numerator, and it lowers the price in the denominator, in other words, we double-count the effect. It may well be that we have been consistently underestimating the rate of inflation, and overestimating the rate of real productivity growth. Since the US uses hedonic pricing more consistently than the Europeans (I think, please correct me if I am wrong on this one), the problem would be worse in the US than in Europe.

There is a website called Shadow Government Statistics, for whose accuracy I cannot vouch, which claims that the pre-Clinton era inflation index shows current inflation at close to 8%, as opposed to official CPI inflation of only half that level. Here is the chart. What makes me a bit doubtful is that the higher series is an almost perfect image of the lower series (just follow it turn for turn), so that it may be calculated as actual inflation plus x%. That would not be a very accurate way to do this.



But let us suppose for a moment that series is correct. If US inflation were really 8%, this would mean that interest rates in the US have been negative at all times in the last 10 years. It would mean that 10-year treasuries, which yield only a little over 4%, are massively mispriced, that a bond price crash of historic proportions would beckon, essentially wiping out a large amount of China's and Russia's wealth - countries that are heavy investors in the US. It would be a global economic catastrophe. So we are not going to switch back with ease and pleasure. There are many vested interests in not doing so.

I do not want to discuss the merit of this particular statistic - which I cannot - but I believe strongly that the Fed is absolutely wrong to target a core-inflation index (and it is not even doing that with any great conviction and success). Core inflation is supposed to be more stable, as it excludes volatile categories of food and energy, but both categories have not been volatile, but persistently rising. To exaggerate a little (well, ok, a lot): All the troublemakers are taken out of the basket, the rest is adjusted.

But if some of the criticisms of the modern inflation indicators are even remotely correct, it would not only mean that we are about to return to a 1970s period of stagflation, with its double-digits inflation rates in the US and in some European countries. With the Fed now swamping the market with cheap money as though there is no tomorrow, it could be a lot worse than that.

One reader wrote to me that the 8% estimate for US inflation is probably still too optimistic, as it does not fully take into account the rise in wheat and other commodity prices, for example. Another important side effect of a potentially misjudged inflation series is that US growth is actually not higher than European growth - a claim that has led to much soul-searching over here - as we are deflating nominal GDP growth by an excessively modest indicator. As for the apparently superior performance of the British economy, just try to deflate all those nominal prices by RPI, not the actual GDP-deflator used, and the economic miracle disappears.

There is surely some of this going on in the euro area as well, but the effect is probably less extreme, I think. At the very least, the ECB is not taking oil and food out of the price index, but I think we do use hedonic pricing too. I have not seen any estimate of German or French inflation in 1980s or early 1990s terms, and would be very interested if readers could alert me if such estimates exist. My gut instinct tells me that our inflation rate also understates the true rate of inflation, but perhaps to a lesser degree than in the US. But that assertion only cries out to be verified, or to be dismissed.

Even if we are sceptical about some of those numbers, let us at the very least have an honest debate about inflation. While an artificially depressed inflation indicator may make life a lot easier for a central bank, we know we cannot fool all of the people all of the time. This was just tried in the credit market. Another catastrophic Ponzi game would eventually come unstuck. The last thing we want after this credit crisis is over is for central banks to put up nominal short-term rates to 20% to contain runaway inflation. Contrary to popular wisdom in the US, it may be better not to cut them now, as opposed to cutting now and hiking later.

Monday, March 10, 2008

"Financial Models Should Come With Health Warnings"

All About Alpha, which is a fine site for all things hedge fund related, has an excellent piece today by Dr. William Shadwick of Omega Analysis. Shadwick has the unusual distinction of being a serious mathematician (he established the Fields Institute for Research in Mathematical Sciences) who writes well. He also entered the finance industry relatively late in his professional life, which enables him to take a more detached view.

The piece does a nice job of laying out some of the major failings in models that show up as embedded assumptions. What is perhaps particularly interesting to the layperson is that some of the recognized problems (such as models that presuppose that distributions are normal) often go uncorrected. In addition, some of the modifications (allowing for skewness, or asymmetrical distributions, and kurtosis, or fat tails) are unreliable in practice. Practitioners typically have too few data points to make solid assessments, and the model outputs are highly sensitive to how the distributions are tweaked.

From AllAboutAlpha:
The title of this piece comes from a joke about a “highly qualified” financial engineer’s reaction to a well-proven trading strategy. It illustrates the tension between theory and practice that we have all seen as quantitative methods become ever more common at trading desks and in investment management.

In general, the rise of quantitative tools in finance has been highly beneficial but the widespread use of models has been a decidedly mixed blessing. In science, the constant development of theories expressed as mathematical models to be tested, rejected, confirmed or refined through observation and experiment is the main source of progress in our understanding of the physical world. This process is also crucial for engineering and technology where it is the key to predicting future events and controlling them to our advantage.

It was inevitable that this paradigm would eventually be adopted in economics and finance too. In the half century since Markowitz put portfolio construction on a quantitative footing there has been a steady growth in the use of increasingly sophisticated and complex models and statistical techniques in investment management.

The nature of the financial markets is such that this growth in models has not been accompanied by the sort of testing that science demands. Finance academics simply cannot perform experiments like those upon which the sciences rely, and they are also severely constrained in the type of observations they can make. Data and information about what goes on in reality (as opposed to theory) is, and is likely to remain, in very short supply in comparison to the sciences.
The end users of the models in finance are not intent on understanding how markets may be explained. That is the goal of academic research. Instead, they want to employ theory and models to produce profits. Like physical engineering, which has had its share of collapsing bridges, financial engineering has therefore led to many accidents.

For example, the current mess in the credit markets would not have been possible without the extensive and inappropriate rise of “sophisticated” models. The results of mis-priced risk have now been cascading through the financial system for several months and show no sign of abating.

Assumptions = “Hidden Models”

A mathematical model can be thought of as a process that states:

“If assumption A is satisfied, then input of B is guaranteed to be followed by output of C.”

A “robust” model is one in which:

“If A is close enough to being satisfied and the input is close enough to B then a result close to C is guaranteed.”

The most basic requirement for the use of mathematical models - in any context - is that they are appropriate for the job. The model tells you nothing about what happens if assumption A is not even close to being satisfied. In this case, no matter how diligently one applies the model with the expectation of getting from B to C, the process cannot be trusted. If the model is not robust, the difference in output may even be very dangerous.

Hidden Model: Return distributions are “independent and identically distributed”

The increased use of quantitative methods means that models are now almost ubiquitous and are often present but hidden. For example, almost every hedge fund investor or manager has used the “square root of 12 rule” to produce an annualized volatility figure from a sample of monthly returns. But how many people remember to check the assumption upon which the rule is based? The returns must be independent draws from the same distribution (i.i.d., or “independent and identically distributed”) for this rule to be justified.

This is an example of a “hidden model”. It is not the returns of a hedge fund that we are talking about but the model of returns of a hedge fund. Does anybody really believe that hedge fund returns have no autocorrelation? Does anybody really believe that there is an unchanging distribution from which the returns are drawn?

Returns on hedge fund investments or stock market prices are not random variables. However the extent of apparent randomness in their behavior means that statistical tools are most appropriate for describing them and for making predictions of the future. In the case of the “square root of 12 rule”, the prediction is the annual volatility expected over many years. While nobody would feel that a sample of 3 annual returns would merit the calculation of an annual volatility, we’re happy to use a sample of 36 monthly returns and the model of returns as i.i.d. random variables to make the prediction.

The danger in such an assumption is that it can easily underestimate the true annual volatility of returns. This may produce serious strains on an investment program because the path by which the NAV goes from its initial value to its value 5 or 10 years later often matters a great deal. It matters to a manager who may spend significant time without receiving a performance fee after a large loss or a series of smaller ones. It matters to the investor who requires some of the proceeds of his investment for income during the period. This is, of course, the reason for wanting an estimate of annual volatility in the first place. (It is difficult to find an example of an investor who only needs to know that his investment NAV will rise “in the long term” while being unable to count on using the proceeds at any intervening time before the long term - when, as Keynes said, “we’re all dead.”)
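
To make the point about the “square root of 12 rule” concrete, here is a minimal sketch comparing the i.i.d. annualization with what the same monthly volatility implies if returns are positively autocorrelated. Everything beyond the rule itself is an assumption made for illustration: the AR(1) structure (lag-k autocorrelation equal to rho to the power k), the 2% monthly volatility, the value rho = 0.3, and the function names.

```python
import math


def annualized_vol_iid(monthly_vol: float, periods: int = 12) -> float:
    """The 'square root of 12 rule': valid only for i.i.d. returns."""
    return monthly_vol * math.sqrt(periods)


def annualized_vol_ar1(monthly_vol: float, rho: float, periods: int = 12) -> float:
    """Volatility of the sum of `periods` returns from a stationary AR(1).

    Var(sum) = sigma^2 * [n + 2 * sum_{k=1}^{n-1} (n - k) * rho^k],
    because the lag-k autocorrelation of an AR(1) process is rho^k.
    """
    factor = periods + 2.0 * sum(
        (periods - k) * rho**k for k in range(1, periods)
    )
    return monthly_vol * math.sqrt(factor)


monthly_vol = 0.02  # hypothetical 2% monthly volatility
rho = 0.3           # hypothetical first-order autocorrelation

iid = annualized_vol_iid(monthly_vol)
ar1 = annualized_vol_ar1(monthly_vol, rho)
print(f"sqrt(12) rule:        {iid:.4f}")
print(f"AR(1) with rho = {rho}: {ar1:.4f}")
```

Under these assumed inputs the autocorrelated figure is about a third higher than the one from the rule, which is the kind of underestimate the passage warns about. With rho = 0 the two functions agree.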

Hidden Model: Returns are normally distributed

There are more dangerous assumptions than returns being independent and identically distributed. One might also assume that they were normally distributed. Probably everyone has heard the “black swans” argument about the importance of extreme events in markets and Mandelbrot and Taleb’s attacks on the reliance on normal distributions in finance theory.

I think they have greatly overestimated the number of academics who haven’t yet noticed that market returns aren’t normal. However there is no doubt that the persistence of press and industry descriptions of large market losses in terms of standard deviations (and ascribing an extremely low probability to such an event in consequence) indicates a widespread hidden assumption of normality.

This is dangerous for the obvious reason that it can lead to a feeling of safety where none exists. If you know that there is a 1 in 10 chance of a catastrophic loss instead of believing the chance to be 1 in 1000, the expected return you require for taking such a risk will be very different. There is no doubt that many of the estimates of loss responsible for the sub-prime debacle required exactly this sort of mis-pricing.
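
A rough sketch of how much the normality assumption can matter: the code below compares the probability of a loss of k standard deviations or worse under a normal distribution with the same probability under a fat-tailed alternative. The alternative used here, a Student t distribution with 3 degrees of freedom rescaled to unit standard deviation, is purely an illustrative assumption and is not taken from the article; the function names are hypothetical.

```python
import math


def normal_tail(k: float) -> float:
    """P(Z <= -k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def student_t3_tail(k: float) -> float:
    """P(loss of k standard deviations or worse) under Student t, 3 d.o.f.

    A t(3) variable has standard deviation sqrt(3), so a k-sigma move is
    t = -k * sqrt(3). The closed-form t(3) CDF then depends only on
    u = t / sqrt(3) = -k.
    """
    return 0.5 - (math.atan(k) + k / (1.0 + k * k)) / math.pi


for k in (3, 5):
    print(
        f"{k}-sigma loss: normal {normal_tail(k):.2e}, "
        f"t(3) {student_t3_tail(k):.2e}"
    )
```

Under this assumed alternative, a five standard deviation loss is thousands of times more likely than the normal model suggests, even though both distributions have the same standard deviation.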

Hidden Model: Standard deviation is a proxy for risk

In great part, these dangers are a consequence of another hidden model, namely the use of standard deviation of returns as a proxy for risk. The realization that this model of risk is especially dangerous when applied to hedge funds has led both academics and finance practitioners to make use of skewness and kurtosis in an attempt at more “sophisticated” modeling of risk.

Skewness is intended to model asymmetry - the mismatch of upside and downside risk. Kurtosis is intended to model the likelihood of extreme events or “fat tails”. Of course, certain assumptions must be satisfied for these models to perform as intended. Dangers introduced by relying on these metrics are compounded by the great sensitivity of skewness and kurtosis to (even moderate) outliers. These are not statistics meant for small samples.
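
The sensitivity to outliers is easy to see in a small sample. The sketch below uses a made-up, perfectly symmetric set of 36 monthly returns and then replaces a single observation with a -10% month; the data and the helper name `sample_skew_kurt` are hypothetical, and the simple population-moment estimators are an assumption (other estimators apply small-sample corrections).

```python
def sample_skew_kurt(xs):
    """Return (skewness, excess kurtosis) using population central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2**1.5, m4 / m2**2 - 3.0


# 36 hypothetical monthly returns, symmetric around zero.
base = [-0.02, -0.01, 0.0, 0.01, 0.02] * 7 + [0.0]
# Same sample with one observation replaced by a single bad month.
shocked = base[:-1] + [-0.10]

skew_base, kurt_base = sample_skew_kurt(base)
skew_shock, kurt_shock = sample_skew_kurt(shocked)
print(f"base:    skew {skew_base:+.2f}, excess kurtosis {kurt_base:+.2f}")
print(f"shocked: skew {skew_shock:+.2f}, excess kurtosis {kurt_shock:+.2f}")
```

One observation out of 36 is enough to move the skewness from essentially zero to strongly negative and the excess kurtosis from negative to large and positive, so estimates from three years of monthly data say as much about one month as about the strategy.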

Hidden assumptions in hedge fund replication

The extent to which distributional replicators will succeed in reproducing hedge fund returns will depend on the extent to which the “noisiness” of skewness and kurtosis can be managed. An even more critical assumption (underpinning distributional replication) is that distributions with the same mean, variance, skewness and kurtosis must be very similar. This is not true in general. So replicators must depend on this assumption being satisfied - at least approximately - for the distributions that matter to them.
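
A small counterexample shows why matching four moments does not make two distributions similar. The three-point distribution below (a standard construction, chosen here for illustration and not taken from the article) has the same mean, variance, skewness and kurtosis as a standard normal, yet very different loss probabilities. The function names and the -1.5 and -2 thresholds are arbitrary assumptions.

```python
import math


def moments(values, probs):
    """Mean, variance, skewness and (non-excess) kurtosis of a discrete law."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    skew = sum(p * (v - mean) ** 3 for v, p in zip(values, probs)) / var**1.5
    kurt = sum(p * (v - mean) ** 4 for v, p in zip(values, probs)) / var**2
    return mean, var, skew, kurt


def normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


# Three-point distribution: -sqrt(3), 0, +sqrt(3) with weights 1/6, 2/3, 1/6.
s = math.sqrt(3.0)
values = (-s, 0.0, s)
probs = (1 / 6, 2 / 3, 1 / 6)

# Standard normal moments are (0, 1, 0, 3); the three-point law matches them.
print("three-point moments:", moments(values, probs))

for threshold in (-1.5, -2.0):
    p_three = sum(p for v, p in zip(values, probs) if v < threshold)
    p_normal = normal_cdf(threshold)
    print(f"P(X < {threshold}): three-point {p_three:.4f}, normal {p_normal:.4f}")
```

The two laws agree on all four moments, but the three-point one puts a one-in-six chance on a loss beyond -1.5 and no chance at all on a loss beyond -2, so a replicator that matches moments is relying on the extra assumption that the distributions in question are otherwise well behaved.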

The use of standard deviation to describe risk is also an essential part of the risk-factor approach to hedge fund replication. In fact, the term risk-factor itself equates risk with standard deviation of returns. In this case, the (linear regression) model could be said to be “hidden in plain sight”, but it is no more easily remembered for that.

Does everyone who uses the term “alpha” really mean it to be interchangeable with an artifact of a particular model of returns? For that matter, does everyone accept the hidden model in the statistician’s use of the word “explain” when he says that “certain risk-factors explain some percentage of hedge fund returns?”

It sounds rather different if he instead says that he has a model which (while nothing can be known about its actual similarity to a particular investment strategy or indeed how likely its assumptions are to be satisfied), manages to approximately reproduce the strategy’s mean, standard deviation, and correlation with a number of financial indices.

Bottom Line: Hidden assumptions should give rise to “health warnings” on quantitative models

Models are everywhere in quantitative finance but it is almost impossible to find any attendant statements regarding the assumptions upon which they are based. Their purveyors should issue “health warnings” that tell the user that hidden assumptions are present and that failing to check that the assumptions are valid may be dangerous to investment health. It is essential that we recognize the difference between finance and science. In science, increasingly sophisticated mathematical techniques always produce better results over time. But this need not be the case in finance. Nevertheless finance can and should aspire to the status of an engineering discipline.

While you are unlikely to find health warnings on financial models any time soon, there are a few simple principles which can reduce the danger they present:
It is far more important to look to simplicity (and common sense) than it is to look to increasing complexity as a means to better control investment outcome.

A model whose robustness is unknown or unknowable should never be employed.

Sophisticated tools should only be used if it is possible to verify that all required assumptions are satisfied (at least to a good approximation). When this condition can be met, a simple application of a sophisticated technique is preferable to a complicated one.

Keeping these in mind will reduce the risk that financial models may pose to your investment health!