Deeply flawed risk benchmark

September 30, 2014 07:00 PM

“In July 2002, the [Dow Jones] index recorded three steep falls within several trading days. (Probability: one in four trillion) And on October 19, 1987, the worst day of trading in at least a century, the index fell 29.2%. The probability of that happening, based on the standard reckoning of financial theorists, was less than one in 10 to the 50th – odds so small they have no meaning. It is a number outside the scale of nature. ... Everyone knows [that] financial markets are risky. But in the careful study of that concept, risk, lies the knowledge of our world and hope of a quantitative control over it.”

— Benoit Mandelbrot

Understanding risk is of paramount importance in finance and investing. Investors need to know the risk associated with various investments in order to understand the likelihood of future events (losses in particular), to plan accordingly and to compare competing investments with different risk-return profiles on an equal footing.

To this end, financial practitioners and academics have developed statistical approaches to model risk based on an asset’s past price action. Foremost among them is the use of standard deviation as a proxy for an investment’s risk. To be specific, the standard deviation of an investment’s historical returns is used to define "risk" in many cases. Consider, for example, the Sharpe ratio, which adjusts the expected return of an asset by its risk and uses the standard deviation of returns as the risk metric. Another instance can be found in modern portfolio theory and the efficient frontier problem, where the goal is to find the asset combinations (i.e., portfolio weightings) that yield the least amount of risk for given expected returns, with risk again defined as the standard deviation of the total portfolio’s returns.
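To make the role of standard deviation in the Sharpe ratio concrete, here is a minimal sketch. The function name, the per-period risk-free rate argument and the two return streams are illustrative assumptions, not data from the study.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return per period divided by the standard deviation of excess returns."""
    excess = [r - risk_free_rate for r in returns]
    return mean(excess) / stdev(excess)

# Two hypothetical monthly return streams with the same 2% average return.
steady = [0.01, 0.03, 0.01, 0.03]
choppy = [0.00, 0.04, 0.00, 0.04]

for name, stream in (("steady", steady), ("choppy", choppy)):
    print(f"{name}: Sharpe ratio = {sharpe_ratio(stream):.2f}")
```

Both streams earn the same average return, so the stream with the larger standard deviation receives the lower ratio. Standard deviation is the only "risk" this metric sees.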

In a strict statistical sense, standard deviation measures how much random outcomes differ from their average. For instance, a never-ending series of returns of 2%, 2%, 2%, ..., 2% has an average of 2% and a standard deviation of 0 (because there is never any variation from the mean). In exactly the same way, a never-ending series of returns of -2%, -2%, -2%, ..., -2% also has a standard deviation of 0, because there is never any variation about the mean, which is -2%. On the other hand, the return stream 0%, 4%, 0%, 4%, ..., 0%, 4% also has an average of 2% but a non-zero standard deviation (of 2%) because there is some variation from this mean. The stream 1%, 3%, 1%, 3%, ..., 1%, 3% again has a mean of 2%, but a smaller standard deviation (of 1%) because the individual outcomes do not vary from the 2% mean as much as those in the 0%, 4% series. Thus we can see that standard deviation is really just a measure of the amount of "spread" associated with random outcomes.
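The four return streams above can be checked with a few lines of Python. This sketch uses the population standard deviation (`statistics.pstdev`), which is the convention that gives the 2% and 1% figures. Truncating each endless stream to two full cycles is an assumption made only to keep the lists finite.

```python
from statistics import mean, pstdev

# Each list holds complete cycles of its pattern, so it has the same mean and
# population standard deviation as the never-ending stream it stands in for.
series = {
    "constant +2%": [2, 2, 2, 2],
    "constant -2%": [-2, -2, -2, -2],
    "alternating 0%/4%": [0, 4, 0, 4],
    "alternating 1%/3%": [1, 3, 1, 3],
}

def summarize(returns_pct):
    """Return (mean, population standard deviation) in percentage points."""
    return mean(returns_pct), pstdev(returns_pct)

for name, returns_pct in series.items():
    avg, spread = summarize(returns_pct)
    print(f"{name}: mean = {avg:.1f}%, std dev = {spread:.1f}%")
```

Note that the constant -2% stream scores a standard deviation of zero even though it loses money in every period.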

But is standard deviation really a "good" measure of financial risk? Before exploring this question, we must first ask what "risk" means in the world of finance. For the purposes of this piece, we define financial risk (hereafter “risk”) as the probability of losing a certain amount of money over a given time frame.

With risk defined we can move to a systematic study of how well standard deviation has predicted it in the past. We will describe the experimental procedures and tests conducted, provide the results of the baseline tests and include an extended analysis where the baseline study is broken into separate time frames. 

The study
The experimental approach followed in this study is based on determining how accurately standard deviation has projected risk in the past. Since we have defined risk as a probability of losing a certain amount (say X) over a given time frame (say Y), the test for accuracy is straightforward. We can predict the probability associated with a given loss and time frame based on an asset’s historical standard deviation. Then going forward we can determine if such losses occurred as frequently as predicted. 

This is analogous to comparing all the days on which the local weatherman predicted a 10% chance of rain with how often it actually rained the next day. Suppose that, in giving tomorrow’s forecast, he predicted a 10% chance of rain 200 times over the past five years. Now imagine it actually rained on 20 of those days, or 10% of the time. In this case our weatherman has been perfectly accurate historically. If instead it rained on 30 days (15%), then he was inaccurate. We might also say that in this case he was not conservative, because the ratio of actual to predicted occurrences of rain would be 1.5 (i.e., 30 over 20). If it actually rained on five days, he would be inaccurate but conservative, because the ratio of actual to predicted occurrences would now be 0.25.
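The weatherman arithmetic can be written as a small helper. The function names and labels below are illustrative assumptions. The numbers are the ones used in the example above.

```python
from math import isclose

def actual_to_predicted_ratio(actual_events, forecasts_made, forecast_probability):
    """Ratio of observed events to the number of events the forecasts implied."""
    predicted_events = forecasts_made * forecast_probability
    return actual_events / predicted_events

def describe(ratio):
    """Label a forecaster by the ratio of actual to predicted occurrences."""
    if isclose(ratio, 1.0):
        return "accurate"
    if ratio > 1.0:
        return "inaccurate, not conservative"
    return "inaccurate, conservative"

# 200 forecasts of a 10% chance of rain imply 20 rainy days.
for rainy_days in (20, 30, 5):
    ratio = actual_to_predicted_ratio(rainy_days, 200, 0.10)
    print(f"{rainy_days} rainy days: ratio = {ratio:.2f} ({describe(ratio)})")
```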

In our study, standard deviation is like the weatherman, but instead of forecasting the chance of rain it might forecast that shares of Coca-Cola (KO) have a 5.6% chance of losing 20% in three months, or that gold futures have a 22% chance of losing 10% over the next month. The goal is to see how accurate these forecasts have been. Does standard deviation tend to over- or under-predict the chance of loss?

To find out, 49 markets were selected for the study, encompassing equities, fixed income, emerging markets, commodities, hedge funds and managed futures asset classes. 

Historical monthly return data was used to estimate an asset’s standard deviation, which was then used to project the probability of experiencing certain loss levels at given time horizons (see the sidebar, "Analysis Procedure Details").

This was compared to actual losses experienced on a forward rolling basis to determine how accurately standard deviation has predicted risk. Loss levels of 10%, 20%, 30%, 40% and 50% at time horizons of 1, 3, 6, 9, and 12 months were investigated.


“Measuring STD DEV” (below) shows the baseline results of this study. The figure depicts the ratio of actual to predicted occurrences of losses of varying degree at various time horizons. For instance, a loss of 10% or more at a one-month horizon actually occurred 0.87 times as often as was predicted. Thus, in this case the standard-deviation-based risk model was slightly off on the conservative side because these types of losses did not actually occur as often as predicted.

Consider now the loss level of 50% at a time horizon of 12 months. We see here a ratio of 0.99, indicating that the number of times a loss of at least 50% occurred at a 12-month horizon matched almost perfectly with the number of times it was predicted to have occurred (a value of 1.00 being a perfect forecast for the markets as a whole).

Now consider the 50% loss level at the three-month horizon. In this case the standard deviation risk estimates are highly inaccurate with such losses occurring at 6.94 times the rate that would be predicted. The results for 50% losses at a one-month horizon are even worse.

Two observations are immediately apparent from these results. First, standard deviation risk estimates ranged from mildly conservative to highly non-conservative, depending on the specific loss level and time horizon studied. Second, at a given time horizon the estimates increasingly underestimate how often losses occur as the loss level grows, and they become less accurate and less conservative as the time horizon shortens.

The trend shown is clearly problematic for risk managers. Apparently, the only risks that standard deviation can estimate with some accuracy are those that are less worrisome to investors: small losses over long time frames. It is the more problematic, catastrophically large and sudden losses that are grossly underestimated by standard deviation.

We have heard this story before, circa 2009, after the global financial crisis sent many markets tumbling. Note that the data in “Measuring STD DEV” covers the entire 1985-2013 time frame, which includes the crisis. This leads to more questions:

  • Is the trend shown in “Measuring STD DEV” an aberration associated with the events of 2007-2008? Is standard deviation a poor estimator of risk in more "normal" times?
  • Is this trend consistent across differing assets or investment styles? 
  • If large short-term losses are those that are the most underestimated, are there certain investment styles that are prone (or immune) to such losses? Why or why not?
  • Can an allocator determine which investments are more susceptible to these losses from their historical performance alone?

Addressing all of these questions is beyond the scope of this initial study, but we will begin to explore them here. First off we ask if this trend of inaccuracy in standard deviation risk estimates was evident prior to the 2008 global financial crisis.

Results prior to crisis
We look at market movements over 1-, 3-, 6-, 9- and 12-month periods and losses from 10% to 50% in the period spanning 1985 to 2005. “Sans credit crisis” (below) shows that the trend seen previously, with larger losses at shorter time frames being underestimated, is still present, albeit to a slightly smaller degree: losses of more than 50% at a one-month horizon happened ‘only’ 61 times more often than predicted. The results from the previous section are therefore not aberrations due solely to market moves during the global financial crisis; similar trends were evident in the 20 years prior to 2005. Although the severity with which standard deviation underestimated the risk of large losses was most pronounced during the crisis, hints that standard deviation is a poor estimator of the likelihood of large losses were already present.


In this study we have evaluated how accurate standard deviation has been as a measure of risk, with risk defined as the chance of losing a certain amount at a given time horizon. We probed monthly market data going back to 1985 for 49 markets from a range of investment categories. We then compared how often actual losses occurred against how often they would be predicted to occur using standard deviation as a measure of risk of loss.

Summary of findings:

  • Standard deviation risk estimates ranged from being mildly conservative to highly non-conservative depending on the specific loss level and time horizon of interest.
  • The data shows that the risk estimates become increasingly non-conservative for increasing loss levels and decreasing time frames.
  • This trend of underestimating large losses, particularly on shorter time horizons, has been present since 1985 for the markets studied and is not attributable solely to the global credit crisis.

As is the case with most research, this topic definitely deserves deeper study. Further investigation could include expanding the list of markets studied and widening the time frame. 

It is also probable that many investments that lost 50% or more ceased to exist before this study was conducted (i.e., survivorship bias). It would be interesting to repeat the analysis going back further with a more extensive data set. The same analysis could also be compared across groups of markets categorized by sector, by trading style (active vs. passive, for example) or perhaps by some statistical measure.

Is there a way to augment one’s understanding and estimation of risk with information other than standard deviation? It would be interesting to extend this analysis to other common statistical risk measures, such as downside deviation or Value at Risk, to see if they improve on the problems with standard deviation.

Scot Billington co-founded Covenant Capital Management, a boutique CTA that has been managing client assets for more than 15 years. Rob Matthews is CCM's Director of Research. Reach them at

SIDEBAR: Analysis Procedure Details

Standard deviation alone does not lead to an estimation of the probability of loss for an asset. It must be couched in a market model. Geometric Brownian motion is a common tool by which financial practitioners translate standard deviation of returns into risk of loss. The model asserts that the price of an asset at time t is:

Where S_t is the price of the asset at time t, S_0 is the asset’s price at time 0 (the time at which we are projecting the risk of loss forward), and μ and σ are the asset’s drift and standard deviation. Z is a standard normal random variate, so the price S_t is represented statistically rather than in a deterministic fashion. We can transform this into a probability of loss of a certain amount at a certain time horizon by first solving for Z, which yields:

We can then replace (S_t/S_0) with (1-L), where L is the loss we are interested in analyzing. For instance, a 40% loss (i.e., L=0.4) means that (1-L)=0.6, which is equal to (S_t/S_0) (i.e., the price at time t, S_t, is 60% of the initial price S_0). Furthermore, we can replace the normal random variate Z with Φ^-1(p), the standard normal inverse CDF evaluated at the probability p.

We can then solve for the probability p as: 

which gives us the probability of observing a loss of L (or more) at the (specific) time horizon t given the market drift and standard deviation μ and σ.
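Assuming the standard textbook parametrization of geometric Brownian motion, the steps described above can be written out as follows. This is a reconstruction under that assumption, not a reproduction of the sidebar’s own equations. If μ is instead read as the drift of the log price, the σ²/2 term drops out of every line.

```latex
\begin{align}
S_t &= S_0 \exp\!\left[\left(\mu - \tfrac{1}{2}\sigma^2\right)t + \sigma\sqrt{t}\,Z\right] \\
Z &= \frac{\ln(S_t/S_0) - \left(\mu - \tfrac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \\
\Phi^{-1}(p) &= \frac{\ln(1-L) - \left(\mu - \tfrac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}} \\
p &= \Phi\!\left(\frac{\ln(1-L) - \left(\mu - \tfrac{1}{2}\sigma^2\right)t}{\sigma\sqrt{t}}\right)
\end{align}
```

Here Φ is the standard normal CDF, and t must be measured in the same time unit as μ and σ (months, if σ is a monthly standard deviation).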

Since we are interested in evaluating standard deviation only, we set the drift, μ, to zero in our analysis. Accordingly, we also adjust the actual return data for all the markets studied so that the long-term drift is also zero historically.
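As a rough illustration, the probability of loss can be computed directly from a standard deviation. The sketch below assumes the standard textbook form of geometric Brownian motion, with drift defaulting to zero as in the study. The function names and the 5% monthly standard deviation are hypothetical.

```python
from math import erfc, log, sqrt

def normal_cdf(z):
    """Standard normal CDF; erfc keeps precision far out in the left tail."""
    return 0.5 * erfc(-z / sqrt(2.0))

def probability_of_loss(loss, horizon, sigma, mu=0.0):
    """P(S_t / S_0 <= 1 - loss) under geometric Brownian motion.

    loss    : fractional loss L, e.g. 0.20 for a 20% loss
    horizon : time horizon t, in the same unit as sigma and mu (months here)
    sigma   : standard deviation of returns per unit of time
    mu      : drift per unit of time (zero by default, as in the study)
    """
    z = (log(1.0 - loss) - (mu - 0.5 * sigma**2) * horizon) / (sigma * sqrt(horizon))
    return normal_cdf(z)

# Hypothetical asset with a 5% monthly standard deviation.
sigma_monthly = 0.05
for months in (1, 3, 6, 9, 12):
    row = ", ".join(
        f"{level:.0%}: {probability_of_loss(level, months, sigma_monthly):.2e}"
        for level in (0.10, 0.20, 0.30, 0.40, 0.50)
    )
    print(f"{months:>2} mo  {row}")
```

The predicted probabilities fall off very quickly for large losses at short horizons. That is the region where the study finds actual losses far outnumbering the predictions.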



About the Author

Rob Matthews develops technical trading strategies as Research Director for Covenant Capital Management.