58 min read

Stable Simulation

Most of us learn statistics wrong

I didn’t learn statistics properly and based on the questions I get from most investment professionals when I talk about using a stable distribution framework for Monte Carlo simulation I believe I have plenty of company. Most of us think of the Gaussian or Normal distribution as the basic way distributions work and any other distribution as a recondite variation. So when they hear stable distribution they think it’s some sort of odd looking “other” distribution like the Cauchy or Weibull distribution. However, a Normal distribution is actually a very special type of stable distribution with a nice closed-end density formula and straight forward parametric estimations (i.e., mean and standard deviation). Stable distributions are best thought of a class of distributions first characterized by Paul Levy in the 1920s. The few special stable distributions that have elegant closed-end density and distribution functions are the ones named after German and French math giants you’ve probably heard of: Gauss, Cauchy, Levy.

The investment community thinks in terms of mean and standard deviation. Mean and standard deviation are fine, but most of us are never taught that there are four parameters to stable distributions and the Normal distribution just happens to be a unique case where two of these parameters are always \(2\) and \(0\).

Most investment professionals have some level of exposure to the famous Normal density function:


And can recognize the Normal distribution as \(N(\mu, \sigma)\). But a more thorough notation of the Normal distribution would be \(S(\alpha=2, \beta=0, \gamma\sqrt2, \delta)\). Where \(\alpha\) is an index parameter that describes the the thickness of the tails. \(\alpha=2\) is the specific case of the tails being Normal (or not thick at all). A lower \(\alpha\) indicates thicker tails. \(\beta\) is the skewness parameter, and \(\beta=0\) is the special case of no skewness or symmetry that Normal distributions have. \(\gamma\) is the scale parameter and is similar to standard deviation or Normal’s \(\sigma\) except that \(\gamma = \frac{\sigma}{\sqrt2}\). Finally, \(\delta\) is the location parameter that is similar to mean or \(\mu\).

Motivation to Use Stable Distributions

The simplicity of the Normal distribution is elegant. With just two parameters that we can easily estimate with well known average and standard deviation formulas we can make probability statements about the entire distribution. For example if we estimate the mean of a portfolio to be 10% and the standard deviation to be 15%, we can plug these two numbers into the density function and say the probability of a 20% or greater loss is 2.3%.

round(pnorm(q = -0.20, mean = 0.10, sd = 0.15), 3)
## [1] 0.023

It would be great if we could use the Normal distribution to model investment assets. But we can’t, or shouldn’t. The rub is that financial time-series are often not normal. Equity is a classic example: equity returns have negative skewness and fatter than Normal, \(\alpha=2\), tails. A Normal model of equity returns greatly underestimates large drawdowns: Black Tuesday of 1929, the 1987 crash, and October 2008 are all virtually impossible under a Normal distribution assumption.

Let’s take a look at the S&P 500 returns and see for ourselves.

Required Packages

library(quantmod)    # for downloading financial time-series
library(stabledist)  # for stable stimulation
library(fBasics)     # for fitting the stable distribution parameters
library(plotly)      # for interactive plotting
library(reshape2)    # for data manipulation

Get S&P 500 ETF Data

spy_price <- getSymbols("SPY", 
                        from = "1970-01-01", 
                        to = "2018-04-23", 
                        periodicity = "monthly", 
                        src = "yahoo", 
                        auto.assign = FALSE)[, 6]
spy_ret <- spy_price / lag.xts(spy_price, k = 1) - 1
spy_ret <- spy_ret[2:NROW(spy_ret), ]

Quick check to make sure our time-series looks okay.

dat <- data.frame(cumprod(1 + spy_ret)) %>% round(2)
dat$t <- as.Date(index(spy_ret))
colnames(dat) <- c("index", "t")
g <- ggplot(dat, aes(x = t, y = index)) + geom_line()

Looks like the S&P 500 to me, let’s proceed.

Fit the Distribution

emp_density <- density(spy_ret)
emp_density_val <- emp_density$y
spy_quantiles <- emp_density$x

norm_mu <- mean(spy_ret)
norm_sigma <- sd(spy_ret)

norm_density <- dnorm(spy_quantiles, norm_mu, norm_sigma)

stable_fit <- stableFit(as.numeric(spy_ret), type = "q", doplot = FALSE)
s_alpha <- stable_fit@fit$estimate[1]
s_beta <- stable_fit@fit$estimate[2]
s_gamma <- stable_fit@fit$estimate[3]
s_delta <- stable_fit@fit$estimate[4]

stable_density <- dstable(spy_quantiles, s_alpha, s_beta, s_gamma, s_delta)

dat <- data.frame(spy_quantiles, emp_density_val, norm_density, stable_density)
colnames(dat) <- c("Quantile", "Emperical.Density", "Normal.Density", 
plotdat <- melt(dat, id = "Quantile")

g <- ggplot(plotdat, aes(x = Quantile, y = value, col = variable)) + 
  geom_line() + 
  ylab("Density") + labs(col = "") + 
  ggtitle("S&P 500 ETF (SPY) Monthly Returns - 1993 to 2017") + 
  scale_x_continuous(labels = scales::percent, breaks = seq(-20, 20, 5)/100)

Go ahead and zoom in on the left tail. The Normal distribution underestimates the really bad months, while the more general stable distribution is able to model the extreme events.

spy_ret_oct <- spy_ret[index(spy_ret) == "2008-10-01", 1]       
oct_ret_prob_norm <- pnorm(as.numeric(spy_ret_oct), norm_mu, norm_sigma)
norm_months <- ceiling(1 / oct_ret_prob_norm)
norm_months_out <- formatC(norm_months, format = "f", digits = 0)
norm_years <- round(norm_months / 12, digits = 0)

oct_ret_prob_stable <- pstable(as.numeric(spy_ret_oct), 
                               s_alpha, s_beta, s_gamma, s_delta)
stable_months <- ceiling(1 / oct_ret_prob_stable)
stable_months_out <- formatC(stable_months, format = "f", digits = 0)

How about October 2008? With a Normal model we would need to simulate 45658 months or roughly 3805 years before we would expect to see one month as bad as October 2008. Under the stable distribution we would only need 97 months.

Forward Looking

The astute investment professionals will be thinking, “What about forward looking returns, how the hell am I going to predict these four parameters?” \(\delta\) and \(\gamma\) are fairly straight forward. Whatever method you prefer for predicting means and standard deviations you can translate \(\delta = \mu\) and \(\gamma = \sigma / \sqrt2\). My best answer for predicting \(\alpha\) and \(\beta\) is to calibrate using historical returns. 1993 to 2017 is a good sample period with the several bull and bear cycles, but longer equity time-series are readily available. Robert Shiller has monthly S&P data going back to 1871 here.

For the limited S&P 500 returns we’re using we can at least take a look at a 10 year rolling estimation of \(\alpha\) and \(\beta\).

spy_ret_array <- as.numeric(spy_ret)
alpha_array <- rep(NA, length(spy_ret) - 119)
beta_array <- rep(NA, length(spy_ret) - 119)
for (i in 120:length(spy_ret_array)) {
  s_fit <- stableFit(spy_ret_array[i:(i - 119)], doplot = FALSE)
  alpha_array[i - 119] <- s_fit@fit$estimate[1]
  beta_array[i - 119] <- s_fit@fit$estimate[2] 

dat <- data.frame(as.Date(index(spy_ret))[120:NROW(spy_ret)], 
                  alpha_array, beta_array)  
colnames(dat) <- c("t", "alpha", "beta")
g <- ggplot(dat, aes(x = t, y = alpha)) + 
  geom_line() + ggtitle("Rolling 10 year alpha estimation")

Alpha doesn’t look too bad. It moves around over time but we can estimate it without too much anxiety. Beta however, …

g <- ggplot(dat, aes(x = t, y = beta)) + 
  geom_line() + ggtitle("Rolling 10 year beta estimation")

moves around quite a bit. It’s always negative, but varies from negative 0.25 to negative 0.75, a 3x difference. The 1993 to 2017 time-frame does capture some extreme markets, so we shouldn’t be too surprised our distribution parameters change over time. Taking a mid-point or long-term average for beta wouldn’t be too egregious, but I would recommend looking at longer time-series (which is beyond the scope of this experiment).

Further reading

John Nolan at American University has written extensively on stable distributions, and even has a paper for their application in financial data modeling. http://fs2.american.edu/jpnolan/www/stable/stable.html

James Xiong and Thomas Idzorek wrote a great paper on using the stable distribution framework for portfolio optimization. The paper is behind a paywall but you can find a free webinar on the CFA website. https://www.cfainstitute.org/learning/products/multimedia/pages/53057.aspx

Xiong also wrote a solid paper before the asset allocation study about using the stable framework to estimate equity downside risk. https://corporate.morningstar.com/ib/documents/MethodologyDocuments/IBBAssociates/UsingTruncatedLevyFlightToEstimateDownsideRisk.pdf