Positional Option Trading (Wiley Trading) by Euan Sinclair

(2013) and for more detail refer to Poon (2005). Here, I have two general observations.

The GARCH Family and Trading

The simplest forecasting model is to assume that the volatility over

the next N days will be the same as it was over the previous N

days. Mathematically,

$$\sigma_{t,\,t+N} = \sigma_{t-N,\,t} \qquad (3.1)$$

This has two major problems. First is the “windowing” effect

where a large single return affects the volatility calculation for N

days, then drops out of the sample. This creates jumps in the

volatility measurements and hence the forecast. An example is

given in Figure 3.1, where we calculate the 30-day volatility of

Maximus, Inc. (MMS) from June 15, 2019, to September 30, 2019.

FIGURE 3.1 The rolling 30-day close-to-close volatility of Maximus, Inc.

The typical daily move of this stock was about 0.7% but on August

8 it jumped by 12% because of earnings. This caused the 30-day

volatility to jump from 17.8% to 39.3%. Thirty days later, the

earnings day dropped out of the calculation and volatility again

dropped to 23.3%. If we know which events are outliers, we can avoid this problem by removing them from the data. Here, we can simply throw out the earnings-day return.
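The windowing effect can be reproduced with a minimal sketch. The data here are synthetic (alternating 0.7% moves with a single 12% jump), not MMS prices, and the 30-trading-day window, the 252-day annualization, and the name `rolling_vol` are assumptions made for illustration.

```python
import math

def rolling_vol(returns, window=30, periods_per_year=252):
    """Annualized close-to-close volatility for each trailing window.

    Element k of the result uses returns[k : k + window].
    """
    vols = []
    for start in range(len(returns) - window + 1):
        chunk = returns[start:start + window]
        mean = sum(chunk) / window
        var = sum((r - mean) ** 2 for r in chunk) / (window - 1)
        vols.append(math.sqrt(var * periods_per_year))
    return vols

# Synthetic returns (not MMS data): alternating +/-0.7% daily moves
# with one 12% jump on day 40.
returns = [0.007 if i % 2 == 0 else -0.007 for i in range(100)]
returns[40] = 0.12

vols = rolling_vol(returns, window=30)
before_jump = vols[10]  # window covers days 10..39
with_jump = vols[11]    # window covers days 11..40, the first to include the jump
after_drop = vols[41]   # window covers days 41..70, the first to exclude it again

print(f"before: {before_jump:.1%}, with jump: {with_jump:.1%}, after: {after_drop:.1%}")
```

The estimate steps up the day the jump enters the window and steps back down exactly 30 days later, even though nothing happened in the market on that second date.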

A bigger problem is that this method doesn't take volatility

clustering into account. Periods of exceptionally high or low

volatility will persist for only a short time. The exponentially

weighted moving average (EWMA) model takes this into account.

This says variance evolves as

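$$\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) r_t^2 \qquad (3.2)$$

where $r_t$ is the return on day $t$ and λ is usually chosen to be between 0.9 and 1.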
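A minimal sketch of the EWMA recursion, assuming the usual form in which today's variance is λ times yesterday's variance plus (1 − λ) times today's squared return. The function name and the seeding rule are illustrative choices, not part of the original text.

```python
def ewma_variance(returns, lam=0.9, initial_var=None):
    """Return the path of EWMA variance estimates, one per observed return.

    Recursion: var_t = lam * var_{t-1} + (1 - lam) * r_t**2.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    # Seeding with the first squared return is a convenience, not a rule.
    var = returns[0] ** 2 if initial_var is None else initial_var
    path = []
    for r in returns:
        var = lam * var + (1.0 - lam) * r * r
        path.append(var)
    return path

# One 12% shock followed by quiet days: the estimate decays geometrically
# instead of dropping out all at once after N days.
shock_path = ewma_variance([0.12] + [0.0] * 5, lam=0.9, initial_var=0.0)
decay_ratios = [b / a for a, b in zip(shock_path, shock_path[1:])]
```

Each quiet day multiplies the variance estimate by λ, so a large return fades smoothly rather than creating a second jump when it leaves a fixed window.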

The GARCH (generalized autoregressive conditional

heteroskedasticity) family of models extend this idea to allow for

mean reversion to the long-term variance. The GARCH(1,1) model

(so-called because it contains only first-order lagged terms) is

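$$\sigma_t^2 = \gamma V + \alpha \sigma_{t-1}^2 + \beta r_t^2 \qquad (3.3)$$

where $r_t$ is the return on day $t$, α, β, and γ sum to 1, and V is the long-term variance (γV is its weighted contribution).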

GARCH is both an insightful improvement on naively assuming

future volatility will be like the past and also wildly overrated as a

predictor. GARCH models capture the essential characteristics of

volatility: volatility tomorrow will probably be close to what it is

today and volatility in the long term will probably be whatever the

historical long-term average has been. Everything in between is

interpolation, and it is in the interpolation that the models in the

family differ. As an example, Figure 3.2 shows the term structure of forecast volatility for SPY on August 1, 2019, using GARCH(1,1)

and GJR-GARCH(1,1), which also accounts for the asymmetry of

positive and negative returns. Both models are estimated from the

previous four years of daily returns using MLE.

From a practical perspective, the difference is negligible. And this

is what has led to the proliferation of GARCH-type models. They

are all roughly the same. No model is clearly better than the

others. In any situation where there are many competing theories

it is a sign that all of the theories are bad. There is one

Schrödinger equation. It works very well. There are thousands of

GARCH variants. None work very well.

In fact, it has been shown that the forecasts from GARCH

generally are no better than a simple EWMA model, and most

professional traders are reluctant to use GARCH. Part of the reluctance is due to the instability of the MLE parameters. These

can change considerably week to week. MLE also requires about a

thousand data points to get a good estimate. This means that if we

are using daily data, our forecast will be using information from

four years ago. This isn't good.

But there is a practical way to combine the robustness of EWMA

and the decay to a long-term mean that GARCH allows. When a

trader uses EWMA, he arbitrarily chooses the decay parameter

instead of fitting to historical data and using MLE. We can do the

same with GARCH. Choose a model, choose the parameters, and

use it consistently. This means that eventually we will develop

intuition by “seeing” the market through the lens of this model.

For indices, choosing α around 0.9 and β between 0.02 and 0.04 seems to work.
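A fixed-parameter GARCH(1,1) can be sketched in a few lines. This assumes the recursion has the form new variance = γV + α × previous variance + β × squared return, with α weighting the lagged variance (consistent with α near 0.9) and γ = 1 − α − β. The function names, the 252-day annualization, and the forecast averaging are illustrative assumptions rather than the author's code.

```python
import math

def garch_update(prev_var, ret, long_run_var, alpha=0.9, beta=0.03):
    """One step of the recursion with hand-picked, not fitted, parameters."""
    gamma = 1.0 - alpha - beta
    return gamma * long_run_var + alpha * prev_var + beta * ret * ret

def forecast_vol(current_var, long_run_var, horizon_days,
                 alpha=0.9, beta=0.03, periods_per_year=252):
    """Annualized volatility forecast over the next `horizon_days`.

    The expected daily variance k days ahead reverts to the long-run
    level at rate (alpha + beta) per day:
        E[var_{t+k}] = V + (alpha + beta)**k * (var_t - V)
    The forecast averages these expected variances over the horizon.
    """
    persistence = alpha + beta
    total = 0.0
    for k in range(horizon_days):
        total += long_run_var + persistence ** k * (current_var - long_run_var)
    return math.sqrt(total / horizon_days * periods_per_year)

# Hypothetical inputs: long-run daily variance of 1e-4 (about 16% annualized)
# and a current variance four times as large.
V = 1e-4
term_structure = [forecast_vol(4 * V, V, n) for n in (5, 30, 90)]
```

With the current variance above its long-run level, the forecast falls as the horizon lengthens and approaches the long-run volatility, which is the interpolation between "today" and "the long-term average" described in the text.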

FIGURE 3.2 Term structure of forecast volatility for SPY using GARCH(1,1) (solid line) and GJR-GARCH (dashed line).

Implied Volatility as a Predictor

Implied volatility can be used to predict future realized volatility if

we account for the variance premium. So a forecast of the 30-day

volatility for the S&P 500 would be given by subtracting the

appropriate variance premium for the current VIX level (refer to

Table 4.3) from the VIX.

Most underlying products do not have a calculated VIX index. The

first way to deal with this is to follow the CBOE's published

methodology and construct a VIX. An easier way is to create a

weighted average of the appropriate ATM volatilities and use that

as a proxy. This methodology was used to create the original VIX

(ticker symbol VXO). VXO and the VIX returns have an 88%

correlation and the average difference between their values is

about 0.5% of the VIX level. This approximation isn't ideal but will

usually be the best there is.

Ensemble Predictions

The volatility market is now mature enough that any time series–

based volatility method will probably not provide forecasts that

are good enough to profit in the option market. A better approach

is to combine a number of different forecasts. This idea of the

usefulness of information aggregation is far from new. One of the

earliest advocates for the “wisdom of crowds” was Sir Francis

Galton. In 1878, he photographically combined many different

portraits to show that “all the portraits are better looking than

their components because the averaged portrait of many persons

is free from the irregularities that variously blemish the look of

each of them.” His experiment has been repeated and his

conclusions validated numerous times using more advanced

equipment.

An aggregate forecast can be better than any of the components

that make it up. This can be demonstrated with a simple example.

Imagine that we ask 100 people the multiple-choice question,

“What is the capital of Italy?” with the possibilities being Rome,

Milan, Turin, and Venice. Twenty of the group are sure about the

correct answer (Rome). The remaining 80 just guess so their

choices are equally divided among all the choices, which get 20

votes each. So, Rome receives 40 votes (the 20 people who knew

and 20 votes from guesses) and the other cities get 20 votes each.

Even though only a small proportion of the people had genuine knowledge, the signal was enough to easily swamp the noise from the guessers.
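The same poll can be simulated with guesses that are actually random rather than split exactly evenly. This is a minimal sketch; the function names, the seed, and the number of trials are arbitrary choices for illustration.

```python
import random
from collections import Counter

CITIES = ("Rome", "Milan", "Turin", "Venice")

def run_poll(rng, n_informed=20, n_guessers=80, truth="Rome"):
    """Informed voters pick the truth; everyone else picks uniformly at random."""
    votes = Counter()
    votes[truth] += n_informed
    for _ in range(n_guessers):
        votes[rng.choice(CITIES)] += 1
    return votes

def truth_win_rate(n_trials=1000, seed=0, truth="Rome"):
    """Fraction of simulated polls in which the truth is the strict plurality winner."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        votes = run_poll(rng, truth=truth)
        runner_up = max((c for city, c in votes.items() if city != truth), default=0)
        if votes[truth] > runner_up:
            wins += 1
    return wins / n_trials

print(f"Rome wins in {truth_win_rate():.1%} of simulated polls")
```

Because the guessers' errors are spread across three wrong answers, the 20 informed votes are almost always enough to decide the poll.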

This example also shows that for forecast combinations to be most

useful they need to contain diverse information. We need the

people who are wrong to be uncorrelated sources of noise. That

isn't the case with volatility time series models. Most models will

have very high correlation with each other. However, simply

averaging the predictions from a number of simple models will

still improve predictions. I used five volatility models to predict

subsequent 30-day S&P 500 volatility from 1990 to the end of

2018. Table 3.1 shows the summary statistics for each model and

also for a simple average.

The average error of the combined forecast is beaten only by that of the simple 30-day historical volatility (the least sophisticated model), but the combined forecast wins when we consider the dispersion of results. Interestingly, averaging the 0.9 and 0.95 EWMA models also leads to a slight improvement. This is shown in Table 3.2.

Even very similar models can be usefully averaged. This is

probably the best way to apply this concept. Average over every

GARCH model possible, a wide range of time scales, and a wide

range of parameters. Ideally, the models that are averaged would

be based on totally different ideas or data, but with volatility this

won't happen.
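Averaging forecasts and scoring them takes very little code. The sketch below assumes that the error is forecast minus realized volatility and that R-squared is the squared correlation between forecast and realized values; the text does not state either convention. The data are a toy example with perfectly offsetting errors, chosen only to make the arithmetic visible.

```python
import statistics

def ensemble(forecasts):
    """Equal-weight average across models.

    `forecasts` is a list of equally long lists, one per model.
    """
    return [sum(vals) / len(vals) for vals in zip(*forecasts)]

def error_summary(forecast, realized):
    """Summary statistics of forecast errors (forecast minus realized)."""
    errors = [f - r for f, r in zip(forecast, realized)]
    deciles = statistics.quantiles(errors, n=10)  # nine cut points
    mean_f = statistics.fmean(forecast)
    mean_r = statistics.fmean(realized)
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(forecast, realized))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_r = sum((r - mean_r) ** 2 for r in realized)
    return {
        "mean_error": statistics.fmean(errors),
        "sd_error": statistics.stdev(errors),
        "p10": deciles[0],
        "p90": deciles[-1],
        "r_squared": cov * cov / (var_f * var_r),
    }

# Toy data: one model always one point too high, one always one point too low.
realized = [10.0, 12.0, 14.0, 16.0, 18.0]
model_high = [r + 1.0 for r in realized]
model_low = [r - 1.0 for r in realized]
combined = ensemble([model_high, model_low])

summary = error_summary(combined, realized)
```

Real volatility models have highly correlated errors, so the gain from averaging them is far smaller than in this toy case.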

TABLE 3.1 Thirty-Day Volatility Forecasts for the S&P 500 from 1990 to the End of 2018

                                    Average   30-Day Historical Volatility   EWMA (λ = 0.9)   EWMA (λ = 0.95)   VIX     GARCH(1,1)
Average Error (volatility points)   0.27      −0.10                          −0.92            −0.76             0.30    0.44
SD of Error                         5.2       6.0                            5.8              5.9               5.2     5.9
10th Percentile                     −5.1      −5.8                           −7.3             6.6               −6.4    −6.3
90th Percentile                     5.2       5.8                            4.1              5.0               8.9     5.3
R-Squared                           0.65      0.62                           0.62             0.60              0.64    0.60

TABLE 3.2 Thirty-Day Volatility EWMA Forecasts for the S&P 500 from 1990 to the End of 2018

                                    Average   EWMA (λ = 0.9)   EWMA (λ = 0.95)
Average Error (volatility points)   −1.1      −1.5             −0.76
SD of Error                         5.7       5.8              5.9
10th Percentile                     −6.9      −7.3             6.6
90th Percentile                     4.5       4.1              5.0
R-Squared                           0.61      0.62             0.60

Conclusion

Realized volatility is reasonably forecastable for a financial time

series. Unfortunately, this means that it is hard to make a good

forecast that differs significantly from the market's consensus.

However, volatility predictions are essential even when they are

not the basis for finding edge. In particular, any sensible sizing

scheme will need a prediction of future volatility.

Summary

All trading strategies can be categorized as either model driven

or based on special situations. Each type has weaknesses and

strengths.

An ensemble prediction of volatility will usually outperform

time series methods.

CHAPTER 4

The Variance Premium

In finance, everything that is agreeable is unsound and everything that is

sound is disagreeable.

—Winston Churchill

The variance premium (also known as the volatility premium) is the tendency

for implied volatility to be higher than subsequently realized volatility.

This is not a recent phenomenon. In his 1906 book The Put and Call, Leonard Higgins writes how traders on the London Stock Exchange first determine a

statistical fair value for options, then “add to the ‘average value’ of the put and call an amount which will give a fair margin of profit.” That is, a variance

premium was added.

The variance premium exists in equity indices, the VIX, bonds, commodities,

currencies, and many stocks. It is probably the most important factor to be

aware of when trading options. Even traders who are not trying to directly

monetize the effect need to know of it and understand it. It is the tide that long option positions need to overcome to be profitable. Even traders who only use

options to trade directionally need to take this into account. Even if directional predictions are correct, it is very hard to make money if one is consistently

paying too much for options (see Chapters Six and Seven for more discussion of

this point).

This effect can be monetized in many ways. The size and persistence of the variance premium are so strong that the precise details of a strategy often aren't very important. Practically any strategy that sells implied volatility has a

significant head start on being profitable if the premium is there.

In this chapter we will discuss the characteristics of the variance premium in

various products; look at the relationships among the variance premium,

correlation, and skewness; and give some possible reasons for the existence of

the effect.

Aside: The Implied Variance Premium

The variance premium refers to the difference between implied volatility, which can be defined by either BSM implied volatilities or variance swaps, and

subsequent realized volatility. There is a related phenomenon that occurs

entirely in the implied space. Being short VIX futures is generally a profitable strategy (although not a wildly successful one). Figure 4.1 shows the results of always being short the VIX front-month future from June 2015 to October 2019.

The VIX itself doesn't decay in the same way (refer to Figure 4.2). This is really a

term-structure effect in the futures.

FIGURE 4.1 Profit from selling 1 front-month VIX future.

FIGURE 4.2 The VIX index from June 2015 to October 2019.

According to the rational expectations hypothesis, the VIX futures curve should

be an unbiased predictor of where the VIX index will be on the expiration date.

The narrowing of the basis as time approaches the expiration date should be

more dependent on the cash index moving toward the future's price.

The theory of rational expectations has been tested on many different

commodity futures and it is generally a poor description of price movements.

Futures tend to move toward the cash. Alternatively, the cash VIX is a better

predictor of future VIX levels than the futures are.

It is probably not surprising that this also occurs in the VIX. VIX futures are

unusual. Generally, futures are priced by first assuming that they are forwards, then constructing an arbitrage-free portfolio of the underlying and the future.

However, the VIX index cannot be traded so this method is not useful for

pricing VIX futures. Given that VIX futures are not constrained by tight, no-

arbitrage bounds, there is even more room for inefficiencies.

On its own this doesn't mean short positions have to be profitable. But the VIX term structure is usually in contango (from the time VIX futures were listed in 2006 to the start of 2019, the term structure has been in contango 81% of the time). This means that the futures are above the cash and tend to decline toward it. The best discussion of this effect is in Simon and Campasano (2014). Selling a

future only when the previous day's prices were in contango considerably

improves this strategy. Figure 4.3 shows the results of being short the VIX

front-month future from June 2015 to October 2019, when the term structure is

in contango.

FIGURE 4.3 Profit from selling 1 front-month VIX future when the term structure is in contango.
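The contango filter can be sketched as a simple rule on the previous day's closes. This is an illustration under stated assumptions: contango is taken to mean the front-month future closing above the cash VIX, contract rolls and transaction costs are ignored, the $1,000 multiplier is passed in as a parameter, and the prices are made up. It is not the procedure used to produce the figures.

```python
def short_in_contango_pnl(spot, future, multiplier=1000.0):
    """Daily P&L of being short one future only after a contango close.

    spot[t] and future[t] are the closing cash VIX and front-month future
    on day t. The position for day t is decided using day t-1 prices only.
    """
    pnl = []
    for t in range(1, len(future)):
        in_contango = future[t - 1] > spot[t - 1]
        change = future[t] - future[t - 1]
        # A short position gains when the future falls.
        pnl.append(-change * multiplier if in_contango else 0.0)
    return pnl

# Hypothetical closes, for illustration only.
spot = [15.0, 15.0, 20.0, 18.0]
future = [16.0, 15.5, 19.0, 18.5]
daily_pnl = short_in_contango_pnl(spot, future)
```

Using the previous day's prices keeps the signal free of look-ahead: on the third day the future closed below the cash, so no position is held on the fourth.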

Variance Premium in Equity Indices

Figure 4.4 shows the VIX and the subsequent 30-day realized volatility of the S&P 500 from 1990 to the end of 2018.

On average the VIX was four volatility points higher than the realized volatility and the premium is positive 85% of the time. Figure 4.5 shows the premium in

volatility points. Figure 4.6 shows the distribution of the daily premia and Table

4.1 gives the summary statistics.
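The premium series itself is straightforward to compute from an implied volatility index and daily returns. The sketch below assumes a 21-trading-day window as a stand-in for 30 calendar days, a zero-mean realized volatility estimator, and 252-day annualization; none of these choices is specified in the text, and the inputs are synthetic.

```python
import math

def realized_vol_points(returns, periods_per_year=252):
    """Annualized volatility in percentage points, zero-mean estimator."""
    mean_sq = sum(r * r for r in returns) / len(returns)
    return 100.0 * math.sqrt(mean_sq * periods_per_year)

def variance_premium(vix, returns, window=21):
    """premium[t] = vix[t] minus realized volatility over the next `window` days.

    returns[t] is the index return on day t, so the window for day t
    starts at day t + 1.
    """
    premia = []
    for t in range(len(vix) - window):
        future_returns = returns[t + 1:t + 1 + window]
        premia.append(vix[t] - realized_vol_points(future_returns))
    return premia

def summarize(premia):
    """Mean premium and the share of days on which it was positive."""
    return {
        "mean": sum(premia) / len(premia),
        "share_positive": sum(p > 0 for p in premia) / len(premia),
    }

# Synthetic inputs: the index moves 1% a day and the VIX sits at 20.
returns = [0.01 if i % 2 == 0 else -0.01 for i in range(60)]
vix = [20.0] * 60
premia = variance_premium(vix, returns)
stats = summarize(premia)
```

The last `window` days of the sample have no premium because their subsequent realized volatility is not yet known.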

FIGURE 4.4 The VIX and the subsequent 30-day realized S&P 500 volatility.

FIGURE 4.5 The S&P 500 variance premium (VIX minus realized volatility).

FIGURE 4.6 The S&P 500 variance premium distribution.

TABLE 4.1 Summary Statistics for the S&P 500 Variance Premium

Mean                 4.08
Standard deviation   5.96
Skewness             −2.33
Maximum              31.21
Minimum              −53.34
Median               4.63
90th percentile      9.62
10th percentile      −1.45

The Dow Jones 30, NASDAQ 100, and Russell 2000 indices have similar

variance premia. The summary statistics for these are given in Table 4.2.

TABLE 4.2 Summary Statistics for the Dow Jones, NASDAQ 100, and Russell 2000 Variance Premia

Index                Dow Jones (from 1998)   NASDAQ 100 (from 2001)   Russell 2000 (from 2004)
Mean                 3.50                    3.41                     3.24
Standard deviation   6.18                    6.99                     6.58
Skewness
