Cointegration is used in Statistical Arbitrage to find the best pairs of stocks for Pair Trading: going long one stock and short another (typically a competitive peer) to generate returns. Statistical Arbitrage (StatArb) is all about mean reversion: looking for deviations in the spread and expecting the spread to revert to its mean.
So What's the Problem with Correlation?
People often use correlation in pair trading to identify highly correlated pairs and then expect mean reversion from the spread. But look at the following example, where X and Y are random time series that move in the same direction and are highly correlated, yet keep diverging. Can we really do pair trading on such a pair when the spread shows no mean reversion?
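The point can be reproduced with a few lines of Python. This is only a sketch with made-up data: the drift rates (0.1 and 0.2 per step), the noise level and the helper name pearson are assumptions chosen for illustration, not values from any real stock pair.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

random.seed(42)
n = 500
# Both series trend upward, but Y drifts twice as fast as X (assumed rates).
x = [100 + 0.1 * t + random.gauss(0, 1) for t in range(n)]
y = [100 + 0.2 * t + random.gauss(0, 1) for t in range(n)]
spread = [b - a for a, b in zip(x, y)]

corr = pearson(x, y)
print("correlation:", round(corr, 3))
print("spread at start:", round(spread[0], 2), "spread at end:", round(spread[-1], 2))
```

The correlation comes out close to 1, yet the spread keeps widening from start to end, so there is nothing to revert to.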
What is Co-Integration?
Co-Integration helps identify stock pairs whose spread is likely to revert to its mean. It looks for pairs whose spread is stationary, that is, whose mean stays fixed over time. Whenever the spread deviates from that mean it creates a trading opportunity, because the spread is likely to revert back to the mean.
Let me explain with a funny example that illustrates Co-Integration better. “A drunken man is walking down the road with his dog on a leash tied to his hand. Because the man is drunk, he is expected to walk randomly, and the dog is also expected to walk randomly (assume a small puppy 🙂). The maximum distance between them is the length of the leash, which is fixed. Whenever the distance (spread) between the drunken man and the dog gets close to that maximum, we can expect it to revert toward the mean.” In simple words, the drunken man and the dog are Co-Integrated.
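The drunkard and his dog can be simulated directly. The sketch below is a toy model under stated assumptions: both take independent Gaussian steps in one dimension, and the leash is modelled by simply clamping the dog to within leash units of the man. The function name walk_drunk_and_dog and all numbers are hypothetical.

```python
import random

def walk_drunk_and_dog(steps, leash, seed=0):
    """Simulate two random walkers whose distance is capped by a leash."""
    rng = random.Random(seed)
    man, dog = 0.0, 0.0
    man_path, dog_path = [], []
    for _ in range(steps):
        man += rng.gauss(0, 1)  # the man staggers one random step
        dog += rng.gauss(0, 1)  # the dog wanders one random step too
        # The leash pulls the dog back if it strays further than `leash`.
        dog = max(man - leash, min(man + leash, dog))
        man_path.append(man)
        dog_path.append(dog)
    return man_path, dog_path

man_path, dog_path = walk_drunk_and_dog(steps=1000, leash=3.0)
distances = [abs(m - d) for m, d in zip(man_path, dog_path)]
print("largest distance between man and dog:", round(max(distances), 3))
print("where the man ended up:", round(man_path[-1], 2))
```

Each path on its own is free to wander anywhere, but the distance between the two never exceeds the leash length. That bounded distance plays the role of the spread in a cointegrated pair.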
If two stocks are highly correlated, both will move in the same direction most of the time. However, the magnitude of the moves is unknown, and the spread can keep widening indefinitely, as in the X and Y example above. Co-Integration, by contrast, looks for mean reversion in the spread (distance), which is what makes the spread tradeable. The Augmented Dickey-Fuller test is generally used to determine, with a given level of confidence, whether the spread between two stocks or time series is stationary, and hence whether the pair is cointegrated.
Augmented Dickey-Fuller (ADF) Test
The Augmented Dickey-Fuller test is a hypothesis test whose null hypothesis is that a signal contains a unit root; we want to reject this hypothesis. The test gives a p-value: the lower this number, the more confident we can be that we have found a stationary signal. p-values below 0.05 are considered to indicate good mean-reverting stock pairs. Some practitioners also work with p-values below 0.1. Pairs with p-values above 0.1 are likely to be non-stationary, and trading them is not advisable.
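To make the unit-root idea concrete, here is a simplified, non-augmented Dickey-Fuller regression in plain Python. It is a sketch, not the full ADF test: it fits diff[t] = alpha + beta * series[t-1] by least squares and returns the t-statistic of beta, without the lagged-difference terms and p-value that a statistics package adds. The helper name dickey_fuller_t, the synthetic AR(1) spread, and the use of about -2.86 as the large-sample 5% critical value (constant, no trend) are assumptions of this example.

```python
import random

def dickey_fuller_t(series):
    """t-statistic of beta in: diff[t] = alpha + beta * series[t-1] + error."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    m = len(diffs)
    mean_x = sum(lagged) / m
    mean_y = sum(diffs) / m
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # Residual sum of squares and the standard error of beta.
    ssr = sum((y - alpha - beta * x) ** 2 for x, y in zip(lagged, diffs))
    se_beta = (ssr / (m - 2) / sxx) ** 0.5
    return beta / se_beta

random.seed(1)
# Synthetic mean-reverting spread: AR(1) with coefficient 0.5 (assumed).
spread = [0.0]
for _ in range(499):
    spread.append(0.5 * spread[-1] + random.gauss(0, 1))

CRITICAL_5PCT = -2.86  # approximate large-sample value, constant-only case
t_stat = dickey_fuller_t(spread)
print("t-statistic:", round(t_stat, 2))
print("unit root rejected at 5%:", t_stat < CRITICAL_5PCT)
```

A strongly negative t-statistic means the spread is pulled back toward its mean after each move, which is the behaviour the low p-value is reporting.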
The image above shows the Cointegration and Correlation dashboard for Sun Pharma and Cipla futures from Dec 2014 to date. The p-value is 0.05 (highly Co-Integrated) and the correlation is also high (0.834), so this is possibly a good pair to watch for long-term mean reversion in the spread.
The second example shows Infy and TCS hourly futures charts with high Co-Integration (0.05) and high correlation (0.843), possibly a good pair to watch for short-term mean reversion in the spread.
Computing Co-Integration in Amibroker
Since Co-Integration is a statistical model, it is relatively difficult to code in the AFL programming language. Instead we rely on Amibroker with a Python COM server and statistical computing Python packages: numpy (to handle arrays), pandas (to handle time series data) and statsmodels (to do the ADF test). The close arrays of the two stocks are passed from Amibroker, the Co-Integration is computed in Python, and the result is returned to Amibroker.
If you are not sure how to install and configure Python and its statistical packages such as numpy, statsmodels and pandas, go through the video tutorial here, which explains how to install the Python library zipline, a backtesting package, from scratch.
Steps to follow in Amibroker
1) Download the CoIntegration-AFL Set and unzip it.
2) Copy the file coint.py to the \\python2.7\\bin\\ folder, then run it from your command prompt with the command python coint.py, as shown below.
3) Copy the PyCoint.afl file and paste it into the \\Amibroker\\Formulas\\Basic Charts folder.
4) Open a new blank chart and apply PyCoint.afl to it. Now right-click on the chart, go to Parameters and enter the two symbols for which you want to compute Co-Integration. You should see the Correlation (22 period) and Co-Integration values displayed in a dashboard, which you can use for further Statistical Arbitrage (Pair Trading) analysis.
Note: Co-Integration is calculated from the data currently visible in the chart. If you zoom in or out, or change the timeframe, the Co-Integration value is calculated only for the prices displayed on the screen.