# Compute Cointegration using NSEpy and Pandas

Here is a simple example that computes the cointegration between a pair of stocks using the Python libraries NSEpy, pandas, statsmodels, and matplotlib.

Cointegration is used in statistical arbitrage (StatArb) to find suitable pairs of stocks for pair trading: going long one stock and short another, typically a competing peer, to generate returns. StatArb is all about mean reversion: it looks for deviations in the spread between the two stocks and expects the spread to revert to its mean.
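To make the mean-reversion idea concrete, here is a minimal sketch on synthetic data (not NSE prices). It assumes two hypothetical series, `s1` and `s2`, that share one random-walk trend plus independent noise; the names, levels, and noise sizes are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

n = 1000
# Shared random-walk trend: this is what makes the two series wander together.
trend = np.cumsum(rng.normal(0.0, 1.0, n))

# Two hypothetical price series: same trend, different level, independent noise.
s1 = 100.0 + trend + rng.normal(0.0, 1.0, n)
s2 = 120.0 + trend + rng.normal(0.0, 1.0, n)

# The spread cancels the shared trend and fluctuates around a fixed mean (about 20).
spread = s2 - s1

print("std of s1:    ", round(float(s1.std()), 2))
print("std of spread:", round(float(spread.std()), 2))
```

Each series drifts with the trend, while the spread stays in a narrow band around its mean. That bounded, mean-reverting spread is the behaviour a cointegration test looks for.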

NSEpy – fetches historical data from nseindia.com
Pandas – Python library to handle time series data
statsmodels – Python library for statistical operations such as cointegration tests
Matplotlib – Python library to handle 2D chart plotting

NSEpy is an open-source Python library for collecting historical data from the National Stock Exchange (NSE) of India. The library provides easy-to-use functions for accessing data related to stocks, indices, and derivatives traded on the NSE.

The library uses web scraping to retrieve data from the NSE website and returns it as a pandas DataFrame, which can be easily manipulated and analyzed using Python. The data can be retrieved for various time intervals, ranging from daily to monthly, yearly, or customized intervals.

# Importing the Libraries, Fetching Data from NSE and Computing Cointegration

``````
import numpy as np
import pandas as pd

import statsmodels
from statsmodels.tsa.stattools import coint

import matplotlib.pyplot as plt
import nsepy
from datetime import date

# Historical data for State Bank of India (SBIN)
S1 = nsepy.get_history(symbol='SBIN',
                       start=date(2015, 1, 1),
                       end=date(2015, 10, 10))

# Historical data for ICICI Bank (ICICIBANK) over the same period
S2 = nsepy.get_history(symbol='ICICIBANK',
                       start=date(2015, 1, 1),
                       end=date(2015, 10, 10))

# coint returns (test statistic, p-value, critical values)
result = coint(S1[['Close']], S2[['Close']])
score = result[0]
pvalue = result[1]
``````
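`coint` needs two series of equal length, observed on the same dates. If the two downloads ever differ (for example, one symbol is missing a session), the closes can be aligned first. The sketch below uses small hypothetical DataFrames in place of the `get_history` results; the prices and dates are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the DataFrames returned by get_history (made-up prices).
dates1 = pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-05", "2015-01-06"])
dates2 = pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-06"])
df1 = pd.DataFrame({"Close": [310.0, 312.5, 309.0, 311.0]}, index=dates1)
df2 = pd.DataFrame({"Close": [350.0, 353.0, 351.5]}, index=dates2)

# Put both closes side by side and keep only the dates present in both frames.
closes = pd.concat([df1["Close"], df2["Close"]], axis=1, keys=["S1", "S2"]).dropna()

print(closes)
```

The aligned columns `closes["S1"]` and `closes["S2"]` could then be passed to `coint` in place of the raw frames.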

# Plot State Bank of India Dataframe

``S1[['Close']].plot()``

# Plot ICICI Bank Dataframe

``S2[['Close']].plot()``

## Calculate the p-value

``````
score, pvalue, _ = coint(S1[['Close']], S2[['Close']])
pvalue
``````

Output

``0.0052518039905594``
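The null hypothesis of the `coint` test is that the two series are not cointegrated, so a small p-value is evidence in favour of cointegration. A minimal sketch of that decision follows, assuming the conventional 5% significance level; the threshold and the helper name `is_cointegrated` are illustrative choices, not part of statsmodels.

```python
def is_cointegrated(pvalue, alpha=0.05):
    """Return True when the no-cointegration null is rejected at level alpha.

    alpha=0.05 is an assumed, conventional threshold; pick one to suit your use.
    """
    return pvalue < alpha


pvalue = 0.0052518039905594  # the value reported above for SBIN vs ICICIBANK
print(is_cointegrated(pvalue))  # True, since 0.00525 < 0.05
```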

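Next, compute the spread as the difference between the two closing prices (S2 - S1) and plot it:

``````
diff_series = S2[['Close']] - S1[['Close']]
diff_series.plot()
``````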

## Calculate and Plot the Z-score with +/- 1 SD levels

``````
def zscore(series):
    # Standardize: subtract the mean, then divide by the standard deviation
    return (series - series.mean()) / np.std(series)

zscore(diff_series).plot()
#plt.axhline(zscore(diff_series).mean(), color='black')
plt.axhline(1.0, color='red', linestyle='--')     # +1 SD level
plt.axhline(-1.0, color='green', linestyle='--')  # -1 SD level
``````
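As a quick sanity check of the z-score formula, here is a small worked example on made-up numbers rather than the actual spread. The values are purely illustrative.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up numbers, not market data

# Same formula as zscore(): subtract the mean, divide by the standard deviation.
z = (values - values.mean()) / np.std(values)

print(np.round(z, 3))
```

The mean is 3 and the standard deviation is the square root of 2, so the z-scores are roughly -1.414, -0.707, 0, 0.707, and 1.414. By construction they have mean 0 and standard deviation 1, which is why fixed levels such as +/- 1.0 are meaningful on the plot.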

# Simple Strategy using Co-Integration

Go “Long” the spread whenever the z-score is below -1.0
Go “Short” the spread when the z-score is above 1.0
Exit positions when the z-score approaches zero

Since the spread was defined above as S2 - S1 (ICICIBANK minus SBIN), going “Long” the spread means “Buy 1 share of S2 and sell short 1 share of S1” (and vice versa when going “Short” the spread).
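The three rules above can be sketched as a small state machine over a z-score sequence. This is only an illustration under stated assumptions: the function name `spread_positions` is hypothetical, and "approaches zero" is interpreted here as an absolute z-score below an assumed `exit_band` of 0.1.

```python
def spread_positions(zscores, entry=1.0, exit_band=0.1):
    """Map z-scores to positions: +1 long the spread, -1 short the spread, 0 flat.

    entry matches the +/- 1.0 levels in the text; exit_band is an assumed
    tolerance for "z-score approaches zero", not a value from the article.
    """
    positions = []
    position = 0
    for z in zscores:
        if position == 0:
            if z < -entry:
                position = 1       # spread unusually low: go long the spread
            elif z > entry:
                position = -1      # spread unusually high: go short the spread
        elif abs(z) < exit_band:
            position = 0           # z-score back near zero: exit
        positions.append(position)
    return positions


sample = [0.2, -1.3, -0.8, -0.05, 0.6, 1.4, 0.9, 0.0]  # made-up z-scores
print(spread_positions(sample))  # [0, 1, 1, 0, 0, -1, -1, 0]
```

With real data, the input would be the output of `zscore(diff_series)`. The sketch ignores transaction costs, position sizing, and look-ahead bias from computing the z-score over the full sample.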


