In finance, a correlation matrix is a table of pairwise correlation coefficients between variables, such as the prices of different stocks. It is a useful tool for analyzing how stocks or other financial instruments move relative to one another. In this post, we will build a correlation matrix using Python with Pandas and Seaborn.
To begin, we will use the yfinance library to download historical stock price data for a set of tickers. We will then extract the closing prices from this data and compute the correlation matrix using Pandas. Finally, we will use Seaborn to visualize the correlation matrix as a heatmap.
Step 1: Download Historical Stock Price Data
First, we need to download historical stock price data for a set of tickers. We will use the yfinance library, which provides a simple way to download historical stock price data from Yahoo Finance.
We start by importing the libraries we need, including yfinance, and defining the list of tickers we want to analyze. For this example, we will analyze 10 tickers. Next, we use the yf.download() function to download the historical stock price data for these tickers. We set the period parameter to '20d' to download the last 20 days of historical data for all the ticker symbols, and group_by='ticker' so that the columns of the result are grouped by ticker.
import yfinance as yf
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# define the list of tickers
tickers = ['RELIANCE.NS', 'SBIN.NS', 'TCS.NS', 'INFY.NS', 'HDFCBANK.NS', 'HDFC.NS', 'MARUTI.NS', 'TATASTEEL.NS', 'ONGC.NS', 'ITC.NS']
# download historical stock price data from Yahoo Finance
data = yf.download(tickers, period='20d', group_by='ticker')
Step 2: Extract Closing Prices
Next, we need to extract the closing prices from the historical stock price data. We will do this by creating a new DataFrame that contains only the closing prices for each ticker.
# extract the closing prices
close_prices = pd.DataFrame()
for ticker in tickers:
    close_prices[ticker] = data[ticker]['Close']
This creates a new DataFrame called close_prices that contains only the closing prices for each ticker. The DataFrame has one column for each ticker and one row for each date.
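The loop above can also be written as a single column selection. The sketch below assumes the downloaded frame has two-level columns of the form (ticker, field), which is the layout group_by='ticker' is expected to produce for multiple tickers. The synthetic data frame, its tickers, and its values are made up for illustration so the example runs without a network call.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the yf.download(..., group_by='ticker') result:
# columns are a two-level index of (ticker, price field).
tickers = ['AAA', 'BBB', 'CCC']  # made-up tickers
fields = ['Open', 'High', 'Low', 'Close']
dates = pd.date_range('2024-01-01', periods=5, freq='B')
columns = pd.MultiIndex.from_product([tickers, fields])

rng = np.random.default_rng(0)
data = pd.DataFrame(rng.uniform(90, 110, size=(len(dates), len(columns))),
                    index=dates, columns=columns)

# Select the 'Close' field from the second column level for every ticker.
close_prices = data.xs('Close', axis=1, level=1)

# Keep the columns in the same order as the ticker list.
close_prices = close_prices[tickers]
```

The result has the same shape as the DataFrame built by the loop: one column per ticker and one row per date.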
Step 3: Compute the Correlation Matrix
Now that we have the closing prices for each ticker, we can compute the correlation matrix using Pandas. Pandas provides a corr() method that computes the pairwise correlation between the columns of a DataFrame, using the Pearson coefficient by default.
# compute the correlation matrix
corr_matrix = close_prices.corr()
This computes the correlation matrix for the close_prices DataFrame. The resulting corr_matrix DataFrame contains the correlation coefficient for each pair of tickers.
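Price levels of trending stocks can appear correlated simply because they drift in the same direction, so a common variation is to correlate daily percentage returns instead of prices. This is a swapped-in technique, not the method used in this post. A minimal sketch, assuming a small made-up close_prices frame (the tickers and values are illustrative only):

```python
import numpy as np
import pandas as pd

# Made-up closing prices for three hypothetical tickers.
close_prices = pd.DataFrame({
    'AAA': [100.0, 102.0, 101.0, 105.0, 107.0],
    'BBB': [50.0, 51.0, 50.5, 52.5, 53.5],
    'CCC': [200.0, 198.0, 201.0, 199.0, 202.0],
})

# Daily percentage returns; the first row has no previous day, so drop it.
returns = close_prices.pct_change().dropna()

# Pairwise Pearson correlation of the returns (Pearson is the default).
returns_corr = returns.corr()

# Cross-check one entry against NumPy's Pearson implementation.
manual = np.corrcoef(returns['AAA'], returns['BBB'])[0, 1]
```

The returns_corr frame can be passed to the heatmap code in place of corr_matrix if return-based correlations are preferred.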
Step 4: Visualize the Correlation Matrix as a Heatmap
Finally, we can visualize the correlation matrix as a heatmap using Seaborn. Seaborn provides a heatmap() function that creates a heatmap of a matrix.
# create a heatmap of the correlation matrix using Seaborn
sns.set(style='white')
fig, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(corr_matrix, annot=True, cmap='RdYlGn', vmin=-1, vmax=1, ax=ax)
# set the axis labels and title
ax.set_xlabel('Stock Tickers')
ax.set_ylabel('Stock Tickers')
ax.set_title('10x10 Correlation Matrix of Stock Prices (20-day period)')
# display the plot
plt.show()
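Because a correlation matrix is symmetric, the upper triangle of the heatmap repeats the lower one. One possible refinement is to hide the duplicate half with the mask argument of sns.heatmap(). The sketch below uses a made-up 3x3 matrix; lower_triangle_mask and plot_lower_triangle are hypothetical helper names, not part of Seaborn.

```python
import numpy as np
import pandas as pd


def lower_triangle_mask(corr):
    """Return a boolean array that is True for cells to hide
    (everything strictly above the diagonal)."""
    return np.triu(np.ones(corr.shape, dtype=bool), k=1)


def plot_lower_triangle(corr):
    """Draw the heatmap with the upper triangle hidden."""
    # Plotting libraries are imported here so the mask helper
    # can be used on its own.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(6, 6))
    sns.heatmap(corr, mask=lower_triangle_mask(corr), annot=True,
                cmap='RdYlGn', vmin=-1, vmax=1, ax=ax)
    return ax


# Made-up correlation matrix for three hypothetical tickers.
labels = ['AAA', 'BBB', 'CCC']
corr_matrix = pd.DataFrame([[1.0, 0.8, -0.2],
                           [0.8, 1.0, 0.1],
                           [-0.2, 0.1, 1.0]],
                          index=labels, columns=labels)

mask = lower_triangle_mask(corr_matrix)
```

Calling plot_lower_triangle(corr_matrix) followed by plt.show() would display the masked heatmap. The diagonal is kept visible here; passing k=0 to np.triu would hide it as well.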
Here is the complete Python code for the correlation matrix:
import yfinance as yf
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# define the list of tickers
tickers = ['RELIANCE.NS', 'SBIN.NS', 'TCS.NS', 'INFY.NS', 'HDFCBANK.NS', 'HDFC.NS', 'MARUTI.NS', 'TATASTEEL.NS', 'ONGC.NS', 'ITC.NS']
# download historical stock price data from Yahoo Finance
data = yf.download(tickers, period='20d', group_by='ticker')
# extract the closing prices
close_prices = pd.DataFrame()
for ticker in tickers:
    close_prices[ticker] = data[ticker]['Close']
# compute the correlation matrix
corr_matrix = close_prices.corr()
# create a heatmap of the correlation matrix using Seaborn
sns.set(style='white')
fig, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(corr_matrix, annot=True, cmap='RdYlGn', vmin=-1, vmax=1, ax=ax)
# set the axis labels and title
ax.set_xlabel('Stock Tickers')
ax.set_ylabel('Stock Tickers')
ax.set_title('10x10 Correlation Matrix of Stock Prices (20-day period)')
# display the plot
plt.show()