In the first part of this pandas tutorial for traders, we explored creating and manipulating Pandas DataFrames using the yfinance
library to fetch stock data from Yahoo Finance. In this next part, we will delve into another common data source for traders: CSV files containing stock market data. We’ll cover a range of useful Pandas concepts such as resampling, checking for null values, writing DataFrames back to CSV files, and more. By the end of this section, you will be well-equipped to effectively work with stock market data in Pandas, and expand your knowledge of data analysis for trading purposes.
Fetching Data from CSV
To fetch data from a CSV file, we can use the Pandas read_csv() function. Let’s consider the following CSV file named “NIFTY_EOD.csv” which contains the historical stock quotes for Nifty Index.
Sample CSV File Format:
Ticker,Date/Time,Open,High,Low,Close,Volume,PClose,lreg5,lreg7,lreg9,hma5,hma7,hma9,trsi,atr
NIFTY,04-04-2022,17809.1,18114.65,17791.4,18053.4,345526496,17670.45,17883.38,17842.97,17748.73,17933.65,17881.86,17766.89,89.34,281.74
NIFTY,05-04-2022,18080.6,18095.45,17921.55,17957.4,283453600,18053.4,18019.11,18002.16,17935.31,18133.19,18087.98,17965.47,92.37,270.95
NIFTY,06-04-2022,17842.75,17901,17779.85,17807.65,328824416,17957.4,17992.02,18000.8,17998.28,17989.82,18068.39,18054.13,73.2,261.61
NIFTY,07-04-2022,17723.3,17787.5,17623.7,17639.55,308840224,17807.65,17806.38,17898.18,17954.51,17721.23,17856.78,17987.16,42.1,253.85
NIFTY,08-04-2022,17698.15,17842.75,17600.55,17784.35,274447424,17639.55,17684.99,17842.1,17907.89,17649.76,17714.44,17865.56,23,252.68
We can fetch this data into a DataFrame as follows
import pandas as pd
# Read the CSV file into a DataFrame
df = pd.read_csv('NIFTY_EOD.csv')
# Display the recent 10 rows of the DataFrame
df.tail(10)
Resampling Data for Multi Timeframe Analysis
In financial analysis, it’s often useful to resample data to a different time frequency. For example, we might want to convert daily data into monthly data to get a better understanding of the long-term trends. We can use the resample() function to achieve this.
Let’s consider an example where we want to resample our daily data into monthly data. We can do this as follows:
# Convert 'Date/Time' column to a datetime object
df['Date/Time'] = pd.to_datetime(df['Date/Time'], format='%d-%m-%Y')
# Set the 'Date/Time' column as the index
df.set_index('Date/Time', inplace=True)
# Resample the data to monthly frequency
df_monthly = df.resample('M').mean()
# Display the recent 10 rows of the resampled DataFrame fro monthly timeframe
df_monthly.tail(10)
Output:
Calculating the Shape of a DataFrame
To calculate the shape of a DataFrame, you can use the shape attribute. The shape attribute returns a tuple representing the dimensions of the DataFrame.
# Calculate the shape of the DataFrame
print(df.shape)
Output
(6750, 15)
List the columns in the dataframe
To list the columns in a Pandas DataFrame, you can use the columns
attribute.
Here’s an example code snippet:
import pandas as pd
# Load the CSV file into a DataFrame
df = pd.read_csv('NIFTY_EOD.csv')
# List the columns in the DataFrame
print(df.columns)
Output:
Index(['Ticker', 'Date/Time', 'Open', 'High', 'Low', 'Close', 'Volume',
'PClose', 'lreg5', 'lreg7', 'lreg9', 'hma5', 'hma7', 'hma9', 'trsi',
'atr'],
dtype='object')
Compute EMA Values and Write to the values to Same CSV File
To compute EMA (Exponential Moving Average) values based on the Close
column and append them to the same CSV file, you can use the Pandas ewm()
function and the to_csv()
function.
Here’s an example code snippet:
import pandas as pd
# Load the CSV file
df = pd.read_csv('NIFTY_EOD.csv')
# Calculate EMA values based on the Close column
ema5 = df['Close'].ewm(span=5, adjust=False).mean()
ema10 = df['Close'].ewm(span=10, adjust=False).mean()
# Append EMA values as new columns in the DataFrame
df['EMA5'] = ema5
df['EMA10'] = ema10
# Save the updated DataFrame to the same CSV file
df.to_csv('NIFTY_EOD.csv', index=False)
# Display the recent 10 rows of the DataFrame
df.tail(10)
Output:
Cleaning and transforming data:
Often, CSV files may have messy or inconsistent data that needs to be cleaned and transformed before analysis. Pandas provides a wide range of functions for data cleaning, such as fillna()
for filling in missing values.
Checking for Missing Values
It’s important to check for missing values in your data as they can affect your analysis. You can use the isna() or isnull() function to check for missing values. The output will be a DataFrame with True or False values, where True indicates a missing value.
# Check for missing values in the DataFrame
print(df.isna())
# Check for missing values in the 'Close' column
print(df['Close'].isnull())
Output:
As we can see from the output, there are no missing values in our DataFrame.
Writing to a CSV File
We can also write our DataFrame to a CSV file using the to_csv() function.
# Write the DataFrame to a CSV file
df.to_csv('new_file.csv')
Exporting data to CSV or other formats: After manipulating and analyzing data, you may want to export it to a CSV file or other formats such as Excel or JSON. Pandas provides functions such as to_csv()
, to_excel()
, and to_json()
for exporting data.
In this part of the tutorial, we learned how to fetch data from a CSV file containing stock market data and apply various Pandas concepts to manipulate and analyze it. We converted the date/time column to a datetime object, set it as the index, and resampled the data to monthly frequency. We also demonstrated how to check for null values and append calculated values to the same CSV file. Additionally, we covered writing DataFrames back to CSV files and learned how to list the columns in a DataFrame. By the end of this tutorial, you should have a better understanding of how to effectively work with stock market data in Pandas and be better equipped to perform data analysis for trading purposes.