Rajandran R Creator of OpenAlgo - OpenSource Algo Trading framework for Indian Traders. Telecom Engineer turned Full-time Derivative Trader. Mostly Trading Nifty, Banknifty, High Liquid Stock Derivatives. Trading the Markets Since 2006 onwards. Using Market Profile and Orderflow for more than a decade. Designed and published 100+ open source trading systems on various trading tools. Strongly believe that market understanding and robust trading frameworks are the key to the trading success. Building Algo Platforms, Writing about Markets, Trading System Design, Market Sentiment, Trading Softwares & Trading Nuances since 2007 onwards. Author of Marketcalls.in

Introduction to Pandas DataFrame – Python Tutorial for Traders – Part 1

3 min read

Pandas is an open-source Python library that is widely used for data manipulation and analysis. One of the most popular features of Pandas is the DataFrame. It is a two-dimensional table-like data structure that allows you to store and manipulate data in a way that is similar to a spreadsheet.

In this tutorial, we will be learning how to retrieve the stock data in pandas dataframe format, dataframe structure, selecting columns, selecting rows, filtering data, adding a column, removing a column, retrieving current trading price & prev day close price, and visualizing data using a simple line chart.

Reliance Stock Data as Dataframe

A DataFrame consists of rows and columns, where each row represents a unique observation or record, and each column represents a variable or feature of that observation. You can think of a DataFrame as a spreadsheet, where each row is a record, and each column is a field.

In this tutorial, we will learn how to create and manipulate DataFrames using yfinance and plot using seaborn python library.

Prerequisites

Before we begin, you will need to have the following:

  • Python is installed on your computer.
  • Jupyter Notebook (VS code Editor or Google Colab or any other IDE installed).
  • Basic knowledge of Python syntax.

Installing Libraries

use the pip command as shown below to install the python libraries

pip install pandas yfinance seaborn matplotlib

Importing Libraries

Before we create a DataFrame, we need to import the necessary libraries. We will be using Pandas, yfinance seaborn, and matplotlib.

import pandas as pd
import yfinance as yf
import seaborn as sns
import matplotlib.pyplot as plt

Creating a DataFrame

There are several ways to create a DataFrame. In this tutorial, In this tutorial, we will use yfinance to fetch stock historical data and create DataFrames.

Creating a DataFrame from yfinance seaborn

yfinance is a Python library that allows you to download financial data from Yahoo Finance. To create a DataFrame from yfinance , we can use the yf.download() function.

df = yf.download('ZOMATO.NS', start='2021-01-01', end='2023-02-25')

# Display the first 5 rows of the DataFrame
df.head()

Output

This will download the stock data for Zomato from Yahoo Finance and create a DataFrame.

Viewing DataFrame

To view the entire DataFrame, you can use the df function.

df

This will display the entire DataFrame.

Selecting Columns

To select a specific column in the DataFrame, you can use the column name in square brackets.

# Select the 'close' column
df['Close']

Selecting Rows

To select a specific row in the DataFrame, you can use the head function.

Filtering Data

To filter the data in the DataFrame, you can use boolean indexing.

# Filter the data where the 'Close' price is greater than 150
df[df['Close'] > 150]

This will filter the data where the ‘Close’ price is greater than 150.

Adding a Column

To add a new column to the DataFrame, you can use the square bracket notation to assign a new column name and the values for that column.

# Calculate the daily return and add it as a new column
df['Daily Return'] = df['Close'].pct_change()

# Display the first 5 rows of the DataFrame
df.head()

This will calculate the daily return and add it as a new column to the DataFrame.

Print Column Headers

To print the column headers use df.columns

print(df.columns)

Resting the Index of the Dataframe

It seems that the ‘Date’ column is not present in your dataframe. This might be due to the fact that when we download data using the yf.download method, the date column is automatically set as the index of the dataframe.

To confirm this, you can check the index of your dataframe by running:

print(df.index)

This will print the index of the dataframe, which should include dates.

If you want to add a separate ‘Date’ column to your dataframe, you can reset the index of the dataframe using the reset_index() method:

df = df.reset_index()

Visualizing Data

To visualize data in the DataFrame, you can use the sns and plt functions. Ensure Date and Close is present in the column where Date column is used for plotting x-axis and the close column as the y-axis co-ordinates.

# Create a line plot of the 'Close' price
sns.lineplot(x='Date', y='Close', data=df)
plt.show()

Retrieving the Current Close and Previous Closing Value

To retrieve the current close and previous closing value from the dataframe, we can use the iloc function of pandas. The iloc function is used to select rows and columns from the dataframe by integer index positions.

Here’s an example code snippet:

# Retrieve the current close and previous closing value
current_close = df['Close'].iloc[-1]
previous_close = df['Close'].iloc[-2]

# Print the values
print("Current Close:", current_close)
print("Previous Close:", previous_close)

Output

Current Close: 54.45000076293945
Previous Close: 54.95000076293945

In this code, we are first selecting the ‘Close’ column from the dataframe using df['Close']. We then use the iloc function to select the last row (iloc[-1]) and second last row (iloc[-2]) from this column to get the current and previous closing values respectively. Finally, we print the values using the print() function.

Note that the iloc function uses 0-based indexing, so iloc[-1] refers to the last row, iloc[-2] refers to the second last row, and so on.

Alternatively, we can also use

stock = yf.Ticker("ZOMATO.NS")
#retrieve daily data
df = stock.history(period="max", interval="1d")
close = df.Close.round(2)

# Retrieve and print the current close and previous closing value and 5 day back close price
print("Current Price :",close[-1])
print("Previous day closing Price :",close[-2])
print("5 Day Back closing Price :",close[-6])

Output

Current Price : 54.45
Previous day closing Price : 54.95
5 Day Back closing Price : 51.8

Removing Columns

df.drop function is used to remove the dataframe columns. The first print statement will print all the columns of the dataframe, including the ‘Date’ column. The second print statement will print the remaining columns after removing columns 6 and 7 (i.e. ‘Dividends’ and ‘Stock Splits’).

#Retrive Infy Data
stock = yf.Ticker("INFY.NS")

#retrieve daily data
df = stock.history(period="max", interval="1d")

close = df.Close.round(2)

#reset the index
df = df.reset_index()

#print the columns
print(df.columns)

#remove the columns 6 & 7
df.drop(df.columns[[6,7]], axis=1, inplace=True)

#print the remaining columns
print(df.columns)
Rajandran R Creator of OpenAlgo - OpenSource Algo Trading framework for Indian Traders. Telecom Engineer turned Full-time Derivative Trader. Mostly Trading Nifty, Banknifty, High Liquid Stock Derivatives. Trading the Markets Since 2006 onwards. Using Market Profile and Orderflow for more than a decade. Designed and published 100+ open source trading systems on various trading tools. Strongly believe that market understanding and robust trading frameworks are the key to the trading success. Building Algo Platforms, Writing about Markets, Trading System Design, Market Sentiment, Trading Softwares & Trading Nuances since 2007 onwards. Author of Marketcalls.in

[Live Coding Webinar] Build Your First Trading Bridge for…

In this course, you will be learning to build your own trading bridge using Python. This 60-minute session is perfect for traders, Python enthusiasts,...
Rajandran R
1 min read

How to Place Orders Concurrently using ThreadPoolExecutor – Python…

Creating concurrent orders is essential for active traders, especially those handling large funds, as it allows for executing multiple trade orders simultaneously, thereby maximizing...
Rajandran R
2 min read

Host your Python Flask Web Application using pyngrok and…

Ngrok offers several significant advantages for developers, especially when it comes to testing applications or hosting machine learning models. Ngrok allows you to expose...
Rajandran R
1 min read

Leave a Reply

Get Notifications, Alerts on Market Updates, Trading Tools, Automation & More