Pandas is an open-source Python library that is widely used for data manipulation and analysis. One of the most popular features of Pandas is the DataFrame. It is a two-dimensional table-like data structure that allows you to store and manipulate data in a way that is similar to a spreadsheet.

In this tutorial, we will be learning how to retrieve the stock data in pandas dataframe format, dataframe structure, selecting columns, selecting rows, filtering data, adding a column, removing a column, retrieving current trading price & prev day close price, and visualizing data using a simple line chart.

A DataFrame consists of rows and columns, where each row represents a unique observation or record, and each column represents a variable or feature of that observation. You can think of a DataFrame as a spreadsheet, where each row is a record, and each column is a field.

In this tutorial, we will learn how to create and manipulate DataFrames using yfinance and plot using seaborn python library.

Prerequisites

Before we begin, you will need to have the following:

Python is installed on your computer.
Jupyter Notebook (VS code Editor or Google Colab or any other IDE installed).
Basic knowledge of Python syntax.

Installing Libraries

use the pip command as shown below to install the python libraries

pip install pandas yfinance seaborn matplotlib

Importing Libraries

Before we create a DataFrame, we need to import the necessary libraries. We will be using Pandas, yfinance seaborn, and matplotlib.

import pandas as pd
import yfinance as yf
import seaborn as sns
import matplotlib.pyplot as plt

Creating a DataFrame

There are several ways to create a DataFrame. In this tutorial, In this tutorial, we will use yfinance to fetch stock historical data and create DataFrames.

Creating a DataFrame from yfinance seaborn

yfinance is a Python library that allows you to download financial data from Yahoo Finance. To create a DataFrame from yfinance , we can use the yf.download() function.

df = yf.download('ZOMATO.NS', start='2021-01-01', end='2023-02-25')

# Display the first 5 rows of the DataFrame
df.head()

Output

This will download the stock data for Zomato from Yahoo Finance and create a DataFrame.

Viewing DataFrame

To view the entire DataFrame, you can use the df function.

df

This will display the entire DataFrame.

Selecting Columns

To select a specific column in the DataFrame, you can use the column name in square brackets.

# Select the 'close' column
df['Close']

Selecting Rows

To select a specific row in the DataFrame, you can use the head function.

Filtering Data

To filter the data in the DataFrame, you can use boolean indexing.

# Filter the data where the 'Close' price is greater than 150
df[df['Close'] > 150]

This will filter the data where the ‘Close’ price is greater than 150.

Adding a Column

To add a new column to the DataFrame, you can use the square bracket notation to assign a new column name and the values for that column.

# Calculate the daily return and add it as a new column
df['Daily Return'] = df['Close'].pct_change()

# Display the first 5 rows of the DataFrame
df.head()

This will calculate the daily return and add it as a new column to the DataFrame.

Print Column Headers

To print the column headers use df.columns

print(df.columns)

Resting the Index of the Dataframe

It seems that the ‘Date’ column is not present in your dataframe. This might be due to the fact that when we download data using the yf.download method, the date column is automatically set as the index of the dataframe.

To confirm this, you can check the index of your dataframe by running:

print(df.index)

This will print the index of the dataframe, which should include dates.

If you want to add a separate ‘Date’ column to your dataframe, you can reset the index of the dataframe using the reset_index() method:

df = df.reset_index()

Visualizing Data

To visualize data in the DataFrame, you can use the sns and plt functions. Ensure Date and Close is present in the column where Date column is used for plotting x-axis and the close column as the y-axis co-ordinates.

# Create a line plot of the 'Close' price
sns.lineplot(x='Date', y='Close', data=df)
plt.show()

Retrieving the Current Close and Previous Closing Value

To retrieve the current close and previous closing value from the dataframe, we can use the iloc function of pandas. The iloc function is used to select rows and columns from the dataframe by integer index positions.

Here’s an example code snippet:

# Retrieve the current close and previous closing value
current_close = df['Close'].iloc[-1]
previous_close = df['Close'].iloc[-2]

# Print the values
print("Current Close:", current_close)
print("Previous Close:", previous_close)

Output

Current Close: 54.45000076293945
Previous Close: 54.95000076293945

In this code, we are first selecting the ‘Close’ column from the dataframe using df['Close']. We then use the iloc function to select the last row (iloc[-1]) and second last row (iloc[-2]) from this column to get the current and previous closing values respectively. Finally, we print the values using the print() function.

Note that the iloc function uses 0-based indexing, so iloc[-1] refers to the last row, iloc[-2] refers to the second last row, and so on.

Alternatively, we can also use

stock = yf.Ticker("ZOMATO.NS")
#retrieve daily data
df = stock.history(period="max", interval="1d")
close = df.Close.round(2)

# Retrieve and print the current close and previous closing value and 5 day back close price
print("Current Price :",close[-1])
print("Previous day closing Price :",close[-2])
print("5 Day Back closing Price :",close[-6])

Output

Current Price : 54.45
Previous day closing Price : 54.95
5 Day Back closing Price : 51.8

Removing Columns

df.drop function is used to remove the dataframe columns. The first print statement will print all the columns of the dataframe, including the ‘Date’ column. The second print statement will print the remaining columns after removing columns 6 and 7 (i.e. ‘Dividends’ and ‘Stock Splits’).

#Retrive Infy Data
stock = yf.Ticker("INFY.NS")

#retrieve daily data
df = stock.history(period="max", interval="1d")

close = df.Close.round(2)

#reset the index
df = df.reset_index()

#print the columns
print(df.columns)

#remove the columns 6 & 7
df.drop(df.columns[[6,7]], axis=1, inplace=True)

#print the remaining columns
print(df.columns)

Introduction to Pandas DataFrame – Python Tutorial for Traders – Part 1

Prerequisites

Installing Libraries

Importing Libraries

Creating a DataFrame

Creating a DataFrame from yfinance seaborn

Viewing DataFrame

Selecting Columns

Selecting Rows

Filtering Data

Adding a Column

Print Column Headers

Visualizing Data

Retrieving the Current Close and Previous Closing Value

Removing Columns

Related

How I Built a Telegram AI Stock Assistant Using…

[Course] Building Stock Market Based Telegram Bots using Python

Understanding Object-Oriented Programming (OOP) Concepts in Python for Traders…

Leave a ReplyCancel reply