Rajandran R Creator of OpenAlgo - OpenSource Algo Trading framework for Indian Traders. Building GenAI Applications. Telecom Engineer turned Full-time Derivative Trader. Mostly Trading Nifty, Banknifty, High Liquid Stock Derivatives. Trading the Markets Since 2006 onwards. Using Market Profile and Orderflow for more than a decade. Designed and published 100+ open source trading systems on various trading tools. Strongly believe that market understanding and robust trading frameworks are the key to the trading success. Building Algo Platforms, Writing about Markets, Trading System Design, Market Sentiment, Trading Softwares & Trading Nuances since 2007 onwards. Author of Marketcalls.in

How PandasAI and LLM Models Transform Financial Data Analysis

PandasAI is a notable step forward in data analysis tooling, particularly for students, novice programmers, junior data analysts, fund managers, and OpenAI/LLM enthusiasts.

The core innovation of PandasAI lies in its ability to make data conversational. Traditional methods require familiarity with specific coding syntax and data analysis concepts; PandasAI simplifies this interaction. Users ask questions in natural language, and the system turns those queries into analysis steps and runs them.

PandasAI stands out with two key value propositions: ease of use and power. It’s designed for those who might not have deep knowledge of generative AI or pandas, making it an ideal learning tool. However, its capabilities are robust enough to cater to more complex tasks such as data exploration, visualization, cleaning, imputation, and feature engineering.

Installing the PandasAI Library

pip install pandasai
pip install "pandasai[connectors]"

The magic of PandasAI is in its backend, where a generative AI model generates Python code based on the user’s natural language queries. This process involves understanding the query, creating the appropriate code, and executing it to produce results. This seamless process hides the complexities of coding and data manipulation, presenting users with an intuitive and efficient way to interact with data.
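
The generate-then-execute flow described above can be illustrated with a minimal sketch. This is not PandasAI's implementation: `fake_llm` is a hypothetical stand-in that returns canned Python source instead of calling a real model, and the data is a plain list of dictionaries rather than a dataframe. It only shows the three steps: read the question, produce code, run the code.

```python
def fake_llm(question, columns):
    """Hypothetical stand-in for an LLM: returns Python source for a question."""
    q = question.lower()
    if "how many rows" in q:
        return "result = len(rows)"
    if "highest" in q and "high" in columns:
        return "result = max(row['high'] for row in rows)"
    raise ValueError(f"no canned answer for: {question!r}")


def chat(rows, question):
    """Generate code for the question, run it, and return its `result`."""
    columns = list(rows[0].keys())
    code = fake_llm(question, columns)   # understand the query, create code
    namespace = {"rows": rows}           # data the generated code may read
    exec(code, namespace)                # execute the generated code
    return namespace["result"]


# Three rows taken from the NIFTY data used in this article (subset of columns)
rows = [
    {"date": "2024-01-11", "high": 21726.5, "close": 21647.199},
    {"date": "2024-01-12", "high": 21928.25, "close": 21894.551},
    {"date": "2024-01-15", "high": 22081.95, "close": 22059.9},
]

print(chat(rows, "How many rows are there in data ?"))
print(chat(rows, "Get me the Highest value of Nifty from the high column"))
```

Because the answer comes from executing generated code, its quality depends on the code the model writes, which is why it helps to spot-check results against plain pandas.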

Load the Pandas Dataframe

from pandasai import SmartDataframe
import pandas as pd 


# URL of the CSV file
csv_url = 'https://raw.githubusercontent.com/marketcalls/data/main/NIFTY_daily_data.csv'

# Load the CSV file from the URL
data = pd.read_csv(csv_url)

# Convert the date column to datetime for easier calculations
data['date'] = pd.to_datetime(data['date'])


data

Output

	date	symbol	open	high	low	close	volume
0	1990-07-03	NIFTY	279.01999	279.01999	279.01999	279.01999	0.0
1	1990-07-05	NIFTY	284.04001	284.04001	284.04001	284.04001	0.0
2	1990-07-06	NIFTY	289.04001	289.04001	289.04001	289.04001	0.0
3	1990-07-09	NIFTY	289.69000	289.69000	289.69000	289.69000	0.0
4	1990-07-10	NIFTY	288.69000	288.69000	288.69000	288.69000	0.0
...	...	...	...	...	...	...	...
8100	2024-01-09	NIFTY	21653.60000	21724.44900	21517.85000	21544.85000	228573407.0
8101	2024-01-10	NIFTY	21529.30100	21641.85000	21448.65000	21618.69900	216991926.0
8102	2024-01-11	NIFTY	21688.00000	21726.50000	21593.75000	21647.19900	212453866.0
8103	2024-01-12	NIFTY	21773.55100	21928.25000	21715.15000	21894.55100	294678459.0
8104	2024-01-15	NIFTY	22053.15000	22081.95000	22021.10000	22059.90000	53802412.0
8105 rows × 7 columns

Set the OpenAI API Key

PandasAI leverages a Large Language Model (LLM) for its functionality, so you’ll need to select and import the LLM that best suits your needs. For this example, we’ll be utilizing OpenAI’s capabilities.

To integrate OpenAI with PandasAI, an API token is necessary. Follow the straightforward steps outlined below to create your API_TOKEN with OpenAI.

  1. Go to https://platform.openai.com/apps and sign up with your email address or connect your Google account.
  2. Go to View API Keys on the left side of your Personal Account Settings.
  3. Select Create New Secret key.

API access to OpenAI is a paid service, so you have to set up billing. Read the pricing information before experimenting.

To store the OpenAI API key securely, keep it in a .env file and load it as an environment variable.

# Set the OpenAI API key
import os
from dotenv import load_dotenv
from pandasai.llm import OpenAI

load_dotenv()  # loads the configs from .env

openai_api_key = os.getenv("OPENAI_API_KEY")

llm = OpenAI(api_token=openai_api_key)
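
For the snippet above to work, the .env file needs a line of the form OPENAI_API_KEY=<your key>. If the variable is missing, os.getenv returns None and the failure only shows up later at query time. A small guard can surface the problem earlier; the helper name `require_env` below is hypothetical and not part of PandasAI or python-dotenv.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

With this in place, the last line of the snippet could be written as llm = OpenAI(api_token=require_env("OPENAI_API_KEY")), called after load_dotenv().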

Querying the pandas dataframe using Prompts to perform Data Analysis

Prompt 1

#Set the SmartDataframe
sdf = SmartDataframe(data,config={'llm':llm}) 

#Prompt 1
sdf.chat("How many rows are there in data ?")

Output Response

8105

Prompt 2

#Prompt 2
sdf.chat("Get me the Highest value of Nifty from the high column")

Output Response

22081.95

Prompt 3

#Prompt 3
sdf.chat("Get the Last 20 days max high and min low value")

Output Response

'The last 20 days max high value is 22081.95 and min low value is 20976.801.'
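
Since these answers come from model-generated code, it is worth cross-checking them with plain pandas. The sketch below shows one possible equivalent for each of the three prompts. It assumes "last 20 days" means the last 20 rows (trading days), and it uses a five-row sample copied from the end of the NIFTY table so that it runs on its own; on the full dataframe, replace `sample` with `data` and `tail(3)` with `tail(20)`.

```python
import pandas as pd

# Last five rows of the NIFTY data shown earlier (subset of columns)
sample = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-09", "2024-01-10", "2024-01-11", "2024-01-12", "2024-01-15",
    ]),
    "high": [21724.449, 21641.85, 21726.5, 21928.25, 22081.95],
    "low": [21517.85, 21448.65, 21593.75, 21715.15, 22021.1],
})

row_count = len(sample)                # Prompt 1: number of rows
highest_high = sample["high"].max()    # Prompt 2: highest value in the high column

# Prompt 3: max high and min low over the most recent rows
# (3-row window on this small sample; tail(20) on the full data)
window = sample.tail(3)
window_high = window["high"].max()
window_low = window["low"].min()

print(row_count, highest_high, window_high, window_low)
```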

Visualizing the Pandas Dataframe using Prompts

#data visualization prompt 1
sdf.chat("Plot the Line Chart of Close in red color")

Output Response

#data visualization prompt 2
sdf.chat("Plot the close chart with volume as subplot for the last 200 days")

Output Response

#data visualization prompt 3
sdf.chat("For the last 100 bars Plot the line chart of close in green color with ema10 and ema20 value")

Output Response
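
The ema10 and ema20 values in the last prompt are exponential moving averages of the close. The code PandasAI generates for this is not shown here, so the following is only a sketch of one common convention: smoothing factor alpha = 2 / (span + 1), seeded with the first value. In pandas this corresponds to data['close'].ewm(span=10, adjust=False).mean(); generated code may use a different seeding or the adjust=True variant.

```python
def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        if not out:
            out.append(v)  # the first EMA value is the first observation
        else:
            out.append(alpha * v + (1 - alpha) * out[-1])
    return out


print(ema([1, 2, 3], span=3))  # alpha = 0.5
```

With span=3 the smoothing factor is 0.5, so the series 1, 2, 3 gives 1, 1.5, 2.25.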

PandasAI Connectors

PandasAI provides several connectors that let you connect to different data sources, such as Yahoo Finance. These connectors are designed to be easy to use, even if you are not familiar with the data source or with PandasAI.

from pandasai.connectors.yahoo_finance import YahooFinanceConnector

yahoo_connector = YahooFinanceConnector("WIPRO.NS")
df = SmartDataframe(yahoo_connector, config={"llm": llm})

response = df.chat("What is the closing price for yesterday? Provide the output adjusted to 2 decimals")
print(response)

Output Response

The closing price for yesterday is 465.45.

yahoo_connector = YahooFinanceConnector("TATASTEEL.NS")

df_connector = SmartDataframe(yahoo_connector, config={"llm": llm})
response = df_connector.chat("Plot the line chart of Tata Steel over time for the last 200 days")

Output Response

PandasAI’s user-friendly nature makes it an excellent choice for a wide range of users. It’s particularly beneficial for those new to pandas or those seeking to streamline their data analysis workflow. Whether you are a student grappling with data analysis concepts, a beginner in programming looking to delve into data science, or an entry-level data analyst aiming to enhance efficiency, PandasAI is tailored for you.

The introduction of PandasAI signifies a paradigm shift in financial data analysis. Its ability to simplify complex data interactions, coupled with its powerful analytical capabilities, makes it a tool that not only enhances efficiency but also makes data analysis more accessible. As we move forward, tools like PandasAI are set to play a pivotal role in shaping the future of data analysis, making it more inclusive, efficient, and versatile.

OpenAI API Token - Note
PandasAI’s integration with OpenAI’s services means every query has a cost. For current pricing details, check OpenAI’s website. As of January 2024, the rate was roughly $0.0010 per 1,000 tokens for the GPT-3.5-Turbo model. Keep in mind that each query sends data from the dataframe (column names and sample rows) to the LLM along with the prompt, so token usage adds up with wide tables and repeated queries, and it might not be the most efficient approach for processing extensive datasets.
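
To get a feel for the cost, the rate quoted above can be turned into a rough estimate. The token counts per query below are hypothetical placeholders, not measurements from PandasAI; actual usage depends on the prompt, the data sent, and the length of the response.

```python
PRICE_PER_1K_TOKENS_USD = 0.0010  # rate quoted in the note above (January 2024)


def estimate_cost_usd(tokens_per_query, queries, price_per_1k=PRICE_PER_1K_TOKENS_USD):
    """Rough cost estimate: total tokens / 1000 * price per 1,000 tokens."""
    return tokens_per_query * queries / 1000 * price_per_1k


# Hypothetical example: 50 queries at about 2,000 tokens each
print(f"${estimate_cost_usd(2000, 50):.2f}")
```

Under these assumed numbers, 100,000 tokens at $0.0010 per 1,000 tokens comes to about $0.10.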