Voice Commands to Trade on OpenAlgo Platform Using Google Cloud Speech to Text API

Trading platforms keep adopting new technologies to make trading easier and more efficient. Voice commands are a good example: they let traders place orders by speaking, which makes order entry faster and hands-free. In this blog, we’ll show you how to set up a trading system that responds to voice commands, using Google Cloud’s Speech-to-Text API and OpenAlgo’s trading API.

Sample Voice Command format

MILO BUY 100 RELIANCE
MILO SELL 40 ZOMATO
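
The format above is a wake word followed by an action, a quantity, and a trading symbol. As a minimal sketch of that grammar, a strict check on an already clean command might look like the following. The regular expression and the `matches_format` helper are illustrative assumptions and are not part of OpenAlgo or of the script in this post, which is more lenient about spoken numbers and misheard words.

```python
import re

# Hypothetical pattern for "<WAKE WORD> <BUY|SELL> <QUANTITY> <SYMBOL>".
# It only accepts digits for the quantity and the literal words BUY or SELL.
COMMAND_PATTERN = re.compile(r"^MILO\s+(BUY|SELL)\s+(\d+)\s+([A-Z0-9&-]+)$")


def matches_format(command):
    """Return (action, quantity, symbol) if the command fits the format, else None."""
    match = COMMAND_PATTERN.match(command.strip().upper())
    if match is None:
        return None
    action, quantity, symbol = match.groups()
    return action, int(quantity), symbol


print(matches_format("MILO BUY 100 RELIANCE"))
print(matches_format("MILO SELL 40 ZOMATO"))
print(matches_format("MILO BUY RELIANCE"))  # quantity missing, so no match
```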

Setting Up Your Environment

Prerequisites

Before diving into the setup, ensure you have the following prerequisites:

Python 3.x installed
OpenAlgo (a self-hosted, open-source algo trading platform) installed and running, with an OpenAlgo API key generated
A Google Cloud account with Speech-to-Text API enabled and credentials downloaded

Installing Required Packages

Begin by installing the necessary Python libraries. Open your terminal and run the following pip commands. PyAudio depends on the PortAudio library, which may need to be installed separately on Linux and macOS.

pip install pyaudio
pip install google-cloud-speech
pip install python-dotenv
pip install word2number
pip install openalgo

Configuring the Application

Step 1: Environment Setup

Create a .env file in your project directory and populate it with your credentials and other constants:

OPENALGO_API_KEY="your_openalgo_api_key"
OPENALGO_HOST="http://127.0.0.1:5000"
VOICE_ACTIVATE_COMMAND="MILO"
GOOGLE_APPLICATION_CREDENTIALS="path_to_your_google_credentials.json"

This file keeps your credentials out of the source code while leaving them easy to load in your application. Do not commit it to version control.
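
A missing entry in the .env file tends to surface later as a confusing error, so it can help to check the settings at startup. The sketch below is one possible way to do that; the `REQUIRED_SETTINGS` tuple and the `missing_settings` helper are hypothetical names introduced for illustration.

```python
import os

# The four settings listed in the .env file above.
REQUIRED_SETTINGS = (
    "OPENALGO_API_KEY",
    "OPENALGO_HOST",
    "VOICE_ACTIVATE_COMMAND",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def missing_settings(env=None):
    """Return the names of required settings that are absent or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Example with an explicit dict. In the real script, call missing_settings()
# with no argument after load_dotenv() so that os.environ is checked instead.
example = {"OPENALGO_API_KEY": "abc", "OPENALGO_HOST": "http://127.0.0.1:5000"}
print(missing_settings(example))
```

If the returned list is not empty, the script can print the missing names and stop before opening the microphone.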

Step 2: Audio Configuration

Set up the audio input which will capture your voice commands. This example uses PyAudio for audio capture:

audio = pyaudio.PyAudio()
stream = audio.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=16000,
    input=True,
    frames_per_buffer=1024,
)
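
To see what these parameters mean in practice, the short calculation below works out how much audio each read of 1024 frames covers. It uses only the values from the configuration above; the constant names are chosen here for illustration.

```python
SAMPLE_RATE_HZ = 16000     # rate=16000
FRAMES_PER_BUFFER = 1024   # frames_per_buffer=1024
BYTES_PER_SAMPLE = 2       # paInt16 is 16-bit audio, so 2 bytes per sample
CHANNELS = 1               # channels=1 (mono)

chunk_seconds = FRAMES_PER_BUFFER / SAMPLE_RATE_HZ
chunk_bytes = FRAMES_PER_BUFFER * BYTES_PER_SAMPLE * CHANNELS
bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS

print(f"Each read covers {chunk_seconds * 1000:.0f} ms of audio")
print(f"Each read returns {chunk_bytes} bytes")
print(f"The stream produces {bytes_per_second} bytes per second")
```

Each chunk is therefore 64 ms of audio in 2048 bytes, and the sample rate must match the `sample_rate_hertz` value passed to the recognizer later.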

Step 3: Google Cloud Speech Client Setup

Load the .env file and create the Google Cloud Speech client. load_dotenv() copies GOOGLE_APPLICATION_CREDENTIALS into the process environment, where the Google client library looks for it:

load_dotenv()
client = speech.SpeechClient()

Step 4: OpenAlgo Client Configuration

Similarly, set up your OpenAlgo API client:

openalgo_client = api(api_key=os.getenv('OPENALGO_API_KEY'), host=os.getenv('OPENALGO_HOST'))

Implementing Voice Command Recognition

Define a function that continuously reads audio from the microphone and queues it as streaming requests, which Google’s Speech-to-Text API converts into text:

def read_audio_data(requests, stop_event):
    try:
        while not stop_event.is_set():
            data = stream.read(1024, exception_on_overflow=False)
            requests.put(speech.StreamingRecognizeRequest(audio_content=data))
    except Exception as e:
        print(f"Error reading audio data: {str(e)}")
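
The function above is the producer half of a producer and consumer pair: it puts audio chunks on a queue, and the main function later consumes them with `iter(requests.get, None)`. The sketch below shows that pattern in isolation with the standard library only. The `producer` and `consume` names and the fake byte chunks are assumptions made for the example.

```python
import queue
import threading


def producer(q, items):
    """Put each item on the queue, then a None sentinel to signal the end."""
    for item in items:
        q.put(item)
    q.put(None)


def consume(q):
    """Collect items until the sentinel arrives; iter() stops at the first None."""
    return list(iter(q.get, None))


q = queue.Queue(maxsize=10)
chunks = [b"chunk-1", b"chunk-2", b"chunk-3"]
worker = threading.Thread(target=producer, args=(q, chunks))
worker.start()
received = consume(q)
worker.join()
print(received)
```

The two-argument form of `iter` keeps calling `q.get` until it returns the sentinel. If `None` is never placed on the queue, the consumer blocks on `q.get()` indefinitely, which matters when shutting the application down.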

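Parse the recognized text to extract the trading command. The function relies on voice_activate_command and command_synonyms, both defined in the complete code below: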

def parse_command(transcript):
    words = transcript.upper().split()
    # Compare in upper case so the wake word matches however it is written in .env
    wake_word = (voice_activate_command or "").upper()
    try:
        if wake_word in words:
            action_index = words.index(wake_word) + 1
            # Map misheard actions ("by", "cell", ...) to BUY or SELL
            action = command_synonyms.get(words[action_index].lower(), words[action_index])
            if action not in ("BUY", "SELL"):
                print(f"Error: Unrecognized action '{action}'.")
                return None, None, None
            try:
                quantity = int(words[action_index + 1])
            except ValueError:
                # Spoken numbers such as "hundred" arrive as words
                quantity = w2n.word_to_num(words[action_index + 1].lower())
            # The symbol must follow the quantity; otherwise it is missing
            if len(words) <= action_index + 2:
                print("Error: Trading symbol is missing from the command.")
                return None, None, None
            tradingsymbol = words[-1]
            print(f'Action : {action}')
            print(f'Quantity : {quantity}')
            print(f'Symbol : {tradingsymbol}')
            return action, quantity, tradingsymbol
    except ValueError as ve:
        print(f"Error parsing command, check format: {str(ve)}")
    except IndexError as ie:
        print(f"Error parsing command, parts of the command might be missing: {str(ie)}")
    return None, None, None
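
The quantity step tries digits first and falls back to spoken numbers. The sketch below isolates that logic so it can be tried without a microphone. It assumes a tiny lookup table, `SPOKEN_NUMBERS`, as a stand-in for word2number so that the example needs no third-party packages; `parse_quantity` is likewise a hypothetical helper and not part of the script.

```python
# Tiny stand-in for word2number so the sketch has no third-party imports.
SPOKEN_NUMBERS = {"ten": 10, "twenty": 20, "fifty": 50, "hundred": 100}


def parse_quantity(token):
    """Return a positive integer quantity, or None if the token is not usable."""
    try:
        quantity = int(token)
    except ValueError:
        # Not digits: try the spoken form instead
        quantity = SPOKEN_NUMBERS.get(token.lower())
    if quantity is None or quantity <= 0:
        return None
    return quantity


for token in ["100", "HUNDRED", "RELIANCE", "0"]:
    print(token, "->", parse_quantity(token))
```

Rejecting zero and negative values here is an extra safeguard that the original parser does not apply at this step.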

Running Your Application

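Finally, implement the main function to orchestrate the application. It starts one thread that captures audio and another that streams it to the recognizer and handles the results. The handle_results and place_order functions it depends on are defined in the complete code below: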

def main():
    stop_event = threading.Event()
    requests = queue.Queue(maxsize=10)
    request_thread = threading.Thread(target=lambda: read_audio_data(requests, stop_event), daemon=True)
    result_thread = threading.Thread(target=lambda: handle_results(client.streaming_recognize(
        speech.StreamingRecognitionConfig(config=speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-GB",  # BCP-47 tag for British English ("en-UK" is not a valid tag)
        ), interim_results=True), iter(requests.get, None)), stop_event), daemon=True)
    request_thread.start()
    result_thread.start()

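    try:
        while True:
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("Received KeyboardInterrupt, shutting down...")
        stop_event.set()
        request_thread.join()
        # The None sentinel ends iter(requests.get, None); without it the
        # request stream never finishes and result_thread.join() would block.
        requests.put(None)
        result_thread.join()
        stream.stop_stream()
        stream.close()
        audio.terminate()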

Complete Python Code to Send Orders using Voice Commands

import pyaudio
import threading
import os
import queue
import time
from google.cloud import speech_v1 as speech
from dotenv import load_dotenv
from word2number import w2n
from openalgo.orders import api

# Load environment variables from .env file
load_dotenv()

# Configure the audio stream
audio = pyaudio.PyAudio()
stream = audio.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=16000,
    input=True,
    frames_per_buffer=1024,
)

# Set up Google Cloud Speech client. load_dotenv() above has already placed
# GOOGLE_APPLICATION_CREDENTIALS in os.environ, where the client library reads it.
client = speech.SpeechClient()

# OpenAlgo client setup
openalgo_client = api(api_key=os.getenv('OPENALGO_API_KEY'), host=os.getenv('OPENALGO_HOST'))

# Get the Voice Activation Command
voice_activate_command = os.getenv('VOICE_ACTIVATE_COMMAND')


# Command synonyms to handle speech recognition variations
command_synonyms = {
    "bhai": "BUY",  "bi": "BUY",
    "by": "BUY",    "bye": "BUY",
    "buy": "BUY",   "cell": "SELL",
    "cel": "SELL",  "self": "SELL",
    "sale": "SELL", "sel": "SELL",
    "sell": "SELL"
}



def read_audio_data(requests, stop_event):
    try:
        while not stop_event.is_set():
            data = stream.read(1024, exception_on_overflow=False)
            requests.put(speech.StreamingRecognizeRequest(audio_content=data))
    except Exception as e:
        print(f"Error reading audio data: {str(e)}")

def parse_command(transcript):
    words = transcript.upper().split()
    # Compare in upper case so the wake word matches however it is written in .env
    wake_word = (voice_activate_command or "").upper()
    try:
        if wake_word in words:
            action_index = words.index(wake_word) + 1
            # Map misheard actions ("by", "cell", ...) to BUY or SELL
            action = command_synonyms.get(words[action_index].lower(), words[action_index])
            if action not in ("BUY", "SELL"):
                print(f"Error: Unrecognized action '{action}'.")
                return None, None, None
            try:
                quantity = int(words[action_index + 1])
            except ValueError:
                # Spoken numbers such as "hundred" arrive as words
                quantity = w2n.word_to_num(words[action_index + 1].lower())
            # The symbol must follow the quantity; otherwise it is missing
            if len(words) <= action_index + 2:
                print("Error: Trading symbol is missing from the command.")
                return None, None, None
            tradingsymbol = words[-1]
            print(f'Action : {action}')
            print(f'Quantity : {quantity}')
            print(f'Symbol : {tradingsymbol}')
            return action, quantity, tradingsymbol
    except ValueError as ve:
        print(f"Error parsing command, check format: {str(ve)}")
    except IndexError as ie:
        print(f"Error parsing command, parts of the command might be missing: {str(ie)}")
    return None, None, None

def handle_results(responses, stop_event):
    try:
        for response in responses:
            if response.results and not stop_event.is_set():
                for result in response.results:
                    if result.is_final:
                        transcript = result.alternatives[0].transcript
                        print(f"Voice Command: {transcript}")
                        action, quantity, tradingsymbol = parse_command(transcript)
                        if all([action, quantity, tradingsymbol]):
                            place_order(action, quantity, tradingsymbol)
    except Exception as e:
        print(f"Error handling results: {str(e)}")

def place_order(action, quantity, tradingsymbol):
    response = openalgo_client.placeorder(
        strategy="VoiceOrder",
        symbol=tradingsymbol,
        action=action,
        exchange="NSE",
        price_type="MARKET",
        product="MIS",
        quantity=quantity
    )
    print(f"Order placed: {response}")

def main():
    stop_event = threading.Event()
    requests = queue.Queue(maxsize=10)
    request_thread = threading.Thread(target=lambda: read_audio_data(requests, stop_event), daemon=True)
    result_thread = threading.Thread(target=lambda: handle_results(client.streaming_recognize(
        speech.StreamingRecognitionConfig(
            config=speech.RecognitionConfig(
                encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
                sample_rate_hertz=16000,
                language_code="en-GB",  # BCP-47 tag for British English ("en-UK" is not a valid tag)
            ),
            interim_results=True,
        ), iter(requests.get, None)), stop_event), daemon=True)
    request_thread.start()
    result_thread.start()

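    try:
        while True:
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("Received KeyboardInterrupt, shutting down...")
        stop_event.set()
        request_thread.join()
        # The None sentinel ends iter(requests.get, None); without it the
        # request stream never finishes and result_thread.join() would block.
        requests.put(None)
        result_thread.join()
        stream.stop_stream()
        stream.close()
        audio.terminate()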

if __name__ == '__main__':
    main()

Limitations:

Google Cloud’s Speech-to-Text API has a limited free quota. Usage beyond the free quota is billed.
The code has only been tested with equity markets during off-market hours.
It may not work for all equity symbols: the parser takes only the last word of the command as the symbol, and the recognized text must match the exchange symbol exactly.
The code is not yet optimized for trading futures and options. Every order is sent as an NSE intraday (MIS) market order.
There may be some bugs. Test the code carefully before relying on it for live trading.
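
Given these limitations, one cautious option is to check each parsed command before it reaches place_order. The sketch below is only an illustration of that idea: the `ALLOWED_SYMBOLS` whitelist, the `MAX_QUANTITY` cap, and the `validate_order` helper are all hypothetical and would need to be adapted to your own account and symbols.

```python
ALLOWED_SYMBOLS = {"RELIANCE", "TCS", "ZOMATO"}  # hypothetical whitelist
MAX_QUANTITY = 500                               # hypothetical per-order cap


def validate_order(action, quantity, symbol):
    """Return a list of problems; an empty list means the order passed all checks."""
    problems = []
    if action not in ("BUY", "SELL"):
        problems.append(f"unknown action: {action}")
    if not isinstance(quantity, int) or not 0 < quantity <= MAX_QUANTITY:
        problems.append(f"quantity out of range: {quantity}")
    if symbol not in ALLOWED_SYMBOLS:
        problems.append(f"symbol not in whitelist: {symbol}")
    return problems


print(validate_order("BUY", 100, "RELIANCE"))
print(validate_order("SELL", 1000, "XYZ"))
```

An order would then be sent only when the returned list is empty, and the problems printed otherwise.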

Sample Voice Commands

Voice Command:  Milo by 20 reliance
Action : BUY
Quantity : 20
Symbol : RELIANCE
Order placed: {'orderid': '240421000000190', 'status': 'success'}
Voice Command:  Milo by 20 TCS
Action : BUY
Quantity : 20
Symbol : TCS
Order placed: {'orderid': '240421000000191', 'status': 'success'}
Voice Command:  Milo sel 20 TCS
Action : SELL
Quantity : 20
Symbol : TCS
Order placed: {'orderid': '240421000000192', 'status': 'success'}
Voice Command:  Milo by 100 reliance
Action : BUY
Quantity : 100
Symbol : RELIANCE
Order placed: {'orderid': '240421000000193', 'status': 'success'}
Voice Command:  Milo cell hundred reliance
Action : SELL
Quantity : 100
Symbol : RELIANCE
Order placed: {'orderid': '240421000000194', 'status': 'success'}
Voice Command:  Milo by hundred zomato
Action : BUY
Quantity : 100
Symbol : ZOMATO
Order placed: {'orderid': '240421000000195', 'status': 'success'}
Voice Command:  Milo cel hundred zomato
Action : SELL
Quantity : 100
Symbol : ZOMATO
Order placed: {'orderid': '240421000000196', 'status': 'success'}
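
In the output above, the recognizer returned "by", "sel", "cell", and "cel" where BUY or SELL was spoken, and the script still resolved the right action. The sketch below reproduces that lookup on its own, using the command_synonyms table from the complete code; the `normalise_action` helper is a name introduced here for illustration.

```python
# Same table as in the complete code above.
command_synonyms = {
    "bhai": "BUY", "bi": "BUY", "by": "BUY", "bye": "BUY", "buy": "BUY",
    "cell": "SELL", "cel": "SELL", "self": "SELL", "sale": "SELL",
    "sel": "SELL", "sell": "SELL",
}


def normalise_action(word):
    """Return BUY or SELL for a recognized word, or None if it is not an action."""
    return command_synonyms.get(word.lower())


for heard in ["by", "sel", "cell", "cel", "hold"]:
    print(heard, "->", normalise_action(heard))
```

Words that are not in the table, such as "hold", return None, which makes it easy to reject a command instead of passing an unknown action to the order API.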

Voice command functionality not only enhances the trading experience by reducing the time needed to execute trades but also introduces a new level of accessibility and convenience for traders. By following the steps outlined in this blog, you can set up a voice-activated trading system on the OpenAlgo platform using the Google Cloud Speech to Text API, bringing efficiency and innovation to your trading strategies. Happy trading!