Rajandran R Creator of OpenAlgo - OpenSource Algo Trading framework for Indian Traders. Building GenAI Applications. Telecom Engineer turned Full-time Derivative Trader. Mostly Trading Nifty, Banknifty, High Liquid Stock Derivatives. Trading the Markets Since 2006 onwards. Using Market Profile and Orderflow for more than a decade. Designed and published 100+ open source trading systems on various trading tools. Strongly believe that market understanding and robust trading frameworks are the key to the trading success. Building Algo Platforms, Writing about Markets, Trading System Design, Market Sentiment, Trading Softwares & Trading Nuances since 2007 onwards. Author of Marketcalls.in

Slashing API Order Latency by up to 70% with HTTP Connection Pooling


The Challenge: High Latency in Trading API Calls

In trading, every millisecond counts. Recently, we identified a significant performance bottleneck in the OpenAlgo platform: API calls to an OpenAlgo-supported broker (Angel One) were taking 250-300ms, even though the broker’s API itself responded in just ~78ms. This overhead was unacceptable for time-sensitive trading operations.

Identifying the Root Cause

After analyzing our network requests, we discovered that the culprit was our use of Python’s built-in http.client library. For each API call, our application was:

  1. Creating a new HTTP connection
  2. Performing a full TCP handshake (14-45ms)
  3. Completing TLS negotiation (50-100ms)
  4. Sending the request
  5. Receiving the response
  6. Closing the connection

This inefficient connection management was adding 170-220ms of unnecessary overhead to each request!
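The per-request lifecycle above is easy to reproduce with the standard library alone. The sketch below spins up a throwaway local HTTP server and issues two requests with http.client, opening and tearing down a fresh connection each time, which is exactly the churn that was costing us 170-220ms per call over TLS. (Loopback hides the TCP/TLS cost, so this only illustrates the connection churn, not the latency itself; the server and endpoint are placeholders, not the broker API.)

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class PingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Throwaway local server on an ephemeral port
server = HTTPServer(("127.0.0.1", 0), PingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

statuses = []
for _ in range(2):
    # Steps 1-3: a brand-new connection each time (TCP handshake; TLS too on HTTPS)
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/ping")   # step 4: send the request
    res = conn.getresponse()       # step 5: receive the response
    statuses.append(res.status)
    res.read()
    conn.close()                   # step 6: throw the connection away

server.shutdown()
print(statuses)  # → [200, 200]
```

Every iteration pays the full connection setup cost; nothing is carried over between requests.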

The Solution: HTTP Connection Pooling with httpx

To address this issue, we needed a library that supports connection pooling – the ability to maintain and reuse established connections. We chose httpx for several compelling reasons:

  1. Modern API: Clean, intuitive interface similar to the popular requests library
  2. Connection Pooling: Built-in support for connection reuse
  3. HTTP/2 Support: Enables multiplexing multiple requests over a single connection
  4. Resource Management: Fine-grained control over connection limits and timeouts
  5. Synchronous & Asynchronous Support: Flexibility to use either programming model

HTTP Connection Pooling Explained

What is HTTP Connection Pooling?

HTTP Connection Pooling is a technique that optimizes network performance by maintaining a pool of open, reusable HTTP connections. Instead of establishing a new connection for each HTTP request, the application reuses existing connections from the pool, significantly reducing the overhead associated with connection establishment.

How Connection Pooling Works

  1. Connection Establishment: When the first request is made, a connection is established to the target server, involving DNS resolution, TCP handshake, and TLS negotiation (for HTTPS).
  2. Connection Reuse: After the response is received, the connection is not closed but kept open and placed in a pool.
  3. Subsequent Requests: Future requests to the same server can reuse connections from the pool, bypassing the time-consuming connection establishment steps.
  4. Connection Management: The pool manages connection lifecycle, including:
    • Creating new connections when needed
    • Reusing existing connections when available
    • Validating connection health before reuse
    • Closing idle connections after a configured timeout
    • Enforcing maximum limits on concurrent connections
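The lifecycle above can be condensed into a toy pool. This is an illustrative, stdlib-only sketch of the idea, not httpx’s actual implementation: it hands out idle connections when available, creates new ones up to a cap, and discards connections whose idle time has exceeded the keep-alive expiry.

```python
import time
from collections import deque

class FakeConnection:
    """Stand-in for a real socket; tracks when it was last used."""
    def __init__(self):
        self.last_used = time.monotonic()

class ConnectionPool:
    def __init__(self, max_connections=20, keepalive_expiry=60.0):
        self.max_connections = max_connections
        self.keepalive_expiry = keepalive_expiry
        self._idle = deque()   # connections waiting to be reused
        self._in_use = 0

    def acquire(self):
        # Reuse an idle connection if it hasn't sat idle too long
        while self._idle:
            conn = self._idle.popleft()
            if time.monotonic() - conn.last_used < self.keepalive_expiry:
                self._in_use += 1
                return conn
            # idle too long: drop it and try the next one
        if self._in_use >= self.max_connections:
            raise RuntimeError("connection limit reached")
        self._in_use += 1
        return FakeConnection()   # "TCP handshake + TLS" would happen here

    def release(self, conn):
        conn.last_used = time.monotonic()
        self._in_use -= 1
        self._idle.append(conn)   # keep it open for the next request

pool = ConnectionPool(max_connections=2, keepalive_expiry=60.0)
first = pool.acquire()
pool.release(first)
second = pool.acquire()   # no new handshake: the same connection comes back
print(first is second)    # → True
```

A real pool (like httpx’s) additionally validates connection health and keys connections by host, port, and protocol, but the reuse logic is the same shape.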

Implementation Details

1. Creating a Shared httpx Client

We created a central client module in utils/httpx_client.py:

import httpx

# Global httpx client for connection pooling
_httpx_client = None

def get_httpx_client():
    """Returns a global httpx client instance with connection pooling."""
    global _httpx_client
    if _httpx_client is None:
        _httpx_client = httpx.Client(
            http2=True,    # requires the 'h2' package (pip install httpx[http2])
            timeout=30.0,  # applies to connect, read, write, and pool waits
            limits=httpx.Limits(
                max_keepalive_connections=10,  # idle connections kept for reuse
                max_connections=20,            # hard cap on concurrent connections
                keepalive_expiry=60.0          # close connections idle for 60s
            )
        )
    return _httpx_client
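One caveat with the lazy global above: if two threads call get_httpx_client() simultaneously on first use, both can observe _httpx_client is None and build separate clients. A double-checked lock closes that race. The sketch below uses a stand-in factory (_make_client is a placeholder, not part of OpenAlgo) so it runs without httpx installed, but the pattern drops straight into the real module.

```python
import threading

_client = None
_client_lock = threading.Lock()

def _make_client():
    # Stand-in for httpx.Client(...); replace with the real constructor.
    return object()

def get_client():
    """Thread-safe lazy singleton via double-checked locking."""
    global _client
    if _client is None:                # fast path: no lock once initialized
        with _client_lock:
            if _client is None:        # re-check: another thread may have won
                _client = _make_client()
    return _client

# All threads should end up sharing a single instance
results = []
threads = [threading.Thread(target=lambda: results.append(get_client()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(set(map(id, results))))  # → 1
```

The lock is only contended during the first call; every call after initialization takes the lock-free fast path.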

2. Replacing http.client Across All API Files

We refactored all broker API functions to use our shared httpx client. For example:

Before (using http.client):

import http.client
import json
import os

def get_api_response(endpoint, auth, method="GET", payload=''):
    AUTH_TOKEN = auth
    api_key = os.getenv('BROKER_API_KEY')

    # A new connection (TCP + TLS handshake) is created on every call
    conn = http.client.HTTPSConnection("apiconnect.angelbroking.com")
    headers = {
        'Authorization': f'Bearer {AUTH_TOKEN}',
        'Content-Type': 'application/json',
    }
    conn.request(method, endpoint, payload, headers)
    res = conn.getresponse()
    data = res.read()
    conn.close()  # the connection is discarded, never reused

    return json.loads(data.decode("utf-8"))

After (using httpx with connection pooling):

import json
import os

from utils.httpx_client import get_httpx_client

def get_api_response(endpoint, auth, method="GET", payload=''):
    AUTH_TOKEN = auth
    api_key = os.getenv('BROKER_API_KEY')

    # Shared client: connections are reused from the pool across calls
    client = get_httpx_client()

    headers = {
        'Authorization': f'Bearer {AUTH_TOKEN}',
        'Content-Type': 'application/json',
    }

    url = f"https://apiconnect.angelbroking.com{endpoint}"

    if method == "GET":
        response = client.get(url, headers=headers)
    elif method == "POST":
        response = client.post(url, headers=headers, content=payload)
    else:
        response = client.request(method, url, headers=headers, content=payload)

    # Mirror http.client's 'status' attribute so existing callers keep working
    response.status = response.status_code

    return json.loads(response.text)

This allowed us to refactor the HTTP client implementation without changing the calling code.

The Results: Dramatic Latency Reduction

After implementing these changes, we observed remarkable improvements:

| Metric                | Before    | After                        | Improvement       |
|-----------------------|-----------|------------------------------|-------------------|
| Average order latency | 250-300ms | 88-120ms                     | ~52-70% reduction |
| Connection overhead   | 170-220ms | ~0ms for subsequent requests | ~100% reduction   |
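Numbers like these are straightforward to verify in your own setup. A minimal, stdlib-only harness (the workload below is a placeholder; swap in your own API call) times repeated invocations and reports the spread in milliseconds:

```python
import time

def measure(call, n=50):
    """Time n invocations of call() and return min/avg/max in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "avg_ms": sum(samples) / len(samples),
        "max_ms": max(samples),
    }

# Placeholder workload; in practice, something like:
#   measure(lambda: get_api_response("/rest/secure/...", token))
stats = measure(lambda: sum(range(1000)), n=20)
print(stats)
```

With pooling, the first call (which pays the handshake) shows up as the max, while subsequent calls cluster near the broker’s raw response time.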

Key benefits achieved:

  1. Eliminated TCP Handshake: Saved 14-45ms per request
  2. Eliminated TLS Negotiation: Saved 50-100ms per request
  3. HTTP/2 Multiplexing: Improved concurrent request handling
  4. Better Resource Utilization: More efficient use of system resources

Future Plans

At present, this implementation covers only Angel One. It will soon be extended to all brokers supported by OpenAlgo.

Conclusion

By switching from http.client to httpx with connection pooling, we reduced our average API latency by roughly 52-70% – from 250-300ms down to 88-120ms. This dramatically enhances the responsiveness of our trading application and shows how a small infrastructure change can lead to significant performance gains.

For trading systems, connection pooling isn’t just an optimization – it’s an essential feature that can mean the difference between executing a profitable trade or missing an opportunity.


This blog post describes performance optimizations implemented in OpenAlgo Platform, a Python-based algorithmic trading platform.

