How to make concurrent Web3 RPC calls with Python

Andrei Popa

September 15, 2023 in Tutorials

First, a little context on the origin of this article. A member of the developer community asked us how to loop through a large range of blocks in Python and, for each block, use the block number as a parameter in a smart contract method call.

The developer’s problem was that the script took too long to run. After some back-and-forth, we realized what the issue was: Python itself.

The issue

In case you didn’t know, Python and web3.py are largely synchronous. This means that operations are processed sequentially, one after the other, until the script ends.

For example, if you are making thousands of RPC requests, in this case looping through different block numbers to be used as parameters in smart contract calls, each of those calls you’re making will go one… wait till it completes, two… wait till it completes, and so on.

This means that for use cases like this one, or any use case where you need to make a large number of Web3 RPC calls, Python can get extremely slow very quickly.

The question that we aim to answer in this tutorial is: how do we make a large number of Web3 requests without it taking a lot of time?

There are a few ways of doing this. For this tutorial, we will combine asynchronous functions with multi-threaded processing to make the RPC calls concurrently, which significantly cuts down the time that these typically sequential calls take.

Let’s utilize asynchronous and multi-threaded Python programming to send concurrent Web3 RPC requests.
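Before touching a real node, here is a minimal sketch that simulates the effect. It is not part of the script we build below: fake_rpc_call is a hypothetical stand-in that sleeps for an assumed 20 ms instead of sending a request, and run_sequential, run_threaded, and timed are illustrative helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY = 0.02  # seconds; an assumed round-trip time, not a measurement


def fake_rpc_call(block_num):
    """Hypothetical stand-in for a blocking RPC request."""
    time.sleep(SIMULATED_LATENCY)  # the thread only waits, as it would on network I/O
    return block_num


def run_sequential(blocks):
    # One request at a time: total time is roughly len(blocks) * latency.
    return [fake_rpc_call(b) for b in blocks]


def run_threaded(blocks, max_workers=10):
    # Up to max_workers requests are waiting at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_rpc_call, blocks))


def timed(fn, *args):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    blocks = list(range(50))
    _, seq_time = timed(run_sequential, blocks)
    _, thr_time = timed(run_threaded, blocks)
    print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

With these assumed numbers, the sequential run should take roughly 50 × 20 ms, and the threaded run with 10 workers roughly a tenth of that. A real node will behave differently, so treat this as an illustration only.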

The goal of this tutorial

Loop through 500 blocks, and for each block, get the balance of an address at that specific point in time.

Import the libraries

Before we get started, we need to import a few base libraries: asyncio, concurrent.futures, Web3, and time.

import asyncio
from concurrent.futures import ThreadPoolExecutor
from web3 import Web3
import time

Define the base Web3 object

web3 = Web3(Web3.HTTPProvider("input RPC URL here"))

Remember our goal. We need to find the balance of an address, at different points in time in the past. This is what we call historical states. In order to do this, we’ll need an archive node.

Spin up a Chainstack node

You can deploy an archive node through Chainstack if you subscribe to our Growth plan.

Follow the steps in the video below to do that.

After you deploy the node, get your endpoint.

Please note that node latency is another important factor in how fast this process runs.
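One rough way to gauge latency is to time a handful of cheap calls. The sketch below assumes nothing about your node: measure_latency is a hypothetical helper that times any callable you give it, and the demo uses time.sleep as a stand-in for a request.

```python
import statistics
import time


def measure_latency(call, samples=5):
    """Return the median duration, in seconds, of `samples` invocations of `call`."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        call()  # the operation being timed, e.g. one cheap RPC request
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


if __name__ == "__main__":
    # Stand-in for a request that takes about 10 ms.
    latency = measure_latency(lambda: time.sleep(0.01))
    print(f"median latency: {latency * 1000:.1f} ms")
```

With a real endpoint, you could pass something like lambda: web3.eth.block_number as the callable, using the web3 object defined above.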

Define other variables

address = "input wallet address here"  # the address of the account whose balance we'll be pulling
start_block = web3.eth.block_number  # the most recent block
end_block = start_block - 500

Define the main function

Let’s go ahead and define the main function for actually getting the balance of the address itself.

def get_balance_at_block(block_num):
    balance = web3.eth.get_balance(address, block_identifier=block_num)
    print(f"Balance at block {block_num}: {web3.from_wei(balance, 'ether')} ETH")

Now that we have this function, we’ll define our main asynchronous function. It uses ThreadPoolExecutor to call get_balance_at_block for every block in the range that we’ve defined.

Define async function

async def main():
  with ThreadPoolExecutor(max_workers = 10) as executor:
    tasks = [
      loop.run_in_executor(
        executor,
        get_balance_at_block,
        block_num
      ) for block_num in range(start_block, end_block, -1)
    ]
    await asyncio.gather(*tasks)

Let’s look at what the code does.

We defined an asynchronous function called main that creates a ThreadPoolExecutor with 10 worker threads.

We defined a list called tasks. For each block number, we call the run_in_executor method on a variable called loop, which will be defined a bit later, and pass it 3 arguments: executor, get_balance_at_block, and block_num. Each call schedules get_balance_at_block(block_num) on one of the worker threads.

The list is built by looping through the blocks from start_block down to end_block in descending order, with end_block itself excluded. Finally, asyncio.gather waits for all of the tasks to finish.
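The function above prints each balance as it arrives. If you would rather collect the values, a variation along the following lines may help. This is a sketch, not the script from this tutorial: collect_balances is a hypothetical name, and fake_get_balance stands in for the real web3 call so the example runs without a node.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def fake_get_balance(block_num):
    # Hypothetical stand-in for web3.eth.get_balance(address, block_identifier=block_num).
    return block_num * 10


async def collect_balances(fetch, start_block, end_block, max_workers=10):
    loop = asyncio.get_running_loop()
    blocks = list(range(start_block, end_block, -1))  # descending, end_block excluded
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        tasks = [loop.run_in_executor(executor, fetch, b) for b in blocks]
        # gather returns results in the same order as the tasks list,
        # regardless of which request finished first.
        results = await asyncio.gather(*tasks)
    return dict(zip(blocks, results))


balances = asyncio.run(collect_balances(fake_get_balance, 105, 100))
print(balances)
# {105: 1050, 104: 1040, 103: 1030, 102: 1020, 101: 1010}
```

Because gather preserves task order, the balances can be matched back to their block numbers even though the worker threads finish in an unpredictable order.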

Define the loop variable

loop = asyncio.new_event_loop()  # get_event_loop() is deprecated when no event loop is running

Under the loop definition, we will also start a timer. This step is optional, but we include it for demonstration purposes, to see how long the entire process takes.

start = time.time()
loop.run_until_complete(main())
print(f"Time to completion: {time.time() - start}")

Perfect, now you should have the complete script. You can go ahead and run it, and see what it does.
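As a side note, with hundreds of requests in flight, a single failed request would make asyncio.gather raise and you would lose the other results. One possible way to handle this, sketched here under assumptions, is to pass return_exceptions=True. The names flaky_fetch and fetch_all are hypothetical, and the failure is simulated rather than coming from a real node.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def flaky_fetch(block_num):
    # Hypothetical stand-in for an RPC call; fails on one block to simulate an error.
    if block_num == 3:
        raise ConnectionError(f"simulated failure at block {block_num}")
    return block_num * 100


async def fetch_all(fetch, blocks, max_workers=10):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        tasks = [loop.run_in_executor(executor, fetch, b) for b in blocks]
        # return_exceptions=True places raised exceptions in the result list
        # instead of propagating the first one to the caller.
        outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    succeeded = {b: r for b, r in zip(blocks, outcomes) if not isinstance(r, Exception)}
    failed = [b for b, r in zip(blocks, outcomes) if isinstance(r, Exception)]
    return succeeded, failed


succeeded, failed = asyncio.run(fetch_all(flaky_fetch, [5, 4, 3, 2, 1]))
print(succeeded)  # {5: 500, 4: 400, 2: 200, 1: 100}
print(failed)     # [3]
```

The failed block numbers could then be retried in a second pass.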

When you compare the method above with the standard way of processing requests like this in Python, you will actually get some surprising results in terms of speed.

In our case, the method shown above took a bit over 5 seconds to get the job done, while a standard Python setup that uses no asynchronous or multi-threaded processing took 13.5 seconds for the same task.

Standard Python takes a lot more time because it sends each request only after the previous one has completed, so the script spends most of its time waiting.
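To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the figures reported above, with "a bit over 5 seconds" rounded to an assumed 5.0.

```python
# Figures reported in this article.
TOTAL_REQUESTS = 500
SEQUENTIAL_SECONDS = 13.5
CONCURRENT_SECONDS = 5.0  # assumed round figure for "a bit over 5 seconds"
MAX_WORKERS = 10

per_request = SEQUENTIAL_SECONDS / TOTAL_REQUESTS  # average cost of one sequential call
speedup = SEQUENTIAL_SECONDS / CONCURRENT_SECONDS  # observed improvement
ideal = SEQUENTIAL_SECONDS / MAX_WORKERS           # best case if 10 workers scaled perfectly

print(f"average time per sequential request: {per_request * 1000:.0f} ms")  # 27 ms
print(f"observed speedup: {speedup:.1f}x")                                  # 2.7x
print(f"ideal time with {MAX_WORKERS} workers: {ideal:.2f} s")              # 1.35 s
```

The observed speedup of roughly 2.7x is well short of the 10x that 10 workers would give in the ideal case. Our measurements do not say why, so any explanation, such as node-side limits or per-request overhead, would be a guess rather than a finding.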

JavaScript version

For comparison, here is the same task in JavaScript, using standard asynchronous functions. Specifically, we process all of the requests through a single Promise.all call.

const { Web3 } = require('web3');
const web3 = new Web3('your Chainstack endpoint here')
const address = 'input wallet address here';
async function getBalanceAtBlock(blockNum) {
    const balanceWei = await web3.eth.getBalance(address, blockNum);
    console.log(`Balance at block ${blockNum}: ${web3.utils.fromWei(balanceWei, 'ether')} ETH`);
}
async function main() {
    let startBlock = await web3.eth.getBlockNumber();
    startBlock = Number(startBlock);
    const endBlock = startBlock - 500;
    const blockRange = Array.from({length: startBlock - endBlock}, (_, i) => startBlock - i);
    const start = Date.now();
    await Promise.all(blockRange.map(blockNum => getBalanceAtBlock(blockNum)));
    const end = Date.now();
    console.log(`Time taken: ${(end - start) / 1000} seconds`)
}
main();

When compared to the Python scripts, this one took 3.1 seconds.

Recap of what we did

Imported the libraries.
Defined the RPC node itself.
Defined some base variables.
Defined the get_balance_at_block function that gets the balance of the address at the block number in the range that we are looping through.
Defined the main function, that uses the ThreadPoolExecutor to run get_balance_at_block concurrently through the range of blocks that we defined.
Created a clock to measure how much time it takes to finish the whole process.

Congratulations if you’ve made it to the end. If you are looking for more tutorials, feel free to check out our docs and Web3 tutorials.