Solana Python tutorial: Querying and analyzing data for STEPN mints
This blog is for people who:
- Want to learn how to retrieve data from Solana.
- Want to learn about Metaplex NFTs.
- Are interested in STEPN.
To elaborate on that, this article shows you how to query data from a Solana node endpoint for newly minted STEPN shoes, and visualize the retrieved data using Python.
The key takeaways:
- How to query data from Solana using Solana.py.
- How to identify transactions for NFT minting.
- Retrieving and decoding metadata for Metaplex NFTs.
- Retrieving shoe attributes from the STEPN server.
- Data visualization with Python.
You can use any Solana node endpoint for this tutorial. However, some endpoints may return errors because of rate limits or authentication requirements.
The sample code is developed and tested with a Chainstack Solana node endpoint—it is highly recommended for the following reasons:
- No limits on the request rate.
- Free Solana node endpoint with 3 million free requests.
- Fast node deployment.
Simply follow these steps to deploy a Solana node:
In this article, an HTTPS endpoint is used.
What is STEPN?
In case you have never heard about it, STEPN is a popular Web3 game launched in December 2021.
The game is built around the move-to-earn concept: users earn its native token, GST, by walking, jogging, or running. GST can be used to purchase power-ups or repair shoes in-game, or it can be exchanged for other tokens. STEPN first launched on Solana and quickly became the most anticipated game on the chain.
To use STEPN, users must own at least one sneaker. Currently, there are 4 types of shoes available on the market: common, uncommon, rare, and epic. The rarer a shoe is, the better its stats are.
More details can be found on STEPN’s official website.
Check the source code for this blog post.
You may use a Jupyter notebook to open it locally. You can also simply click the link below to use an online notebook:
The first thing to do is fill in the HTTPS endpoint in
Fill in your Chainstack Solana endpoint here.
Then click Runtime > Run all:
Or press this run button to go cell-by-cell.
That’s all that’s needed.
The script takes about 5 minutes to complete. It retrieves data for the past 30 minutes from Solana and plots the data on graphs.
When it is done, you should get a list of newly minted shoes:
Distribution of shoe types. For example, only 1.82% of the total mints are trainers.
Distribution of rarity.
Most shoes are common; only 1.82% of them are rare. No epic shoes were minted.
More than half of the shoes have been minted 2 times.
Most shoes are between level 5 and level 10.
This is the attribute distribution. Efficiency is probably the most important attribute among the four. However, there is one shoe with a luck value of 160. What a lucky shoe.
The ultimate developers’ guide
Below is a high-level schema of the program.
You will be guided step-by-step through this.
Solana.py is a Python SDK for Solana. It can be installed by running:
pip install solana
Load an RPC endpoint:
from solana.rpc.api import Client
http_client = Client("https://api.devnet.solana.com")
Its own documentation is limited, so you can use Solana's official RPC API documentation as a reference.
Querying STEPN transactions
The first step is to get all transactions related to STEPN. This can be done with the getSignaturesForAddress method.
getSignaturesForAddress returns all transaction signatures related to a designated account. For STEPN, the account is
The Python code:
Which is equivalent to getting all transaction history in Solana explorer:
The sample script queries 200 transaction signatures at a time and pauses for 3 seconds before starting the next round.
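The paging loop described above can be sketched as follows. This is a minimal sketch, not the tutorial's original script: it calls the getSignaturesForAddress JSON-RPC method directly over HTTPS with the standard library instead of going through Solana.py, and the names rpc_call, fetch_signatures, and since_ts, as well as the endpoint placeholder, are assumptions made for illustration.

```python
import json
import time
import urllib.request

STEPN_ADDRESS = "STEPNq2UGeGSzCyGVr2nMQAzf8xuejwqebd84wcksCK"


def rpc_call(endpoint, method, params):
    """Send one JSON-RPC request and return its "result" field."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    ).encode()
    request = urllib.request.Request(
        endpoint, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    if "error" in body:
        raise RuntimeError(body["error"])
    return body["result"]


def fetch_signatures(call, address, since_ts, page_size=200, pause=3.0,
                     sleep=time.sleep):
    """Page backwards through an address's history until since_ts (Unix seconds).

    `call(method, params)` is any function that performs the RPC request.
    """
    signatures = []
    before = None
    while True:
        options = {"limit": page_size}
        if before is not None:
            options["before"] = before  # continue from the oldest signature seen
        page = call("getSignaturesForAddress", [address, options])
        if not page:
            break  # no more history available on this node
        for entry in page:
            block_time = entry.get("blockTime")
            if block_time is not None and block_time < since_ts:
                return signatures  # reached the start of the time window
            if entry.get("err") is None:  # skip failed transactions
                signatures.append(entry["signature"])
        before = page[-1]["signature"]
        sleep(pause)  # pause between rounds to stay under rate limits
    return signatures


# Usage (needs network access and your own endpoint URL):
# from functools import partial
# cutoff = int(time.time()) - 30 * 60
# call = partial(rpc_call, "https://<your-endpoint>")
# recent = fetch_signatures(call, STEPN_ADDRESS, cutoff)
```

Passing the request function in as `call` keeps the paging logic separate from the transport, so the same loop works with a Solana.py client wrapper or a stub.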
Identify the transactions for shoe minting
Not all the transactions associated with the address STEPNq2UGeGSzCyGVr2nMQAzf8xuejwqebd84wcksCK are for shoe minting. This account is the main address for STEPN. Transactions that go through it include:
- Receiving SOL from users for shoe purchasing.
- Receiving GMT from users for shoe upgrading.
- Shoe minting.
- Selling shoes or GST.
So the next step is to filter out the irrelevant transactions.
For STEPN, shoes are minted as Metaplex NFTs. Metaplex is an NFT standard for Solana. This is a typical transaction for minting shoes.
It usually involves the following accounts.
And token as output.
A minting transaction usually has the following instructions in order:
Therefore, to find out which transactions are for NFT minting, all we need to do is identify transactions that match the accounts and instruction order described above.
Gets the job done.
It compares the main accounts and the transaction schema, and makes sure only minting transactions are passed to the next stage.
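A filter of this kind can be sketched as below. This is an illustration under stated assumptions, not the tutorial's original function: the required account set, the check that a top-level instruction calls the Metaplex Token Metadata program, and the check for a post-transaction token balance of exactly 1 with 0 decimals are all assumptions about what a mint looks like, and looks_like_mint is a hypothetical name. The input is the JSON returned by getTransaction with the default "json" encoding.

```python
STEPN_ADDRESS = "STEPNq2UGeGSzCyGVr2nMQAzf8xuejwqebd84wcksCK"
TOKEN_PROGRAM = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"
METAPLEX_METADATA_PROGRAM = "metaqbxxUerdq28cj1RbAWkYQm3ybzjb6a8bt518x1s"

# Assumption: a minting transaction references all of these accounts.
REQUIRED_ACCOUNTS = {STEPN_ADDRESS, TOKEN_PROGRAM, METAPLEX_METADATA_PROGRAM}


def looks_like_mint(tx_detail):
    """Return True if a getTransaction response resembles an NFT mint."""
    result = tx_detail.get("result")
    if not result or result["meta"].get("err") is not None:
        return False  # missing or failed transaction

    message = result["transaction"]["message"]
    keys = message["accountKeys"]
    if not REQUIRED_ACCOUNTS.issubset(keys):
        return False

    # Program invoked by each top-level instruction.
    programs = [keys[ix["programIdIndex"]] for ix in message["instructions"]]
    if METAPLEX_METADATA_PROGRAM not in programs:
        return False

    # An NFT mint leaves a token account holding exactly 1 unit, 0 decimals.
    balances = result["meta"].get("postTokenBalances") or []
    return any(
        bal["uiTokenAmount"]["amount"] == "1"
        and bal["uiTokenAmount"]["decimals"] == 0
        for bal in balances
    )
```

If the instruction order matters for your use case, the `programs` list can be compared position by position against the expected sequence instead of using a membership test.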
Getting metadata for new shoes
Open the Solana explorer and search for any STEPN NFT: example. The metadata and attributes can be found on the page.
Most Metaplex NFTs come with these two pieces of information. Metadata is usually standardized and is defined during minting. Attributes vary from token to token and are usually not defined during minting.
For STEPN shoes, the attribute for shoes is retrieved from their API endpoint. The endpoint address is defined in the metadata information. For example for this shoe, its attributes information is hosted on
To get the metadata information, we need to find the shoe's metadata account. The metadata account is created during the minting transaction, which is always the first transaction in the mint account's history.
Open the minting transaction page. The instruction to define and store metadata is always the third instruction executed.
With this piece of knowledge in mind, in the Python code, we use:
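# Assumed indices: [2] is the third instruction (see above), [0] is the first account passed to it.
metadataIdx = txDetail["result"]["transaction"]["message"]["instructions"][2]["accounts"][0]
metaDataAddr = txDetail["result"]["transaction"]["message"]["accountKeys"][metadataIdx]
This retrieves the metadata account address.
Get the metadata information with: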
data = http_client.get_account_info(mintObj["metaAddress"])
Then finally decode the information with:
If you are wondering how decoding is done, the detailed methodology can be found in The mystery of Solana Metaplex NFT metadata encoding. Give it a read to understand why and how.
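For orientation, the decoding step can be sketched as below. This is not the tutorial's original decoder. It assumes the commonly documented Metaplex Token Metadata account layout: a 1-byte key, two 32-byte public keys (update authority and mint), then the name, symbol, and URI as strings prefixed with a 4-byte little-endian length and padded with null bytes, followed by a 2-byte seller fee. The function names are hypothetical, and the layout may differ between program versions.

```python
import base64
import struct

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(raw):
    """Encode bytes as base58, the text form used for Solana addresses."""
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, remainder = divmod(num, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def read_string(data, offset):
    """Read a length-prefixed string; return it and the next offset."""
    (length,) = struct.unpack_from("<I", data, offset)
    start = offset + 4
    text = data[start:start + length].decode("utf-8").rstrip("\x00")
    return text, start + length


def decode_metadata(data):
    """Decode the leading fields of a metadata account (assumed layout)."""
    update_authority = b58encode(data[1:33])
    mint = b58encode(data[33:65])
    name, offset = read_string(data, 65)
    symbol, offset = read_string(data, offset)
    uri, offset = read_string(data, offset)
    (seller_fee,) = struct.unpack_from("<H", data, offset)
    return {
        "update_authority": update_authority,
        "mint": mint,
        "name": name,
        "symbol": symbol,
        "uri": uri,
        "seller_fee_basis_points": seller_fee,
    }


def decode_metadata_b64(encoded):
    """Decode account data delivered as a base64 string."""
    return decode_metadata(base64.b64decode(encoded))
```

With a JSON-style response, the base64 string is typically found at `data["result"]["value"]["data"][0]`. That path is an assumption about the response shape and depends on the Solana.py version and the requested encoding. The decoded `uri` field is where the shoe's attribute endpoint is expected to appear.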
At this stage, your data should be ready for consumption. You can easily load it into a dataframe and process it.
Tips, warnings, errors, and solutions
For anyone who wants to customize or reuse the code, here is some advice that may help you along the way.
The Solana network has a very high transaction throughput. There are usually 2-3 thousand transactions per hour that are related to STEPN. This program sends 1 request for every transaction to retrieve its information, so the overall request count can be quite significant.
For example, analyzing 24 hours’ worth of data consumes about 50k requests and takes about 30 minutes to run. Please be mindful of any extra charges from your node provider.
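The request count can be estimated with simple arithmetic. The sketch below assumes one detail request per transaction plus one signature-listing request per 200 transactions. The function name is hypothetical, and the estimate is a lower bound because it ignores the extra account and attribute lookups made for each minted shoe.

```python
def estimate_requests(tx_per_hour, hours, page_size=200):
    """Rough lower bound on node requests for a given time window."""
    detail_requests = tx_per_hour * hours  # one request per transaction
    signature_pages = -(-detail_requests // page_size)  # ceiling division
    return detail_requests + signature_pages


low = estimate_requests(2000, 24)   # 48000 + 240 = 48240
high = estimate_requests(3000, 24)  # 72000 + 360 = 72360
print(low, high)
```

At 2-3 thousand transactions per hour, a 24-hour window lands between roughly 48k and 72k requests, so the 50k figure sits near the low end of that range.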
About archive data
Solana nodes discard older transaction data to maintain scalability. Older transactions may not be retrievable on a regular Solana node.
Many Solana node providers limit the number of requests that they can receive from a user. The STEPN server does that too. If an error happens during data querying, pausing between requests may solve the issue. In the Python code, you can define this parameter to fix the duration.
The predefined duration is 1.5 seconds. You can gradually increase or decrease it if needed.
Note that this sample code is based on the current version of STEPN. Since the app is under continuous development, this program may need modification in the future to stay compatible with newer versions of STEPN.
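One way to apply such a pause is to wrap each request in a small helper. This is a minimal sketch and not the tutorial's original code: call_with_pause, pause, and retries are hypothetical names, and the growing wait after a failure is a design choice added here for illustration.

```python
import time


def call_with_pause(func, *args, pause=1.5, retries=3, sleep=time.sleep, **kwargs):
    """Call func, pausing after success and waiting longer after each failure."""
    last_error = None
    for attempt in range(retries):
        try:
            result = func(*args, **kwargs)
            sleep(pause)  # regular pause between requests
            return result
        except Exception as error:  # narrow this to your client's error types
            last_error = error
            sleep(pause * (attempt + 2))  # wait longer before retrying
    raise last_error
```

For example, `call_with_pause(http_client.get_account_info, address, pause=2.0)` would wait 2 seconds after each successful call. Increase `pause` if the node or the STEPN server keeps rejecting requests.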
If that happens, feel free to ping me on Twitter/Telegram/Discord.
This is the end of this tutorial. Hope you will find it useful.
Thanks for reading.