
EVM nodes: A dive into the full nodes vs. archive nodes

Introduction 

Networks based on the Ethereum Virtual Machine (EVM) typically can run two types of nodes: a full node and an archive node. 

Chainstack supports many popular EVM-based networks, including Ethereum, Polygon, BNB Smart Chain, Avalanche C-Chain, Fantom, and Harmony.

The difference between a full node and an archive node is that while both store the full blockchain data that can be used to replay the network states, the archive node additionally stores the network state at each block in an archive that can be queried. 

That’s the short explanation. 

In this article, we are going to dive into some details, differences, and operations examples for the full node and the archive node. 

Geth and Erigon 

First, a quick word on the node clients. 

Go Ethereum (Geth) is by far the most popular client software for EVM-based networks, and probably in the blockchain space as a whole.

For the Ethereum mainnet, you can check the node client distribution in the ethernodes crawler data.

Chainstack supports running Ethereum nodes using the Geth client or the Erigon client (formerly, Turbo-Geth)—the latter being another Go implementation focused on efficiency and the second most popular client. 

In this article, we are focusing on the Geth and Erigon implementations for the full and the archive node modes. 

Full nodes and archive nodes

Let’s dive into the details. 

What is a full node

  • Stores full blockchain data. 
  • Verifies all blocks and states. 
  • All states can be regenerated from a full node, although this will require additional state re-execution past the default 128 blocks for most EVM-based clients. 

A full EVM node keeps the current state of the blockchain and handles read-only calls and state-changing calls (transactions). A full node prunes the blockchain data to conserve disk space and reduce the sync time but stores enough data to recalculate the chain events in case of necessity. This makes it more efficient to run, but it also limits requests to a specific number of blocks (128 blocks, typically). 

For example, on the Ethereum mainnet, with the average time to produce a new block at about 13 seconds, you can only retrieve the chain states from roughly the last 28 minutes (128 blocks × 13 seconds). Although, in theory, a full node can recalculate all the intermediate states, doing so would take an exceptionally long time and be very resource-intensive, and your node might run out of memory and stop.
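The arithmetic behind that window is easy to sketch. The helper below is only an illustration: the function name is hypothetical, and the 128-block and 13-second figures are the ones quoted in this article.

```python
def state_window_minutes(blocks: int, block_time_s: float) -> float:
    """Return how many minutes of history the given number of recent states covers."""
    return blocks * block_time_s / 60


# 128 retained states at about 13 seconds per block
print(round(state_window_minutes(128, 13), 1))  # 27.7
```

Plugging in a different block time or a different number of retained states gives the window for other networks.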

Default back states and the “Missing trie node” error 

Depending on the chain you access and the client you are using, you will be limited in how many blocks for the available back states you can access: 

  • Ethereum: 128 blocks 
  • Polygon: 128 blocks 
  • BNB Smart Chain: 128 blocks 
  • Avalanche C-Chain: 32 blocks
  • Fantom: The Go Opera client does not prune information, so there is no difference between a full and an archive node. 
  • Harmony: 128 blocks
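To make the list above concrete, here is a minimal sketch that checks whether a block should still be within reach of a default full node. The chain labels, the dictionary, and the function name are hypothetical, and treating the latest block as part of the window is an assumption; real clients may count the boundary differently.

```python
from typing import Optional

# Default number of recent states a full node keeps, taken from the list above.
# None means the client does not prune state (Fantom's Go Opera).
DEFAULT_BACK_STATES = {
    "ethereum": 128,
    "polygon": 128,
    "bnb": 128,
    "avalanche": 32,
    "fantom": None,
    "harmony": 128,
}


def full_node_can_serve(chain: str, latest_block: int, target_block: int) -> bool:
    """Return True if a default full node should still hold the target state."""
    if target_block > latest_block:
        return False  # the block does not exist yet
    window: Optional[int] = DEFAULT_BACK_STATES[chain]
    if window is None:
        return True  # nothing is pruned, so every state is available
    # Assumption: the window includes the latest block itself.
    return latest_block - target_block < window


print(full_node_can_serve("ethereum", 14641000, 14640900))  # True
print(full_node_can_serve("ethereum", 14641000, 1))         # False
```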

You will receive a missing trie node error if you try to query a block not accessible from a full node. 

In general, getting the missing trie node error means you need an archive node. 
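If you handle raw JSON-RPC responses yourself, you can spot this situation in code. The sketch below assumes the error arrives as a standard JSON-RPC error object whose message contains the text "missing trie node"; the function name and the sample response are hypothetical, and the exact message can vary between clients and versions.

```python
def needs_archive_node(response: dict) -> bool:
    """Return True if a JSON-RPC response carries a 'missing trie node' error."""
    error = response.get("error")
    if not error:
        return False
    return "missing trie node" in str(error.get("message", "")).lower()


# Hypothetical response shaped like a JSON-RPC error; the node hash is elided.
sample = {
    "jsonrpc": "2.0",
    "id": 1,
    "error": {"code": -32000, "message": "missing trie node ..."},
}
print(needs_archive_node(sample))  # True
```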

What is an archive node?

  • Stores everything kept in the full node and builds an archive of historical states.  
  • These are full nodes that have been configured to run in archive mode. 

Archive nodes essentially contain a snapshot of the entire blockchain and hold all previous states of the network from the genesis block (first block mined). This makes archive nodes great for quickly querying historical data without the state regeneration, and this is ideal for developers who create analytical tools, DApps, and other services that need quick history access.

As archive nodes keep the entire chain states, they are also much bigger in size than the full nodes. At the time of writing, the Ethereum mainnet is about 10 TB in size (etherscan.io). 

To start a new archive node, the system needs to synchronize all that data before it can start running on the network. This leads to high starting and maintenance costs, given that at this point it would take months to complete the syncing process, and constant maintenance is necessary to keep up with the growing disk size requirements. 

Archive mainnet state sizes for reference 

Note that these keep growing. The data is valid at the time of this blog post. 

  • Ethereum mainnet: ~12 TB
  • Polygon mainnet: ~16 TB
  • BNB Smart Chain: ~7 TB
  • Fantom mainnet: ~4 TB
  • Harmony mainnet: ~20 TB
  • Avalanche mainnet: ~3 TB   

Note that the BNB Smart Chain figure is for the Erigon client, which keeps the archive data smaller than Geth does.

This shows how much data is involved in all these chains, and if you want to access your own node, you will need to download all that data and validate it before being able to run the node. This can take months for an archive node. 

How to deploy a node on blockchain within minutes?

You can deploy your own node in a matter of minutes thanks to Chainstack. Using our fast synchronization technology called Bolt, Chainstack allows you to deploy a full or archive node in just a few minutes, saving weeks and months of work and resources. 

To get a node: 

  1. Sign up with Chainstack.
  2. Deploy a full or an archive node.

Call methods that take past states

Now it is clear that to access data older than the last 128 blocks, we need to use an archive node. The following Geth JSON-RPC methods include a parameter that lets you specify which block to retrieve the data from:

  • eth_getBalance
  • eth_getCode
  • eth_getTransactionCount
  • eth_getStorageAt
  • eth_call

Let’s have a look at each of these methods and try them out. 

Again, if you need quick access to an archive node, get one with Chainstack.
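All the examples in this article share one request shape: the block goes in as the last entry of params. Here is a minimal sketch of that shape using only the Python standard library; the helper names are hypothetical, and sending the body to your node is left out.

```python
import json
from typing import Any, List, Union


def block_param(block: Union[int, str]) -> str:
    """Encode a block number as hex; pass tags such as 'latest' through unchanged."""
    return hex(block) if isinstance(block, int) else block


def build_request(method: str, params: List[Any], block: Union[int, str], request_id: int = 1) -> str:
    """Return a JSON-RPC 2.0 request body with the block appended as the last parameter."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": [*params, block_param(block)],
    }
    return json.dumps(body)


print(build_request("eth_getBalance", ["0x9D00f1630b5B18a74231477B7d7244f47138ab47"], 14641000))
```

The same helper covers the other methods by changing the method name and the leading parameters.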

eth_getBalance 

Retrieve an address balance at a specific point in time (block).

For details, see Ethereum Wiki: eth_getBalance.

Web3.py

Retrieve an address balance from the state at block number 1 using web3.py. 

Running this code on a full node returns an error, because the state at block number 1 is far outside the range a full node keeps.

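from web3 import Web3
# Replace the placeholder with the HTTPS endpoint of your archive node.
node_url = "CHAINSTACK_ARCHIVE_NODE_URL"
web3 = Web3(Web3.HTTPProvider(node_url))
# The second argument is the block whose state is queried (block 1 here).
balance = web3.eth.get_balance("0x9D00f1630b5B18a74231477B7d7244f47138ab47", 1)
# from_wei is the web3.py v6+ name; on v5 the equivalent is fromWei.
print(web3.from_wei(balance, "ether"))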

We can still run eth_getBalance on a full node but cannot go back more than 128 blocks.

Web3.js

Retrieve an address balance using web3.js. In this case we are querying the state at block number 14641000.

var Web3 = require('web3')
var node_URL = 'CHAINSTACK_ARCHIVE_NODE_URL'
var web3 = new Web3(node_URL)
web3.eth.getBalance('0x9D00f1630b5B18a74231477B7d7244f47138ab47', 14641000, (err, balance) => {
    console.log(web3.utils.fromWei(balance, 'ether'))
})

The fromWei method converts the value returned by the node, which is denominated in Wei, into ether.

cURL

Retrieve an address balance using cURL. In this case we are querying the state at block number 14641000.

Note that the block number and the returned value are in hex.

curl CHAINSTACK_ARCHIVE_NODE_URL \
  -X POST \
  -H "Content-Type: application/json" \
  --data '{"method":"eth_getBalance","params":["0x9D00f1630b5B18a74231477B7d7244f47138ab47", "0xDF6768"],"id":1,"jsonrpc":"2.0"}'
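If you want to produce the hex block number for a request like this, or read the hex value that comes back, the conversions can be sketched in a few lines of Python. The helper names are hypothetical, and the sample balance is a made-up value equal to exactly 1 ether, not the real result of this query.

```python
from decimal import Decimal

WEI_PER_ETHER = Decimal(10) ** 18


def to_hex_block(block_number: int) -> str:
    """Encode a block number the way JSON-RPC expects it (hex is case-insensitive)."""
    return hex(block_number)


def hex_wei_to_ether(hex_value: str) -> Decimal:
    """Convert a hex-encoded Wei amount into ether."""
    return Decimal(int(hex_value, 16)) / WEI_PER_ETHER


print(to_hex_block(14641000))                 # 0xdf6768
print(hex_wei_to_ether("0xde0b6b3a7640000"))  # 1
```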

eth_getCode 

Returns the compiled bytecode of a smart contract. 

For details, see Ethereum Wiki: eth_getCode.

The example below will get you the bytecode of the Uniswap token at the state of the very first block when it was deployed: block 10861674.

Web3.py

from web3 import Web3 
node_url = "CHAINSTACK_ARCHIVE_NODE_URL" 
web3 = Web3(Web3.HTTPProvider(node_url)) 
code = web3.eth.get_code("0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984", 10861674) 
print(code) 

Web3.js

var Web3 = require('web3')
var node_URL = 'CHAINSTACK_ARCHIVE_NODE_URL'
var web3 = new Web3(node_URL)
web3.eth.getCode('0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984', 10861674, (err, byte) => {
    console.log(byte)
})

cURL

Note that the block number is in hex.

curl CHAINSTACK_ARCHIVE_NODE_URL \
  -X POST \
  -H "Content-Type: application/json" \
  --data '{"method":"eth_getCode","params":["0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984", "0xA5BC6A"],"id":1,"jsonrpc":"2.0"}'

The getCode RPC method can be used to verify if a contract has been deployed or destroyed correctly.  
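As a rough sketch of that check: eth_getCode returns an empty value ("0x") for an address that holds no code at the queried block, so comparing the result against empty tells you whether a contract existed at that point. The function name is hypothetical, and the sample values are placeholders rather than real results.

```python
def has_contract_code(code_hex: str) -> bool:
    """Return True if an eth_getCode result contains any bytecode."""
    stripped = code_hex[2:] if code_hex.startswith("0x") else code_hex
    return len(stripped) > 0


print(has_contract_code("0x"))        # False
print(has_contract_code("0x608060"))  # True
```

On an archive node, you could run the query once just before the deployment block and once at it, and expect the answer to change from False to True.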

eth_getTransactionCount 

Returns the number of transactions sent from an address at the state of a specific block. 

For details, see Ethereum Wiki: eth_getTransactionCount.

The example below will fetch the transaction count (nonce) of an address at the state of block 14674300.

Web3.py

from web3 import Web3 
node_url = "CHAINSTACK_ARCHIVE_NODE_URL" 
web3 = Web3(Web3.HTTPProvider(node_url)) 
tx_count = web3.eth.get_transaction_count("0x9D00f1630b5B18a74231477B7d7244f47138ab47", 14674300) 
print(tx_count) 

Web3.js

var Web3 = require('web3');
var node_URL = 'CHAINSTACK_ARCHIVE_NODE_URL';
var web3 = new Web3(node_URL);
web3.eth.getTransactionCount('0x9D00f1630b5B18a74231477B7d7244f47138ab47', 14674300, (err, count) => {
    console.log(count)
})

cURL

Note that the block number and the returned value are in hex.

curl CHAINSTACK_ARCHIVE_NODE_URL \
  -X POST \
  -H "Content-Type: application/json" \
  --data '{"method":"eth_getTransactionCount","params":["0x9D00f1630b5B18a74231477B7d7244f47138ab47", "0xDFE97C"],"id":1,"jsonrpc":"2.0"}'

The getTransactionCount RPC method is used to retrieve the nonce of an address. The nonce is an integer value representing how many transactions that account has sent. It is useful for avoiding duplicate transactions.
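Because the count is tied to a block, two queries on an archive node are enough to tell how many transactions an address sent over a block range. The sketch below is an illustration under that assumption; the function name is hypothetical, and the lookup function is passed in so the example runs without a node. The fake counts are made up.

```python
from typing import Callable


def sent_between(get_count: Callable[[str, int], int], address: str,
                 start_block: int, end_block: int) -> int:
    """Transactions sent by the address after start_block, up to and including end_block."""
    return get_count(address, end_block) - get_count(address, start_block)


# Made-up nonces standing in for real node responses.
fake_counts = {14674000: 40, 14674300: 42}
lookup = lambda address, block: fake_counts[block]
print(sent_between(lookup, "0x9D00f1630b5B18a74231477B7d7244f47138ab47", 14674000, 14674300))  # 2
```

With a live connection, you would pass web3.eth.get_transaction_count as the lookup function.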

eth_getStorageAt

Returns the value from a storage position at a given address. 

For details, see Ethereum Wiki: eth_getStorageAt.

The examples below will return the storage value of the simple storage contract.

The last value change was in block 7500943, so you can use it as a reference point to retrieve the different storage values in time.

Web3.py

from web3 import Web3
node_url = "CHAINSTACK_ARCHIVE_NODE_URL"
web3 = Web3(Web3.HTTPProvider(node_url))
storage = web3.eth.get_storage_at("0x954De93D9f1Cd1e2e3AE5964F614CDcc821Fac64", 0, 7500943)
print(storage.decode("ASCII"))

Web3.js

var Web3 = require('web3'); 
var node_URL = 'CHAINSTACK_ARCHIVE_NODE_URL'; 
var web3 = new Web3(node_URL); 
web3.eth.getStorageAt('0x954De93D9f1Cd1e2e3AE5964F614CDcc821Fac64', 0, 7500943).then(result => {
  console.log(web3.utils.hexToAscii(result));
});

cURL

Note that the block number and the returned value are in hex.

curl CHAINSTACK_ARCHIVE_NODE_URL \
  -X POST \
  -H "Content-Type: application/json" \
  --data '{"method":"eth_getStorageAt","params":["0x954De93D9f1Cd1e2e3AE5964F614CDcc821Fac64", "0x0", "0x72748F"],"id":1,"jsonrpc":"2.0"}'
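The raw storage value is always a 32-byte word, so it usually needs decoding. The sketch below assumes the slot holds a Solidity short string (31 bytes or fewer), which is only a guess about this contract based on the ASCII decoding used above. In that layout the text sits in the leading bytes and the last byte holds the length times two. The function name and the sample word are hypothetical.

```python
def decode_short_string(word: bytes) -> str:
    """Decode a Solidity short string stored in a single 32-byte slot."""
    if len(word) != 32:
        raise ValueError("expected a 32-byte storage word")
    marker = word[31]
    if marker % 2 == 1:
        # Long strings store length * 2 + 1 here and keep the data in other slots.
        raise ValueError("slot holds a long string; data is stored elsewhere")
    length = marker // 2
    if length > 31:
        raise ValueError("not a valid short string encoding")
    return word[:length].decode("utf-8")


# Hand-built slot contents for illustration: "hello" plus the length marker.
sample = b"hello".ljust(31, b"\x00") + bytes([5 * 2])
print(decode_short_string(sample))  # hello
```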

eth_call 

Does a read-only call on the blockchain without changing any state.

For details, see Ethereum Wiki: eth_call.

The examples below call the balanceOf function of the Chainlink token for the Chainlink VRF coordinator address at block 14000000.

Web3.py

import json
from web3 import Web3
node_url = "CHAINSTACK_ARCHIVE_NODE_URL"
web3 = Web3(Web3.HTTPProvider(node_url))
# Only the balanceOf entry of the token ABI is needed for this call.
abi = json.loads('[{"constant":true,"inputs":[{"name":"_owner","type":"address"}],"name":"balanceOf","outputs":[{"name":"balance","type":"uint256"}],"payable":false,"stateMutability":"view","type":"function"}]')
address = "0x514910771AF9Ca656af840dff83E8264EcF986CA"
contract = web3.eth.contract(address=address, abi=abi)
# block_identifier selects the block whose state the call runs against.
balance = contract.functions.balanceOf('0x271682DEB8C4E0901D1a1550aD2e64D568E69909').call(block_identifier=14000000)
# LINK has 18 decimals, so the Wei-to-ether conversion applies. from_wei is the web3.py v6+ name; on v5 use fromWei.
print(web3.from_wei(balance, 'ether'))

Web3.js

const Web3 = require('web3');
const web3 = new Web3(new Web3.providers.HttpProvider("CHAINSTACK_ARCHIVE_NODE_URL"));
web3.eth.defaultBlock = 14000000;
web3.eth.call({
        to: "0x514910771AF9Ca656af840dff83E8264EcF986CA",
        data: "0x70a08231000000000000000000000000271682deb8c4e0901d1a1550ad2e64d568e69909"
    })
    .then(result => {
        console.log(web3.utils.fromWei(result));
    });

cURL

Note that the block number and the returned value are in hex.

curl CHAINSTACK_ARCHIVE_NODE_URL \
  -X POST \
  -H "Content-Type: application/json" \
  --data '{"method":"eth_call","params":[{"from":null,"to":"0x514910771AF9Ca656af840dff83E8264EcF986CA","data":"0x70a08231000000000000000000000000271682deb8c4e0901d1a1550ad2e64d568e69909"}, "0xD59F80"],"id":1,"jsonrpc":"2.0"}'
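The data field in the raw requests above is the 4-byte function selector followed by the address padded to 32 bytes. Here is a minimal sketch of how that string is put together and how the result is read back. The selector is copied from the requests above rather than computed, the helper names are hypothetical, and the sample result is a made-up value.

```python
BALANCE_OF_SELECTOR = "0x70a08231"  # copied from the requests above


def encode_balance_of(owner: str) -> str:
    """Build eth_call data for balanceOf: selector plus the address left-padded to 32 bytes."""
    owner_hex = owner[2:] if owner.startswith("0x") else owner
    if len(owner_hex) != 40:
        raise ValueError("expected a 20-byte address")
    return BALANCE_OF_SELECTOR + owner_hex.lower().rjust(64, "0")


def decode_uint256(result_hex: str) -> int:
    """Turn the hex result of the call into an integer token amount (smallest units)."""
    return int(result_hex, 16)


data = encode_balance_of("0x271682DEB8C4E0901D1a1550aD2e64D568E69909")
print(data)
print(decode_uint256("0x" + "0" * 63 + "a"))  # 10
```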

Conclusion 

As a recap, archive nodes hold the “history” of the blockchain and have a record of every previous state of the network from the genesis block. This means that historical data can be retrieved very quickly, and with Chainstack you can set up an archive node in a breeze!

Archive nodes are a great tool for development, especially when you need to query past data. Examples include forking the mainnet with Hardhat, Ganache, or other frameworks that run a local simulated blockchain for testing, and building a blockchain explorer, analytics tools, or indexing with protocols like The Graph, because you have instant access to the full chain history. If you are building a DApp that only needs data from within the latest 128 blocks, a full node would suffice.

Have you already explored what you can achieve with Chainstack? Get started for free today.
