Layer 1 parallel execution explained in detail, using new public chains such as Aptos as examples

2022-09-07 17:29

Parallel execution engines are a promising solution to increase the throughput of smart contract platforms

Original author: Mohamed Fouda, partner at Volt Capital

Original compilation: Shenchao TechFlow

Looking back at the evolution of blockchain technology, a strong trend is emerging: new L1s are focusing on parallel execution. The idea itself is not new; Solana already uses it in its Sealevel execution environment.

However, the explosive growth of DeFi and NFTs during the last bull market also made it clear that the technology is in dire need of improvement. In the next market cycle, several high-profile projects built around parallel execution are about to launch, including Aptos, Sui, Linera and Fuel.

This article will discuss the similarities and differences of these projects, as well as the challenges they face.

The Problem

Smart contract platforms make it possible to build a wide range of decentralized applications. To execute these applications, a shared computing engine is required. Every node in the network runs this engine, executing the applications and users' interactions with them. When nodes get the same result from execution, they reach consensus and move the chain forward.

The Ethereum Virtual Machine (EVM) is the dominant smart contract (SC) execution engine, with about 20 different implementations. Since it was invented, the EVM has built a critical mass of developer adoption. In addition to Ethereum and Ethereum's L2s, several other chains, including Polygon, BNB Smart Chain, and Avalanche C-Chain, adopt the EVM as their execution engine and focus on changing the consensus mechanism to improve network throughput.

A major limitation of the EVM is that it executes transactions sequentially. The EVM executes one transaction at a time, putting all other transactions on hold until that transaction has finished and the blockchain state has been updated. Even if two transactions are independent, for example a payment from Alice to Bob and another payment from Carol to Dave, the EVM cannot execute them in parallel. While this mode of execution allows for interesting use cases such as flash loans, it is neither efficient nor scalable.

This sequential execution is one of the main bottlenecks in network throughput. For one thing, the transactions in a block take longer to execute, which limits how short the block time can be. It also caps the number of transactions that can be added to a block, since nodes must be able to execute them all and confirm the block in time. The average throughput of Ethereum is about 17 tx/s. Because of this low throughput, during periods of high activity such as NFT mints, miners/validators cannot process all transactions, and the resulting bidding wars for priority execution drive transaction fees up. The average fee on Ethereum exceeded 0.2 ETH (~$800) at some point, which discouraged many users from using Ethereum.

The second problem with sequential execution is the inefficiency of network nodes. Sequential instruction execution cannot benefit from multiple processor cores, which leads to low hardware utilization. This hinders scalability and wastes energy.

Can parallel execution solve this problem?

The constraints of the EVM architecture have opened the door to a new class of L1s built around parallel execution (PE). Parallelism allows transaction processing to be divided among multiple processor cores, increasing hardware utilization and thus enabling better scalability. In high-throughput chains, adding hardware resources directly increases the number of transactions that can be executed. During periods of high activity, validator nodes can dedicate more cores to handle the additional transaction load. This dynamic scaling of computing resources allows the network to achieve higher throughput during periods of high demand, significantly improving the user experience.

Another advantage of this approach is lower transaction confirmation latency. Because node resources can scale dynamically, transactions can be confirmed with low latency under any network load. Transactions do not need to wait for tens or hundreds of blocks, nor do users need to pay excessive fees for priority confirmation. Shorter confirmation times also mean faster finality, opening the door to low-latency blockchains. Guaranteed low latency for executing transactions enables several previously impossible use cases.

Changing the on-chain execution model to allow PE is not a new idea, and several projects have explored it. One approach is to replace the account model used by the EVM with the Unspent Transaction Output (UTXO) model. The UTXO model used in Bitcoin allows transactions to be processed in parallel, which makes it ideal for payments. But because plain UTXO has limited functionality, it needs to be extended to support the complex interactions required by smart contracts. For example, Cardano uses an extended UTXO model for this purpose, while Findora uses a hybrid UTXO model, which implements both accounting models and allows users to move assets between the two.

Another approach to PE keeps the account model and instead focuses on changing how the chain state is structured and modified. Solana's Sealevel framework is an example.

How does parallel execution work?

Parallel execution works by identifying independent transactions and executing them concurrently. Two transactions are related (dependent) if the execution of one affects the execution of the other. For example, AMM transactions in the same pool are related and must be executed sequentially.

While the concept of parallel processing sounds simple, the difficulty lies in the details. The main challenge is how to efficiently identify "independent" transactions. Classifying transactions as independent requires an understanding of how each transaction changes the blockchain memory, or chain state. Transactions that interact with the same smart contract (such as an AMM pool) may modify the same contract state and therefore cannot be executed concurrently.
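The independence test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Tx` type, its `reads`/`writes` sets, and the location names such as `"pool:UNI/ETH"` are hypothetical and do not come from any particular chain. Two transactions are treated as related when one writes a location that the other reads or writes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tx:
    """Hypothetical transaction with the state locations it reads and writes."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def conflicts(a: Tx, b: Tx) -> bool:
    """True if either transaction writes a location the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & (a.reads | a.writes))


# Two plain payments between disjoint accounts: independent.
pay1 = Tx("alice->bob", frozenset({"alice", "bob"}), frozenset({"alice", "bob"}))
pay2 = Tx("carol->dave", frozenset({"carol", "dave"}), frozenset({"carol", "dave"}))

# Two swaps against the same (hypothetical) AMM pool: related.
swap1 = Tx("swap1", frozenset({"erin", "pool:UNI/ETH"}), frozenset({"erin", "pool:UNI/ETH"}))
swap2 = Tx("swap2", frozenset({"frank", "pool:UNI/ETH"}), frozenset({"frank", "pool:UNI/ETH"}))

print(conflicts(pay1, pay2))    # payments touch disjoint accounts
print(conflicts(swap1, swap2))  # both swaps write the same pool
```

The hard part in practice is obtaining the read and write sets, which is exactly where the projects discussed in this article differ.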

With the current degree of composability between applications, identifying whether transactions are related is a challenging task. Imagine an AMM transaction that swaps UNI for USDC, where the AMM finds that the most efficient route is UNI -> ETH -> DAI -> AAVE -> USDC. None of the pools involved in this route can process any other transaction until the swap has fully executed and the state of every participating pool has been updated.

Identifying Independent Transactions

This section compares the methods used by different parallel execution engines. The focus is on how they control access to state (memory). The blockchain state can be thought of as RAM, where each on-chain account or smart contract owns a range of memory locations that can be modified. Related transactions are those that attempt to change the same memory location in the same block. Different chains use different memory architectures and different mechanisms to identify these transactions.

Several chains in this category are built on technology developed by Facebook's defunct blockchain project Diem. The Diem team created the smart contract language Move specifically to improve SC execution. Aptos, Sui, and Linera are three high-profile projects that fall into this group. Besides this group, Fuel is another well-known project focused on PE, using its own SC language.

Aptos

Aptos builds on Diem's Move language and MoveVM to create a high-throughput chain that enables parallel execution. Aptos' approach is to detect dependencies transparently to users and developers, i.e. without requiring transactions to explicitly state which part of the state (which memory locations) they use.

Aptos uses a modified version of Software Transactional Memory (STM), called Block-STM. In Block-STM, transactions are pre-ordered within a block and divided among processor threads for execution. Execution is optimistic: every transaction is initially assumed to be independent of the others. The memory locations each transaction reads and modifies are recorded, and after execution the results of all transactions are validated. During validation, if a transaction is found to have accessed a memory location that was modified by an earlier transaction, the transaction is aborted: its results are discarded and it is re-executed. This process repeats until all transactions in the block have been executed. When multiple processor cores are used, Block-STM speeds up execution, with the gain depending on how interdependent the transactions are.
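The execute, validate, and re-execute loop can be illustrated with a heavily simplified Python sketch. This is not the Aptos implementation: the real Block-STM uses multi-version memory and a collaborative scheduler, whereas this sketch works in rounds and commits only the longest valid prefix of the block in each round. All names here (`View`, `transfer`, `run_block`) are hypothetical, and Python threads only illustrate the structure rather than delivering real multi-core speedups.

```python
from concurrent.futures import ThreadPoolExecutor


class View:
    """Records the locations a transaction reads and buffers its writes."""

    def __init__(self, snapshot):
        self._snapshot = snapshot
        self.reads = set()
        self.writes = {}

    def get(self, key):
        if key in self.writes:        # a transaction sees its own writes
            return self.writes[key]
        self.reads.add(key)
        return self._snapshot.get(key, 0)

    def set(self, key, value):
        self.writes[key] = value


def transfer(src, dst, amount):
    """Hypothetical payment transaction; does nothing if funds are insufficient."""
    def tx(view):
        if view.get(src) >= amount:
            view.set(src, view.get(src) - amount)
            view.set(dst, view.get(dst) + amount)
    return tx


def run_block(state, txs, workers=4):
    """Optimistically execute txs (in their preset order); returns (state, rounds)."""
    state = dict(state)
    pending = list(range(len(txs)))
    rounds = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            rounds += 1
            snapshot = dict(state)

            def speculate(i):
                view = View(snapshot)
                txs[i](view)
                return view

            # Speculative execution: every pending transaction assumes independence.
            views = list(pool.map(speculate, pending))

            # Validation in block order: stop at the first transaction that read a
            # location written by an earlier transaction committed in this round.
            dirty, committed = set(), 0
            for view in views:
                if view.reads & dirty:
                    break             # abort this one and everything after it
                state.update(view.writes)
                dirty.update(view.writes)
                committed += 1
            pending = pending[committed:]   # aborted transactions are re-executed
    return state, rounds


balances = {"alice": 10, "carol": 10}
independent = [transfer("alice", "bob", 5), transfer("carol", "dave", 3)]
dependent = [transfer("alice", "bob", 5), transfer("bob", "erin", 5)]
print(run_block(balances, independent))
print(run_block(balances, dependent))
```

With independent payments the whole block commits in one round; when the second payment spends what the first one delivers, it fails validation once and is re-executed, so the block needs a second round but ends in the same state as sequential execution.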

Results published by the Aptos team show that with 32 cores, Block-STM achieves about an 8x speedup on workloads where transactions are highly interdependent and about a 16x speedup on workloads with few dependencies. If all transactions in a block are interdependent, Block-STM incurs a slight performance penalty compared to sequential execution. Aptos claims that this approach can achieve a throughput of 160,000 TPS.

Sui

Another PE approach is to require transactions to explicitly declare which parts of the chain state they modify. This approach is currently used by Solana and Sui. Solana refers to units of memory as accounts, and a transaction must state which accounts it modifies. Sui uses a similar approach.

Sui also builds on Diem's technology by using MoveVM. However, Sui uses a different version of the Move language. Sui Move changes Diem's core storage model and asset permissions, which is a significant difference from Aptos, which uses core Diem Move. Sui Move defines a state storage model that makes it easier to identify independent transactions.

In Sui, state is stored as Objects. Objects typically represent assets and can be shared, meaning multiple users can modify the same object. Each Object has a unique ID in the Sui execution environment and holds an internal pointer to its owner's address. With these concepts, dependencies are easy to identify by checking whether transactions use the same Objects.
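When every transaction declares the objects it touches, grouping a block into parallel batches becomes a simple bookkeeping exercise. The sketch below is a hypothetical illustration, not Sui's or Solana's scheduler: the `schedule` function and the object IDs are made up, and every declared object is conservatively treated as writable. Each transaction is placed in the first batch after the last batch that touched any of its objects, so related transactions keep their original order.

```python
def schedule(txs):
    """Group (name, object_ids) pairs into batches that can each run in parallel."""
    batches = []           # batches[k] is a list of transaction names
    last_batch_for = {}    # object id -> index of the last batch that touches it
    for name, objects in txs:
        # Earliest batch that comes after every batch touching one of our objects.
        slot = 1 + max(
            (last_batch_for[o] for o in objects if o in last_batch_for),
            default=-1,
        )
        if slot == len(batches):
            batches.append([])
        batches[slot].append(name)
        for o in objects:
            last_batch_for[o] = slot
    return batches


block = [
    ("mint_nft", {"nft:1"}),
    ("pay", {"coin:A"}),
    ("swap1", {"pool:X", "coin:B"}),
    ("swap2", {"pool:X", "coin:C"}),   # shares pool:X with swap1
]
for i, batch in enumerate(schedule(block)):
    print(i, batch)
```

Here the first three transactions share no objects and land in the same batch, while `swap2` must wait for `swap1` because both declare the same pool object.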

Shifting the work of declaring dependencies to the developer makes the execution engine easier to implement, which in theory allows better performance and scalability. However, this comes at the cost of a less-than-ideal developer experience.

Sui's mainnet has not launched yet; its testnet went live only recently. The founders of Sui claim that combining parallel execution with the Narwhal and Tusk consensus mechanism yields a throughput of over 100,000 tx/s. If true, this would be a big jump over Solana's current throughput of roughly 2,400 tx/s and would exceed the throughput of Visa and Mastercard.

Linera

Linera is the newest entrant in the world of parallel execution and recently announced its first funding round, led by a16z. Few details about the project's implementation are available. However, based on the funding announcement post, we know it builds on the FastPay protocol, which was also developed at Facebook. FastPay is based on a technique called Byzantine Consistent Broadcast and focuses on accelerating independent payments, such as those that occur in point-of-sale networks. It allows a group of validators to ensure the integrity of a payment as long as more than two-thirds of the validators are honest. FastPay is a variant of the Real Time Gross Settlement (RTGS) systems used in networks between banks and financial institutions.
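The two-thirds honesty condition above typically shows up in code as a quorum check: a payment is accepted once strictly more than two-thirds of the validators have signed it. The sketch below is a minimal illustration under that assumption. The source does not spell out the acceptance rule, real protocols usually weight validators by voting power, and the `has_quorum` function and validator names are hypothetical.

```python
def has_quorum(validators, signers):
    """True if strictly more than 2/3 of the known validators have signed.

    Assumption: one validator, one vote. Unknown or duplicate signers are ignored.
    """
    distinct = set(signers) & set(validators)
    # Integer arithmetic avoids floating-point rounding at the 2/3 boundary.
    return 3 * len(distinct) > 2 * len(validators)


committee = ["v1", "v2", "v3", "v4"]
print(has_quorum(committee, ["v1", "v2", "v3"]))         # 3 of 4 signed
print(has_quorum(committee, ["v1", "v2", "v2", "x9"]))   # only 2 valid, distinct signers
```

Because each payment only needs its own set of signatures rather than a global ordering, independent payments can be confirmed in parallel.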

Building on FastPay, Linera plans to create a blockchain focused on fast settlement and low latency by executing payment transactions in parallel. It is worth noting that Sui also uses Byzantine Consistent Broadcast for simple payments. For other transactions, Sui uses its own consensus mechanism, Narwhal and Tusk, to efficiently handle more complex, interdependent transactions such as DeFi transactions.

Fuel

Fuel is focused on being the execution layer in a modular blockchain stack, which means that Fuel does not implement consensus or store blockchain data on the Fuel chain. To form a fully functional blockchain, Fuel relies on other chains, such as Ethereum or Celestia, for consensus and data availability.

Fuel uses UTXOs to create strict access lists, that is, lists that control access to a given piece of state. This model builds on the concept of canonical transaction ordering. In this scheme, the ordering of transactions within a block greatly simplifies detecting dependencies between transactions. To implement this architecture, Fuel built a new virtual machine called FuelVM and a new language called Sway.

FuelVM is a compatible and simplified counterpart of the EVM, which makes it easier for existing developers to join the Fuel ecosystem. Additionally, because of Fuel's focus on modular blockchains, the execution of Fuel smart contracts can be settled on Ethereum mainnet. This approach aligns with the vision of post-Merge Ethereum as a rollup-centric settlement and data availability layer. In this architecture, Fuel can execute batches with high throughput and settle them on Ethereum. To prove the concept, the Fuel team has created an AMM called SwaySwap, similar to Uniswap, and is running it on a testnet. The purpose is to demonstrate the higher performance of FuelVM compared to the EVM.

Challenges of the Parallel Execution Approach

The parallel execution approach seems logical and straightforward. However, it still faces several challenges. The first is estimating the actual proportion of transactions that can be accelerated by this type of parallel execution. The second is the decentralization of the network: if validators can easily scale computing power to increase throughput, how can full nodes keep up to verify the correctness of the chain?

Percentage of Parallel Transactions

Accurately estimating the percentage of transactions that can be executed in parallel on any chain is challenging. This percentage can also vary greatly from block to block, depending on the type of network activity. For example, an NFT mint can cause a burst of activity with a high percentage of related transactions. That said, we can use some assumptions to get a rough estimate of the average percentage of transactions that can be parallelized. For example, we can assume that most ETH and ERC20 transfers are independent, i.e. sent from and received by different addresses, and that about 25% of ETH and ERC20 transfers are related, e.g. deposits to smart contracts and consolidation of assets from exchange hot wallets to cold wallets.

On the other hand, all AMM transactions in the same pool are related. Given that most AMMs are dominated by a small number of pools, and that AMM transactions are highly composable and interact with multiple pools, we can safely assume that at least 50% of AMM transactions are related. Looking at transaction categories on Ethereum, of the approximately 1.2 million transactions per day, 20-30% are ETH transfers, 10-20% are stablecoin transfers, 10-15% are DEX trades, 4-6% are NFT transactions, 8-10% are ERC20 approvals, and 12-15% are other ERC20 transfers. Using these numbers and assumptions, we can estimate that PE can accelerate approximately 70-80% of transactions on a smart contract platform. This means that related transactions, which must be executed sequentially, account for 20-30% of all transactions. In other words, with the same gas limit, PE could deliver a 3x-5x increase in throughput. Some experiments with parallel-execution EVMs show similar estimates, with 3-5x throughput improvements achieved consistently.
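One way to arrive at the 3x-5x figure is an Amdahl's-law style bound. The article does not name this formula, so treat it as an assumed reading of the estimate: if a fraction s of the work must run sequentially and the rest is spread evenly over n cores, the speedup is 1 / (s + (1 - s) / n), which approaches 1 / s as n grows. The `speedup` function below is a hypothetical helper for plugging in the 20-30% sequential share.

```python
def speedup(serial_fraction, cores):
    """Amdahl's-law speedup for a workload with the given sequential share."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


for s in (0.2, 0.3):
    for n in (8, 32, 1_000_000):   # the last value approximates "unlimited" cores
        print(f"serial={s:.0%} cores={n}: {speedup(s, n):.2f}x")
```

With unlimited cores the bound is 1 / 0.3 (about 3.3x) to 1 / 0.2 (5x), which matches the range quoted above. With a realistic core count the achievable speedup is somewhat lower than that bound.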

In practice, high-throughput chains use higher gas limits and shorter block times to achieve at least 100x throughput improvements over Ethereum. The increased throughput requires powerful validating nodes to process these blocks, a requirement that leads to the second challenge, centralization of the network.

Centralization of the network

A high-throughput network can process tens of thousands of transactions per second. Validators are incentivized by fees and network rewards to process these transactions, so they invest in dedicated servers or scalable cloud infrastructure. The same is not true for companies or individuals who use the chain and need to run full nodes to interact with it. These entities cannot afford high-end servers capable of handling such massive transaction loads. This will push on-chain users to rely on specialized RPC node providers, such as Infura, leading to more centralization.

Without the option of running a full node on consumer-grade hardware, a high-throughput chain may become a closed system in which a small number of entities hold absolute power over the network. In that case, these entities could coordinate to censor transactions, entities, and even applications such as Tornado Cash, turning these chains into permissioned systems not unlike Web2.

Currently, the requirements to operate a full node on the Sui testnet are lower than those for an Aptos testnet node. However, we expect these requirements to change significantly when the mainnets launch and applications start appearing on-chain.

Advocates of decentralization have been proposing solutions to these anticipated problems. These solutions include using light nodes to verify the correctness of blocks by using ZK validity proofs or fraud proofs. The Fuel team is active in this regard, in line with the spirit of the Ethereum community on the importance of decentralization. It is not clear whether the Aptos and Sui teams are prioritizing implementing these methods or otherwise promoting decentralization. The Linera team briefly discussed these issues in their introductory post, but the protocol implementation has yet to confirm this commitment.

Summary

Parallel execution engines are a promising solution for improving the throughput of smart contract platforms. Combined with innovations in consensus mechanisms, parallel execution of transactions can bring a chain's throughput close to or above 100,000 TPS, which is comparable to Visa and Mastercard, and can enable several of today's most challenging use cases, such as fully on-chain games and decentralized micropayments.

These impressive throughput improvements are not without challenges, chiefly how to preserve decentralization. We look forward to seeing founders work on solving these problems.