Is there a better solution for blockchain scaling?

22-11-10 10:56

Since the beginning of this year, Web3 projects built on blockchain technology have developed vigorously. Applications such as X-to-earn and digital collectibles have ignited the market and drawn large numbers of users and capital from traditional industries into Web3.


However, as transaction volumes in upper-layer applications gradually increase, these applications place ever more demanding performance requirements on the underlying blockchain system. Without strong underlying performance, upper-layer Web3 applications will quickly hit growth bottlenecks. This has once again raised the scaling problem that has long plagued the blockchain industry.


Indeed, scaling has been an almost timeless core issue since the birth of the blockchain industry. Yet after years of development, apart from compromise solutions that sacrifice decentralization for performance and off-chain scaling technologies represented by Rollup, the underlying performance of the blockchain still does not seem to have improved substantially.


So where are the main bottlenecks that limit the performance of the entire blockchain system, and is there room for further optimization? To answer these questions, we have to start with the basic characteristics of the blockchain system.


Characteristics of blockchain system


The fundamental purpose of a blockchain system is to provide users with a trusted collaboration platform, which is why the blockchain is also called a "trust machine". This basic goal means that a blockchain system must have the following characteristics.


1. Verifiable



First of all, trust is not created out of thin air. Users can trust the data stored in a blockchain not only because it cannot be tampered with, but more importantly because it can be verified efficiently. This is the verifiability of the blockchain system.


In typical blockchain systems such as Bitcoin and Ethereum, the transaction data stored on chain is organized in various "tree" structures. If you have tried to read the related technical documents, you will remember names such as Merkle tree or Merkle Patricia trie (readers do not need to understand these terms here; it is enough to recognize the names, and the schematic diagram is as follows). The blockchain generally stores data in this kind of "tree" structure precisely to meet the basic need for verifiable data.



In theory, any user could verify any piece of data by searching through all the data, but this approach is obviously extremely inefficient and cannot meet the needs of blockchain users who verify transactions frequently.


The advantage of the tree structure is that the system only needs to provide information about a few key nodes to prove that a transaction exists. As a loose analogy, a residential address also follows a tree structure: once a few key nodes such as the province, city, and district are confirmed, the validity of the address can be verified. Compared with a full traversal search, efficiency is greatly improved.
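The idea of proving a transaction with "a few key nodes" can be sketched as a Merkle inclusion proof. This is a minimal illustration, not the exact tree layout used by Bitcoin or Ethereum: the helper names (`build_levels`, `merkle_proof`, `verify`), the use of SHA-256, and the rule of pairing an odd node with itself are all assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption chosen for this sketch)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from leaf hashes up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last node with itself.
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect one sibling hash per level: the 'few key nodes'."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # node was paired with itself
        sibling_is_right = index % 2 == 0
        proof.append((level[sibling], sibling_is_right))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path from the leaf to the root using only the proof."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


txs = [f"tx{i}".encode() for i in range(8)]
levels = build_levels(txs)
root = levels[-1][0]
proof = merkle_proof(levels, 5)
```

With eight transactions, the proof for one of them contains only three sibling hashes, so the verifier never has to look at the other seven transactions directly.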


This kind of verifiable data structure has become one of the essential characteristics of a blockchain storage system.


2. Multiple versions



The blockchain is a living system, and the data stored in it changes constantly as users operate on it. From something as small as an individual account balance to something as large as the latest state of a smart contract, everything changes as the blockchain keeps extending.


The state of every one of these versions must be stored in full in the blockchain storage system. This is what we call "multi-version".


3. Continued growth



The historical data of a blockchain system keeps growing the longer it runs, which is known as state explosion in the public chain field. As the blockchain continues to package and store new transactions, the storage footprint of a full node grows rapidly.


At present, this is not very noticeable in Bitcoin, since each Bitcoin block basically stores only relatively simple transfer information. But a smart contract public chain like Ethereum also has to store smart contract code, so its historical data grows faster.


For high-performance public chains, and for consortium chains (also called alliance chains) with higher TPS than Ethereum, historical data expands even faster. An increase in storage volume inevitably reduces the operating efficiency of the whole system. The following is a schematic diagram of the impact of growing storage volume on blockchain performance.


It can be seen that both transaction performance (TPS) and storage performance decay rapidly as the blockchain continues to grow.



Why is simple data storage a big problem for blockchain systems?


From the performance decay diagram above, it can be seen that for many typical blockchain systems, once the storage scale exceeds a certain level, overall performance (TPS) degrades significantly.


Current Bitcoin and Ethereum users do not clearly feel this problem for two reasons. One is that performance is already limited by consensus algorithms such as PoW. The other is that their storage volume has not yet reached the performance bottleneck, so the decay effect has not yet become obvious.


However, just as the extreme challenge of Double Eleven pushed Alibaba to develop the most efficient settlement system in the world, the blockchain performance bottleneck caused by storage first appeared in the more efficient consortium chain field.


So can consortium chains solve this problem with traditional technology? No. As mentioned above, blockchain data storage must remain verifiable, among other characteristics, so the problem cannot be solved directly with traditional database technology.



Compared with traditional computer storage systems, blockchain storage systems face more challenges. The picture above shows the AntChain team's analysis of the main challenges in solving the storage bottleneck. The challenges of performance, cost, and scale are relatively easy to understand. Here we briefly explain the first two issues, "read and write amplification" and "data locality".


As mentioned above, the blockchain system uses a tree storage structure. A characteristic of tree storage is that the number of layers in the tree grows as the amount of stored data increases.


Imagine a startup team of only six or seven people. It may have just two levels, bosses and employees, and to find someone you only need to call out in the office. A large company like Alibaba, which manages hundreds of thousands of employees, may well need more than 20 internal levels. If you want to randomly verify the business ability of one employee in such a large company, the speed of searching and verifying will obviously drop sharply.


This loss of storage efficiency caused by the growing number of layers is the first challenge of the blockchain storage system mentioned in the figure above: read and write amplification.
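The effect of tree depth can be estimated with a simplified model. The sketch below only counts how many tree nodes must be re-hashed and re-stored when one leaf changes (the leaf plus every ancestor up to the root). The function name and the `fanout` parameter are assumptions for illustration, and the model ignores any extra amplification added by the underlying database.

```python
def nodes_rewritten_per_update(num_leaves: int, fanout: int = 2) -> int:
    """Count the leaf plus all ancestors that change when one leaf is updated."""
    depth = 0
    width = num_leaves
    while width > 1:
        width = -(-width // fanout)  # ceiling division: nodes on the next level up
        depth += 1
    return depth + 1


small = nodes_rewritten_per_update(8)            # a tiny tree
large = nodes_rewritten_per_update(1_000_000)    # a binary tree with a million leaves
wide = nodes_rewritten_per_update(1_000_000, fanout=16)
```

In this model, a single small change to one account touches 4 nodes in an eight-leaf binary tree but 21 nodes in a million-leaf binary tree. A wider fanout lowers the depth (6 nodes here), at the price of larger nodes and larger proofs.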


The second issue, "data locality", is also a storage problem unique to blockchain. To make blockchain data difficult to tamper with, all data on the chain is hashed, and the data and its hash are stored together. Hash values are effectively random.


But it is precisely this randomness that makes data storage and querying so difficult. If the words in a dictionary are arranged in pinyin order, the approximate location of a word can be found quickly, which improves search efficiency. If the words were instead arranged by the random hash value of each word, lookups would obviously become much harder.


This "data locality" problem caused by the randomness of hashes greatly increases the difficulty of disk storage and querying, and reduces the storage performance of the blockchain system.
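The dictionary analogy can be reproduced in a few lines. The sketch below lays out the same keys in two ways, sorted by the key itself and sorted by the key's hash, and counts how many logically consecutive keys end up as neighbours in the layout. The key format and the helper names are assumptions for illustration; real storage engines work with pages and files, not a Python list.

```python
import hashlib


def storage_order(keys, hashed):
    """Return the order in which keys would be laid out on disk."""
    if hashed:
        return sorted(keys, key=lambda k: hashlib.sha256(k.encode()).hexdigest())
    return sorted(keys)


def adjacent_pairs_preserved(keys, order):
    """Count logically consecutive keys that are also neighbours in the layout."""
    position = {k: i for i, k in enumerate(order)}
    return sum(
        1 for a, b in zip(keys, keys[1:]) if abs(position[a] - position[b]) == 1
    )


keys = [f"account-{i:04d}" for i in range(1000)]
ordered_hits = adjacent_pairs_preserved(keys, storage_order(keys, hashed=False))
hashed_hits = adjacent_pairs_preserved(keys, storage_order(keys, hashed=True))
```

With ordered keys, all 999 consecutive pairs stay next to each other, so a range read touches one contiguous region. With hashed keys, almost none do, so the same read turns into scattered random accesses.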


How does AntChain’s Letus system break through the blockchain storage bottleneck


In fact, many teams in both the consortium chain field and the public chain field have tried to improve on the problems above.


For example, Hyperledger in the consortium chain field chooses to weaken verifiability to a certain extent, but this approach only trades one feature off against another and does not fundamentally solve the problem. The AntChain team's research and development direction aims to improve storage efficiency without compromising key attributes such as security, verifiability, and data integrity.


So how does the new Letus system launched by AntChain solve these problems?


1. Merge state tree and database


We already know that the blockchain uses a tree-like storage structure to meet verifiability requirements, but the "tree" is in fact an abstract concept. What is actually saved to the hard disk is not a tree-shaped graph but a set of neatly arranged data. The various "trees" therefore have to be rearranged, organized by database software according to certain rules, and finally stored on the hard disk.


Let's take Ethereum's storage process as an example. The diagram below of the Ethereum data storage process is somewhat complicated, and readers do not need to study the details. It is enough to know that Ethereum storage also starts from various abstract "tree" structures (top), is converted into tidy data that conforms to the database format (bottom), and is then saved to the hard disk.



Yes, although we say that the blockchain is not a simple database, when blockchain data is finally stored on a node's hard disk, it still has to be organized and saved by database software.


To improve storage efficiency, AntChain's Letus system integrates the verifiable "tree" structure directly with the lower-level database system, which is equivalent to merging the three-level structure in the figure above into one level. In this way, the verifiable data structure and the underlying storage can be deeply optimized around each other's characteristics, improving the efficiency of both storage and verification.


2. Change the storage method


In the traditional blockchain tree structure, as soon as any data at the bottom (a leaf node) changes, the hash values of the nodes above it change as well and must be stored again.


As AntChain's trees gain more and more levels, read and write amplification becomes more severe. To avoid re-storing the hashes of every layer each time data is modified, and to reduce pressure on the storage system, Letus adopts a new idea: incremental storage. Changes are first kept as incremental indexes (as shown on the right side of the figure below).


This not only addresses the growth in the number of layers as data increases (read and write amplification), but also reduces how often tree node hashes have to be modified. As long as the verifiability of this incremental information is guaranteed, the storage performance of the system can be greatly improved.
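The general idea of incremental storage can be sketched as a base snapshot plus a list of small change records. This is not the Letus design, whose internals are not described here. The class `DeltaStore`, its methods, and the running hash chain used as a stand-in for "verifiable increments" are all assumptions made for illustration.

```python
import hashlib
import json


class DeltaStore:
    """Base snapshot plus append-only deltas (a hypothetical sketch)."""

    def __init__(self, base):
        self.base = dict(base)
        self.deltas = []  # list of (version, changes, digest)
        self.digest = self._hash(b"", self.base)

    @staticmethod
    def _hash(prev: bytes, changes: dict) -> bytes:
        # Chain each delta to the previous digest so history cannot be
        # silently rewritten (a simple stand-in for a real proof scheme).
        payload = json.dumps(changes, sort_keys=True).encode()
        return hashlib.sha256(prev + payload).digest()

    def commit(self, version: int, changes: dict) -> None:
        """Store only what changed instead of rewriting the whole state."""
        self.digest = self._hash(self.digest, changes)
        self.deltas.append((version, dict(changes), self.digest))

    def get(self, key, version=None):
        """Read the newest value at or before `version` (latest if None)."""
        for v, changes, _ in reversed(self.deltas):
            if (version is None or v <= version) and key in changes:
                return changes[key]
        return self.base.get(key)


store = DeltaStore({"alice": 10, "bob": 5})
store.commit(1, {"alice": 7})
store.commit(2, {"bob": 8})
```

Each commit writes one small record regardless of how large the full state is, and older versions stay readable. The trade-off in this sketch is that reads may have to walk several deltas, which a real system would bound by periodically merging them.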



For the data locality problem mentioned above, Letus builds an ordered index by using the (ordered) block number as the version, among other techniques, replacing the random hash-based index key with an index key based on the version number. On the one hand, data is written in block number order. On the other hand, data storage and queries all go through an ordered index (such as a B-tree), which avoids a random data layout and improves data locality.
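A version-ordered index can be sketched as follows. Again, this is an illustration of the general technique, not Letus itself: the class `VersionedIndex` and its methods are hypothetical, and a sorted Python list searched with `bisect` stands in for the B-tree mentioned in the text.

```python
import bisect
from collections import defaultdict


class VersionedIndex:
    """Append-only log ordered by block number (a hypothetical sketch)."""

    def __init__(self):
        self.log = []                      # (block_number, account, value)
        self.versions = defaultdict(list)  # account -> ascending block numbers
        self.offsets = defaultdict(list)   # account -> positions in self.log

    def put(self, block_number, account, value):
        """Writes arrive in block order, so the log stays sorted."""
        if self.log and block_number < self.log[-1][0]:
            raise ValueError("blocks must be written in ascending order")
        self.versions[account].append(block_number)
        self.offsets[account].append(len(self.log))
        self.log.append((block_number, account, value))

    def get(self, account, at_block):
        """Latest value of `account` at or before `at_block`."""
        versions = self.versions.get(account, [])
        i = bisect.bisect_right(versions, at_block)
        if i == 0:
            return None
        return self.log[self.offsets[account][i - 1]][2]

    def scan_blocks(self, first, last):
        """All writes in blocks first..last, read as one contiguous slice."""
        lo = bisect.bisect_left(self.log, (first,))
        hi = bisect.bisect_left(self.log, (last + 1,))
        return self.log[lo:hi]


idx = VersionedIndex()
idx.put(1, "alice", 10)
idx.put(1, "bob", 5)
idx.put(2, "alice", 7)
idx.put(4, "bob", 9)
```

Because the key starts with the block number, data from the same or neighbouring blocks sits together, and a range query such as `idx.scan_blocks(1, 2)` is a single sequential read rather than a series of random lookups.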


In addition, the Letus system implements many other technological innovations, such as intelligent hot and cold tiered storage and boundary-scan-based batch pruning, which reduce costs and extend the service life of the business. Due to space limitations, they are not covered here.


What can be learned from AntChain's storage solution?


Apart from off-chain scaling technologies (such as Rollup), scaling of the public chain itself has basically entered a bottleneck period. Most of the so-called high-performance public chains on the market do not fundamentally solve the performance problems of blockchains, but only make trade-offs among security, decentralization, and performance.


As a result, many high-performance public chain projects on the market obtain most of their performance gains by giving up decentralization. This is why it is difficult for Ethereum to learn from these high-performance public chain solutions. After all, Ethereum has always hoped to keep the hardware cost of a node no higher than 2,500 US dollars, so as to preserve the decentralization of the system as much as possible.


However, the AntChain Letus solution makes no trade-offs on security, verifiability, or even data integrity. This kind of innovation is actually more progressive for the entire industry.



As a practitioner who has studied public chains for a long time, I personally prefer a more open public chain ecosystem in terms of values, and I had indeed paid little attention to the technological progress of the consortium chain industry.


However, after learning about AntChain's new technological innovations, I suddenly discovered that the consortium chain industry has never been working behind closed doors. Although these teams seldom comment publicly on public chain technology, due to compliance or other considerations, they have never stopped studying the technological progress in the public chain field, and they are actively exploring better solutions in their own way.


These newly developed technologies are not only applicable to the consortium chain field, but also have strong reference value for the technological development of public chains as a whole. Although in recent years we have often joked that it is difficult for Web3 to happen in China, China's blockchain technology field has not really gone silent. Under the premise of compliance, technical teams represented by AntChain are still exploring and improving the underlying technology of the blockchain. In the near future, it cannot be ruled out that these technologies will feed back into the public chain industry.


Of course, it may also be that the consortium chain industry is so strongly business-to-business (to B) in character that it pays less attention to communicating with ordinary users (after all, there is no publicly issued token, so there is no need for hype). I hope that teams in the consortium chain field will disclose more information in the future, so that more people can learn about the latest progress of the best domestic technical teams in the blockchain field.


