Vana, the driving force behind the first data DAO in the AI era: a small Bittensor that defends user data rights

24-06-20 16:51

In the IPO prospectus it filed in February this year, Reddit disclosed that it had signed data licensing agreements with AI companies worth a total of $203 million. AI companies are willing to pay this much because data, like computing power, is an essential resource for developing AI models.

Sadly, none of this revenue flows to Reddit users, even though they are the actual creators of more than 1 billion posts and more than 16 billion comments on the platform.

The first data DAO, the first user-owned data network

Against this backdrop, the first data DAO, "r/datadao", was launched on April 4. It encourages users to export their Reddit data, upload it to a community-owned database, and vote collectively on licensing that data to AI companies and sharing the proceeds. Users also receive the governance token RDAT based on their data contributions.

Later, the media revealed that the promoter behind r/datadao was Vana, a startup that has raised $20 million from VCs such as Paradigm and Polychain Capital. On this news, RDAT soared more than 50-fold in 5 days, from an opening price of $0.011 to a high of $0.67. However, the token was later switched to an uncapped supply that can be increased at any time, and the price of RDAT plunged back to its starting point and has not recovered since.

Normally, the story would end here. When a token project goes this way, it is basically Game Over.

But attention has stayed fixed on r/datadao while overlooking what Vana, the company behind it, actually wants to do. r/datadao is just Vana's first attempt to take user data rights back from the giants. What it really wants is to build the first blockchain network of user-owned data. In this open network, users own and manage their data, as well as the intelligent products built from that data. Users gain ownership of AI models by contributing data, and value flows to users and independent model developers, not to centralized platforms.

Yes, Vana wants to build a new public chain of its own. On June 11, this chain launched its first testnet, the "Satori Testnet". The show of smashing the giants' rice bowls, it turns out, has only just begun.

The new public chain is actually a small Bittensor?

In fact, Vana is not entirely new, because in many ways it closely resembles Bittensor.

Bittensor is most praised for establishing a market mechanism for artificial intelligence development in which "multiple subnets compete with each other to improve the quality of digital commodity production."

Vana can be said to have borrowed Bittensor's entire mechanism to serve its own goal of "establishing an efficient data liquidity network."

To that end, Vana introduced the concept of the DLP, which is to Vana what a subnet is to Bittensor. To understand Bittensor, the most important thing is to understand "subnets." Similarly, to understand Vana, you must understand the "DLP."



DLP stands for "Data Liquidity Pool".

At the blockchain level, Vana is an EVM-compatible chain based on Proof of Stake consensus, and a DLP is in practice a smart contract deployed on the Vana network.

Data DAOs like r/datadao are concrete instances of DLPs. Builders on the Satori testnet are currently developing a variety of data DAOs, such as ChatGPT Data DAO, LinkedIn Data DAO, Twitter Data DAO, and GitHub Data DAO.

In the future, 16 DLP slots will be opened on the Vana mainnet. Holders of DAT, the native gas token of the Vana network, will vote to fill these slots from among the testnet DLPs, based on indicators such as the total number of transactions, transaction fees, verified data uploads, and the number of unique wallet interactions.

Just as subnets receive Bittensor's TAO emission rewards, DLPs receive Vana's DAT emission rewards. DLPs that are not selected can still earn emission rewards, just fewer than the 16 top DLPs above. New DLPs must also prove themselves by running for a period of time without emission rewards.

Users who submit data to a DLP are called "data contributors" and are rewarded with the DLP's own tokens according to the quality of the data they contribute, much as miners on a Bittensor subnet are rewarded with TAO for completing various tasks. Each DLP implements its own proof-of-contribution function based on its specific dataset. For example, r/datadao determines the value of contributed data by measuring the user's Karma and requires users to post a code in their Reddit profile to confirm ownership.
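To make the idea of a proof-of-contribution function concrete, here is a minimal sketch in the spirit of the r/datadao example. The function name, the ownership code format, and the logarithmic damping of Karma are all assumptions made for illustration; the article does not describe r/datadao's actual scoring formula.

```python
import math


def karma_contribution_score(karma: int, profile_text: str, ownership_code: str) -> float:
    """Hypothetical proof-of-contribution score for a Reddit data export.

    Returns 0.0 when the ownership code is missing from the profile,
    otherwise a score that grows with Karma.
    """
    # Ownership check: the contributor must have posted the code in their profile.
    if ownership_code not in profile_text:
        return 0.0
    # Assumption: damp large Karma values with a logarithm so that a few
    # very active accounts do not dominate the rewards.
    return math.log10(1 + max(karma, 0))


verified = karma_contribution_score(999, "about me: vana-1234", "vana-1234")
unverified = karma_contribution_score(999, "about me: nothing here", "vana-1234")
print(verified, unverified)
```

A different DLP would replace both the ownership check and the scoring rule with logic suited to its own dataset.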

There is an easily overlooked but important detail here: the rewards a DLP issues to users are not DAT, the native gas token of the Vana network, but the DLP's own governance token. In other words, Vana allows each DLP to create its own dataset-specific token, giving the DLP full control over the token economy of its pool. r/datadao, for instance, issues its own governance token RDAT to users and has full control over its supply. This is currently the biggest difference between Vana and Bittensor, and it is the issue I focus on at the end of this article.

Nagoya Consensus

User-submitted data must first be scored by validator nodes, which check it against the standards set by the DLP's creators. For this, Vana uses the Nagoya consensus, a fuzzy consensus mechanism similar to Bittensor's Yuma consensus: a group of validators jointly evaluates the quality of the submitted data, and a weighted average determines the final score.

In addition, validators will also score the scoring behavior of other validators. If a validator gives a high score to a file of poor quality, then other validators will give this node a low score.
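The two ideas above, a weighted average over validator scores and validators being judged on how they score, can be sketched as follows. This is an illustration only: the article does not say what the weights are (stake is assumed here), and the `agreement` measure is a simplified stand-in for validators scoring one another, not Vana's actual rule.

```python
from typing import Dict


def consensus_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the scores validators gave to one file."""
    total_weight = sum(weights[v] for v in scores)
    if total_weight <= 0:
        raise ValueError("total validator weight must be positive")
    return sum(scores[v] * weights[v] for v in scores) / total_weight


def agreement(scores: Dict[str, float], consensus: float) -> Dict[str, float]:
    """Closeness of each validator to consensus, assuming scores lie in [0, 1].

    1.0 means the validator matched the consensus exactly; lower values
    mean the validator deviated and would be rated poorly by its peers.
    """
    return {v: 1.0 - abs(s - consensus) for v, s in scores.items()}


# Hypothetical validators: two agree the file is good, one scores it very low.
scores = {"val_a": 0.9, "val_b": 0.8, "val_c": 0.1}
weights = {"val_a": 2.0, "val_b": 1.0, "val_c": 1.0}  # assumed stake weights

final = consensus_score(scores, weights)
peer_view = agreement(scores, final)
print(final, peer_view)
```

In this toy run the outlier `val_c` ends up with the lowest agreement value, which is the behavior the mechanism is meant to penalize.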

An epoch lasts 1800 blocks (about 3 hours). At the end of each epoch, the DLP contract distributes the emission rewards it has received to the validators according to their final scores. This mechanism both discourages behavior that deviates from the consensus majority and encourages validators to evaluate data contributions honestly.
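The epoch arithmetic and the payout step can be sketched in a few lines. The 1800-block epoch is from the text; the proportional split is an assumption about what "according to the final score" means, and the function names are hypothetical.

```python
from typing import Dict

BLOCKS_PER_EPOCH = 1800  # from the article

# "About 3 hours" per epoch implies a block time of roughly 6 seconds.
implied_block_seconds = 3 * 3600 / BLOCKS_PER_EPOCH


def epoch_of(block_height: int) -> int:
    """Index of the epoch that a given block height falls into."""
    return block_height // BLOCKS_PER_EPOCH


def split_emission(emission: float, final_scores: Dict[str, float]) -> Dict[str, float]:
    """Split one epoch's emission among validators in proportion to score."""
    total = sum(final_scores.values())
    if total <= 0:
        # Nobody earned a positive score this epoch, so nothing is paid out.
        return {v: 0.0 for v in final_scores}
    return {v: emission * s / total for v, s in final_scores.items()}


payout = split_emission(100.0, {"val_a": 3.0, "val_b": 1.0})
print(implied_block_seconds, epoch_of(1800), payout)
```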

All of the transactions above are checked for validity by propagation nodes and added to blocks on the Vana network for confirmation. Propagation nodes earn transaction fees and emission rewards, just as on other EVM-compatible chains based on proof-of-stake consensus.

Users self-host data

It is worth noting that although users submit personal data to the Data DAO, this data is not actually on the chain.

Unlike tokens, data is non-exclusive: once it is publicly available on-chain, it can be copied at will. To make data liquid, users must first be guaranteed control over their private data, and the data must not be usable multiple times without the owner's consent. In other words, the "double-spending problem" of data must be solved.

In this regard, Vana uses sophisticated and rigorous design to make the flow of user data like a carefully choreographed dance.

First, the data contributor encrypts the data with a symmetric key and stores the encrypted file in a personal cloud storage account such as Google Drive. The file's URL and unique identifier (ETag) are then recorded on the Vana blockchain together with the encryption key. Next, one validator is selected as the root validator, which coordinates the other validators in downloading, decrypting, and verifying the data file. Through the fuzzy consensus mechanism, the validators confirm the validity of the data and record the result on the blockchain, forming an index of valid files.
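The contribution flow described above can be sketched with the standard library alone. Everything here is illustrative: the XOR keystream is a toy stand-in for a real symmetric cipher, the SHA-256 digest stands in for the ETag that a storage provider would assign, the URL is a placeholder, and `FileRecord` is a hypothetical shape for the on-chain entry. The article does not describe how the key is protected once recorded, so the sketch simply stores it.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Tuple


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Not for production use."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical on-chain entry: where the file lives and how to open it."""
    url: str
    etag: str
    key: bytes


def contribute(data: bytes, url: str) -> Tuple[bytes, FileRecord]:
    """Encrypt locally, then build the record that would go on-chain."""
    key = secrets.token_bytes(32)
    ciphertext = toy_crypt(key, data)  # this is what gets uploaded to the cloud
    etag = hashlib.sha256(ciphertext).hexdigest()  # stand-in for a real ETag
    return ciphertext, FileRecord(url=url, etag=etag, key=key)


def validator_fetch(ciphertext: bytes, record: FileRecord) -> bytes:
    """A validator checks the stored file against the record, then decrypts."""
    if hashlib.sha256(ciphertext).hexdigest() != record.etag:
        raise ValueError("stored file no longer matches the recorded identifier")
    return toy_crypt(record.key, ciphertext)


stored, record = contribute(b"my exported posts", "https://storage.example/file-id")
print(validator_fetch(stored, record))
```

The identifier check is what lets validators notice that a file was swapped or altered after it was registered.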

When a data consumer submits an access request, the root validator has the validators download, decrypt, and aggregate the data again, and finally passes the verified data to the requester. Throughout the process, on-chain permission control ensures that only legitimate validators can decrypt and access the data, preventing unauthorized downloads and decryption.

Based on the solid data liquidity layer and blockchain layer, Vana has created an open application layer for data contributors and developers to collaborate. Developers can use the data liquidity accumulated by DLP to build applications, and the contributor community can create real economic value from their data.

The first step to realize the dTAO mechanism?

As mentioned above, the biggest difference between Vana and Bittensor is that Vana allows each DLP to have its own token economy.

I suspect most people start with the same doubts I had: why should each DLP create its own token? What if a DLP mismanages it, as with the sharp rise and fall of r/datadao's governance token RDAT? At first I saw no need for it.

After consulting the Vana team, I realized that Vana wants not only to adopt Bittensor's mechanism wholesale but also to respond more proactively to the challenges Bittensor currently faces. Letting each DLP have its own token economy is the core of Dynamic TAO (BIT001), the network upgrade that Bittensor has been promoting but on which it has made slow progress.

Bittensor's Dynamic TAO upgrade aims to hand the right to allocate TAO emissions, originally held by a few validators in the root network, to all TAO holders through a market-based dynamic pricing mechanism. For this to work, each subnet must first issue its own token (a dTAO token) and establish a liquidity pool consisting of the subnet token and TAO. TAO holders can then stake TAO into the liquidity pool of a given subnet to obtain that subnet's dTAO token.

Each subnet dTAO token has its own independent supply, and subnet validators use dTAO to participate in consensus and receive rewards. The price of each pool is determined by the ratio of TAO and dTAO reserves in it, reflecting the market demand for the subnet. Bittensor injects newly issued TAO into each pool in proportion to the dTAO pool price.

This changes the original TAO allocation method based on root network voting to allocation based on the price ratio of each dynamic TAO pool, allowing all TAO holders to "vote with their feet" through staking behavior to decide which subnets should receive more TAO rewards.
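The pricing and allocation rule described above reduces to two small formulas: a pool's price is its TAO reserve divided by its dTAO reserve, and each pool's share of new TAO is its price divided by the sum of all prices. A minimal sketch, with hypothetical subnet names and reserve figures:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Pool:
    tao_reserve: float
    dtao_reserve: float

    @property
    def price(self) -> float:
        """Price of the subnet token in TAO: ratio of the two reserves."""
        return self.tao_reserve / self.dtao_reserve


def emission_shares(pools: Dict[str, Pool]) -> Dict[str, float]:
    """Fraction of newly issued TAO each pool receives, proportional to price."""
    total_price = sum(p.price for p in pools.values())
    return {name: p.price / total_price for name, p in pools.items()}


# Illustrative numbers only.
pools = {
    "subnet_1": Pool(tao_reserve=1000.0, dtao_reserve=2000.0),  # price 0.5
    "subnet_2": Pool(tao_reserve=500.0, dtao_reserve=500.0),    # price 1.0
}
shares = emission_shares(pools)
print(shares)
```

Here the pool with the higher price receives twice the share of the other, even though it holds less TAO in absolute terms, because only the reserve ratio matters.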

Although Vana's current official documentation still says that the "root network" module is responsible for managing DLPs and allocating token rewards, Vana clearly wants to move away from this centralized governance mechanism before the mainnet launches, taking a braver step forward than Bittensor.

Why "a braver step forward than Bittensor"? Although having root network validators vote on the allocation is relatively centralized, those validators have an incentive to protect their reputations and tend to act prudently. Opening the decision to all token holders may attract a swarm of speculators and undermine the stability of the ecosystem. Because the price of each dTAO pool is determined entirely by market supply and demand, a large holder suddenly staking or unstaking a large amount can cause drastic price swings. Since TAO allocation follows the price ratio, the allocation swings with it, creating systemic risk.
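A small numeric sketch shows how sharply one large stake can move the allocation. It assumes a constant-product pool (TAO reserve times dTAO reserve stays fixed during a swap), which is an assumption made for illustration rather than something the article specifies; the pool sizes are made up.

```python
from typing import Dict, Tuple


def stake(tao_reserve: float, dtao_reserve: float, tao_in: float) -> Tuple[float, float]:
    """Stake TAO into a pool, assuming a constant-product rule (x * y = k).

    Returns the new (tao_reserve, dtao_reserve) after the staker's dTAO
    has been paid out of the pool.
    """
    k = tao_reserve * dtao_reserve
    new_tao = tao_reserve + tao_in
    new_dtao = k / new_tao
    return new_tao, new_dtao


def shares(prices: Dict[str, float]) -> Dict[str, float]:
    """Emission share of each pool, proportional to its price."""
    total = sum(prices.values())
    return {name: p / total for name, p in prices.items()}


# Two identical pools to start with: price 1.0 each, so a 50/50 split.
before = shares({"pool_a": 1000.0 / 1000.0, "pool_b": 1000.0 / 1000.0})

# A large holder stakes 1000 TAO into pool_a only.
tao_a, dtao_a = stake(1000.0, 1000.0, 1000.0)
after = shares({"pool_a": tao_a / dtao_a, "pool_b": 1000.0 / 1000.0})
print(before, after)
```

Under these assumptions, doubling the TAO in one pool quadruples its price, and that pool's emission share jumps from half to four fifths in a single transaction. Unstaking would swing it back just as abruptly.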

If Vana really does intend to adopt something like Bittensor's dTAO mechanism after the mainnet launches, I think it must prepare for the problems above in advance. Its onetime teacher, Bittensor, is still behind it on this front, so there is no one ahead to test the road and pave the way.
