[2023] Guide to Web3 Data Tools
Original article by Andrew Hong
Kxp, BlockBeats
Every tool, whether it started out as an indexer, explorer, rater, or query engine, now includes some set of raw and aggregate queries exposed through an API. Because of this, I no longer think of data tools as a technology stack, but as a function stack.
This shows that all tools have different "use cases" tailored to the analyst's usage needs, thus forming a stack. I've listed the top tools I know in the chart below, and I'll briefly cover each of them by function.
A few notes:
1. I currently have a full-time job at Dune.
2. The icon indicates that this part of the product is still in beta, so you need to request permission from the team.
3. In this article, I'm not talking about pure API products, because everyone now has an API that serves some combination of raw, decoded, and aggregated data. Therefore, I cannot cover all of these products. If you want to see the trend of enhanced APIs, please see this article.
4. Query-and-dashboard providers like Dune can support all of this. However, query-based dashboards are not yet comparable to the user experience (feature loops and low latency) of some dedicated applications, so I didn't include them.
I start by using several of the tools mentioned in this section to research a topic and find trends that I want to explore further. Finding the right problems and analyzing them is one of the hardest jobs in Web3, and using these tools will make the whole process much easier.
Over the past year, we have evolved from pure on-chain data dashboards or simple ranking tables to comprehensive discovery products with enhanced search and metadata. Anyone can explore and share data more easily than ever before -- citizen journalism has developed well.
Protocol/Domain: To understand how a protocol (or domain) performs against a wide range of metrics, you can refer to the following tools:
1. Dune: Find niche dashboards for ERC-20 tokens, NFTs, staking, airdrops, auctions, and more. It has a native search function with rankings for dashboards, wizards, and queries.
2. Flipside: Discover data science writing and research, such as liquidity pool PnL and wallet rankings. They are currently testing newer dashboard features.
3. DefiLlama: Shows high-level trends in TVL, volume, and yield. You can find some great examples in their review of 2022.
4. Messari: Its protocol/pool interface is a good alternative to DefiLlama and is growing fast.
5. Token Terminal: Makes traditional analysis of revenue and expenses for a protocol (dapp) or chain much simpler.
6. Artemis: Has the best charts for developer and social activity by chain, and includes an overview of chain metrics. It integrates well with Google Sheets.
Token: While you could use custom query dashboards to dig deeper into tokens, a more optimized product interface makes discovery and comparison faster and easier.
1. Nansen: Token God Mode and the DeFi/NFT Paradise pages provide insights into distribution, transactions, and wallet trends.
2. DefiLlama: Lets you closely analyze the number of token pairs and the depth of liquidity.
3. Parsec: Features Bloomberg-style token analysis and a customizable terminal.
DAO/Governance: As protocol changes and upgrades become more frequent, governance becomes more important for tracking trends, especially for proposals that involve investments or spending from the treasury.
1. Messari: Gives people more insight into each proposal by marking discussion, importance, governance process, and subDAOs in the governance explorer.
2. Tally: Its front page shows how voting power has changed over time, and the platform allows people to query different DAOs based on the characteristics of the chain.
3. Boardroom: It has a stream that filters all DAO proposals, but I wish it had more filters on it (for example, proposals worth more than $20K).
4. Agora: Currently only used for Nouns, but it is useful for learning about voters and who they represent.
Once you've found some interesting trends, you may want to figure out what the common transactions look like and what the wallet/address is behind all of them.
In this step, you need to be more careful to make sure you don't miss any background clues that will help you analyze -- this is how you end up with 20 different tabs open.
Transaction display: The transaction in question can be anything from a simple ETH transfer to a more complex leveraged Curve stETH/ETH vault position. In the interface you can see the transaction's contents, logs, contract code, and state data.
1. Blocksec/Phalcon: Great for studying traces and logs in transactions; it clearly shows the order in which all internal calls and events occurred. Its latest update adds code snippets and fund inflow/outflow charts.
2. Tenderly: A multi-chain explorer with the fastest path from traces to code snippet inspection, among other features. It has the smoothest user experience/user interface of all, although it could do better with event log tracing and snippets.
3. Blocknative: A specialized explorer for the mempool (the queue of pending transactions), where you can listen in real time to the transactions submitted to any contract.
Address/wallet display: For these tools, the key is how easily you can search, cross-analyze, combine, and share wallet balances and activity. While there are a number of "portfolio" style apps out there, such as zapper.fi and rotki, these are not good for data analysis, so they are not included.
1. Nansen: Offers enhanced labeling features, as well as a wallet analyzer that pairs well with their token analysis pages.
2. Zerion and DeBank: The best free alternatives to Nansen for analyzing wallet bundles across multiple chains.
3. Warden: Allows DeFi-level analysis of wallets (liquidations, borrowing, etc.) in an easily searchable format. It currently supports Aave, and will soon support Euler, Notional, and Compound. I suspect this tool will have its own category next year, as it starts the DeFi analytics explorer trend.
4. Bubblemaps Helps identify interesting wallet clusters and connections.
Hybrid (transaction + address/wallet): These are all-in-one solutions built for deep exploration, and they include some of the most powerful general-purpose products in the field. The next Google is likely to emerge from here.
1. Etherscan Good for quickly getting an overview of a single address. I usually use it for tagging and to quickly check balances and historical transfers for a particular Token (for a particular wallet). It also has a basic but easy-to-understand overview of the transaction.
2. Onceupon.gg: A boon for analysts. It supports mini windows in a horizontal scroller, as well as quick filtering of all transactions by method and source (including traces), and by specific tokens and entities. Each transaction is labeled and described in a human-readable way before you click through. Wallet and window groups can be easily shared and understood. Best of all, all addresses have a "neighbor" tag for quick identification of top counterparties, and its graph network visualization will assist you in your analysis.
3. Arkham: The most network- and counterparty-centric analysis tool. You can easily check the counterparties/exchanges and balances between individuals/groups and break down the inflows/outflows of funds by transaction. Its graph networking tools make ZachXBT-style sleuthing easier than ever: you can adjust the timescale to see the relationships of a particular wallet over a particular period of time.
Self-hosted: These options allow you to run an open source block explorer locally; they are also hybrid solutions.
1. Otterscan: The basic functionality is similar to Etherscan.
2. Trueblocks Explorer: Perhaps the fastest account history scraping tool; its core package is covered in a later section.
All of these SQL engines are cloud-based, so you can use an IDE in your browser to query raw and aggregated data (e.g., nft.trades/dex.trades). These engines also support some powerful table features, such as NFT wash trading filters.
We can use these tables to analyze protocols, communities, and tokens in depth, and some engines have more specialized tables that I'd like to explore.
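To make the idea of a wash trading filter concrete, here is a minimal Python sketch of the kind of check such a filter might perform. It is not the logic of any engine named in this article: the `Trade` record, its field names, and the two rules (self-trades and immediate round trips of the same NFT) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    # Hypothetical, simplified trade record; real tables have many more columns.
    token_id: str
    seller: str
    buyer: str
    price_eth: float


def flag_wash_trades(trades):
    """Return one boolean per trade, True when the trade looks suspicious.

    Two illustrative rules (assumptions, not an engine's real filter):
      1. Self-trade: buyer and seller are the same address.
      2. Round trip: the same NFT earlier moved from this buyer to this seller.
         Only the return leg is flagged, not the first leg.
    """
    seen = set()  # (token_id, seller, buyer) tuples already observed
    flags = []
    for t in trades:
        self_trade = t.buyer == t.seller
        round_trip = (t.token_id, t.buyer, t.seller) in seen
        flags.append(self_trade or round_trip)
        seen.add((t.token_id, t.seller, t.buyer))
    return flags


sample_trades = [
    Trade(token_id="1", seller="0xA", buyer="0xB", price_eth=1.0),  # normal sale
    Trade(token_id="1", seller="0xB", buyer="0xA", price_eth=1.1),  # sold straight back
    Trade(token_id="2", seller="0xC", buyer="0xC", price_eth=5.0),  # self-trade
    Trade(token_id="3", seller="0xD", buyer="0xE", price_eth=0.2),  # normal sale
]
sample_flags = flag_wash_trades(sample_trades)
```

In a SQL engine the same idea would typically be a boolean column or a WHERE clause on the trades table; the point of a built-in filter is that you do not have to rebuild rules like these yourself.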
Free SQL: These engines are free to access.
1. Transpose (25-second free limit; the maximum query time on the top plan is 10 minutes).
2. Flipside (15-minute free limit, no paid plans). It is currently the only tool with storage/state tables, provided by TokenFlow.
3. Footprint (20-minute free limit, scalable compute on a paid plan). It's the only GameFi-centric aggregation tool I've seen.
4. Dune (30-minute free limit, paid plans to scale up compute). It is the only tool with decoded tables, so you don't have to figure out raw hexadecimal/byte conversion, function/topic signature filtering, or proxy/factory patterns.
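To show what raw hexadecimal conversion and signature filtering involve, here is a minimal sketch that decodes one raw ERC-20 `Transfer` event log by hand, using only the Python standard library. The log dictionary shape follows the usual JSON-RPC log format (`topics` and `data` as hex strings); the sample addresses and amount are made up for illustration.

```python
# keccak256("Transfer(address,address,uint256)"), the topic0 of the ERC-20 Transfer event
TRANSFER_TOPIC0 = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic):
    """An indexed address is left-padded to 32 bytes; keep the last 20 bytes (40 hex chars)."""
    return "0x" + topic[-40:]


def decode_erc20_transfer(log):
    """Decode a raw log into from/to/value, or return None if it is not an ERC-20 Transfer.

    ERC-721 uses the same event signature but indexes the token id as well,
    which gives 4 topics, so requiring exactly 3 topics filters those out.
    """
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC0:
        return None
    return {
        "from": topic_to_address(topics[1]),
        "to": topic_to_address(topics[2]),
        "value": int(log["data"], 16),  # unindexed uint256 amount, in the token's base units
    }


# Made-up sample log: 1000 base units sent from 0xabab... to 0xcdcd...
sample_log = {
    "topics": [
        TRANSFER_TOPIC0,
        "0x" + "00" * 12 + "ab" * 20,
        "0x" + "00" * 12 + "cd" * 20,
    ],
    "data": "0x" + format(1000, "064x"),
}
decoded = decode_erc20_transfer(sample_log)
```

A decoded table gives you the `from`, `to`, and `value` columns directly, for every contract and event, without writing this kind of code per signature.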
Paid SQL: These engines can only be accessed upon payment and approval.
1. Nansen: Currently testing their new query engine, where you can finally query their entity/address label tables.
2. Goldsky: Allows you to migrate subgraphs from The Graph (or create new subgraphs) on their hosting service. You can easily combine subgraphs, which solves one of The Graph's major flaws (albeit with the loss of decentralization).
3. Covalent Increment: Has one huge table where all raw data can be queried; more recently, aggregated data has been added.
Self-hosted: These are code packages that allow you to quickly get raw, decoded, and aggregated data (including RPC providers for specific chains). The advantage of this is that if you run a more efficient client (especially Erigon), you can explore more data faster. Note that Erigon takes a week to sync and requires two terabytes of space on your hard drive, but it may get faster in the future.
Python:
1. web3.py: One of the first software packages to make it easy to use contract ABIs and node providers (clients) on Ethereum.
2. ethereum-etl: Used by Nansen and Google BigQuery for on-chain fetching over a specific block range; the results can be easily exported to different databases or file types.
3. checkthechain: The only package with domain aggregation (such as Uniswap pools) on top of the basic functionality of web3.py, tailored specifically for data scientists.
4. Ape: Has become the Hardhat-style solution for Vyper.
5. mev-inspect-py: Query MEV data by block (i.e. miner payments/profits, swaps, arbitrage, etc.). This Flashbots data can also be found in Dune.
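Fetching on-chain data over a specific block range usually means splitting the range into batches and issuing one request per batch. The sketch below builds standard `eth_getLogs` JSON-RPC payloads without sending them. It is not how ethereum-etl or any package above is implemented; the helper names, the batch size, and the contract address are assumptions for illustration.

```python
import json


def block_batches(start_block, end_block, batch_size):
    """Yield inclusive (from_block, to_block) pairs that cover start_block..end_block."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    current = start_block
    while current <= end_block:
        batch_end = min(current + batch_size - 1, end_block)
        yield current, batch_end
        current = batch_end + 1


def get_logs_request(from_block, to_block, address, request_id=1):
    """Build an eth_getLogs JSON-RPC payload; block numbers are hex-encoded quantities."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [
            {
                "fromBlock": hex(from_block),
                "toBlock": hex(to_block),
                "address": address,
            }
        ],
    }


# Hypothetical contract address and block range, for illustration only.
CONTRACT = "0x" + "11" * 20
batches = list(block_batches(100, 250, 100))
payloads = [
    get_logs_request(start, end, CONTRACT, request_id=i)
    for i, (start, end) in enumerate(batches, start=1)
]
first_payload_json = json.dumps(payloads[0])
```

Each payload would then be POSTed to a node's RPC endpoint. Keeping batches small matters because many providers cap the block span or the number of logs returned per call.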
Golang:
1. trueblocks-core: This tool can get the transaction history of any address very quickly, as shown in the explorer above.
Javascript/Typescript:
1. web3.js: The original Web3 script package. Many of its methods/patterns now seem overly complex compared to ethers.
2. ethers.js/ts: A more efficient/streamlined Web3 script package widely used in front ends and in smart contract test suites such as Truffle and Hardhat.
Rust:
1. ethers-rs: It was created primarily to work with Foundry (analogous to how ethers.js pairs with Hardhat).
Raw data is good, but to get better metrics, you need to standardize and aggregate data across different contracts. Once you've aggregated, you can create new metrics and tags to make your analysis more efficient.
The community at this level has the deepest mix of Web3 domain, technical, and background knowledge.
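As a small illustration of standardizing and aggregating across contracts, the sketch below maps trade rows from two differently shaped sources onto one schema and then computes a daily volume metric. The two source formats, their field names, and the `dex_a`/`dex_b` labels are hypothetical; real aggregation models are usually written in SQL and cover many more edge cases.

```python
from collections import defaultdict


def normalize_dex_a(row):
    # Hypothetical source A: already reports a date string and a USD amount.
    return {"protocol": "dex_a", "day": row["date"], "usd": row["amount_usd"]}


def normalize_dex_b(row):
    # Hypothetical source B: ISO timestamp and an amount in cents.
    return {
        "protocol": "dex_b",
        "day": row["timestamp"][:10],  # keep only YYYY-MM-DD
        "usd": row["amount_cents"] / 100,
    }


def daily_volume(trades):
    """Sum USD volume per day across all normalized trades."""
    totals = defaultdict(float)
    for trade in trades:
        totals[trade["day"]] += trade["usd"]
    return dict(totals)


raw_a = [{"date": "2023-01-01", "amount_usd": 150.0}]
raw_b = [
    {"timestamp": "2023-01-01T12:30:00", "amount_cents": 25000},
    {"timestamp": "2023-01-02T08:00:00", "amount_cents": 10000},
]
unified = [normalize_dex_a(r) for r in raw_a] + [normalize_dex_b(r) for r in raw_b]
volume_by_day = daily_volume(unified)
```

Once every source is mapped to the same columns, new metrics and labels can be defined once and reused across protocols.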
Collaborative: All view definitions are open source, and you can contribute your own code or collaborate with other analysts.
1. Dune: Allows you to create any model you need, with all the chains in one repo. It currently has 300 contributors (community members + team members).
2. Flipside: Also allows you to create any model you need, but each chain has its own repo.
3. DefiLlama: Allows any contribution to a predefined set of metrics, which are typically built on a subgraph.
4. The Graph: Allows anyone to create a GraphQL schema and mappings, but only within a specific network, typically one protocol (that is, a specific set of contracts) at a time. Messari builds on them in a more organized way (similar to the Dune/Flipside/DefiLlama repos).
5. Goldsky: Basically The Graph, but only for private/paying customers and written in SQL.
Some of the query providers in the previous section have defined their own data aggregations, but these aggregations are not open source and you cannot add data to them.
Overall, Web3 data tools are becoming clearer, more trusted, more social, and more collaborative. The development of information systems allows us all to query and define better metrics faster, improving what we can discover and explore in our products. The whole ecosystem is one big flywheel, and the Web3 community will eventually come together to let the data flow.
I'm looking forward to seeing where we go next year and I hope to get as many new faces together as possible. If you're building a product I haven't covered here, but you'd like to show it off and get feedback, feel free to email me. If you feel your product has been misinterpreted, or have a better key feature you'd like to highlight, please contact me and I'll consider editing it back in.
This is just the first part of my guide to 2023. The rest of the series will be more professional and will contain plenty of technical language.