
Understanding blockchain latency and throughput

2022-11-17 18:30
Source: paradigm.xyz

Original article by Lefteris Kokoris-Kogias
Translated by ETH Chinese


Little is said about how to properly measure a blockchain system, yet it is the most important step in the process of designing and evaluating one. There are many consensus protocols, numerous performance variables, and tradeoffs around scalability.


Until now, however, there has been no reliable way for everyone to agree on a reasonable apples-to-apples comparison. In this article, we outline an approach inspired by how data systems are measured, and explore some common mistakes to avoid when evaluating a blockchain system.


Key metrics and how they interact


When developing blockchain systems, we should keep two important metrics in mind: latency and throughput.


The first thing users care about is transaction latency: the time between initiating a transaction or payment and receiving confirmation that it is valid (for example, confirmation that the transaction originator has enough money).


In traditional BFT systems (e.g., PBFT, Tendermint, Tusk, and Narwhal), a transaction is finalized as soon as it is confirmed, whereas under longest-chain consensus (e.g., Nakamoto consensus, Solana/Ethereum PoS), a transaction may be included in a block and then reorganized out of the chain. As a result, we have to wait until the transaction is "k blocks deep" before it is finalized, which leads to latency well beyond a single confirmation.
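To make the difference concrete, here is a minimal sketch (my illustration, not from the original article) of finality latency under the two designs. All timing values are placeholder assumptions, and conventions for counting "k blocks deep" vary slightly between chains:

```python
# Minimal sketch (illustrative assumptions) of the two finality models.

def bft_finality_latency(round_secs: float) -> float:
    """BFT-style: the transaction is final as soon as it is confirmed."""
    return round_secs

def longest_chain_finality_latency(block_secs: float, k: int) -> float:
    """Longest-chain: wait for the inclusion block plus k blocks on top."""
    return block_secs * (1 + k)

print(bft_finality_latency(2.0))                   # ~2 s to finality
print(longest_chain_finality_latency(12.0, k=32))  # ~396 s to finality
```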


Second, the throughput of a system is what generally matters to the system designer. It is the total load the system handles per unit of time, commonly expressed in transactions per second (TPS).


At first glance, these two key metrics appear to be opposite sides of the same coin: throughput is measured in transactions per second and latency in seconds, so it is natural to assume that throughput = load / latency.


But that is not the case. Many systems tend to produce graphs that show throughput or latency on the y-axis and the number of nodes on the x-axis, and this calculation cannot be carried out from such graphs. Instead, a better graph plots throughput against latency; because the relationship is non-linear, such a graph makes the system's behavior clear and easy to read.



When there is no contention, latency is constant and throughput can be varied simply by changing the load on the system. This happens because, in the low-contention case, the minimum overhead of sending a transaction is fixed and the queuing latency is zero: "whatever comes in goes straight out."


Under high contention, throughput is constant, but latency varies simply by changing the load.


This is because the system is already overloaded, and adding more load makes the waiting queues grow without bound. Even more perversely, the measured latency appears to depend on the length of the experiment, an artifact of the ever-growing queue.


These behaviors show up as the classic "hockey-stick" or "L-shaped" graph, depending on the inter-arrival distribution (discussed below). The key takeaway of this article, then, is that we should measure in the hot zone, where both throughput and latency affect the benchmark, rather than in the edge regions, where only one of them matters.
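To see the two regimes in one picture, here is a minimal sketch using a textbook M/M/1 queueing model; the model choice and the service rate MU (the system's assumed maximum throughput) are my illustrative assumptions, not the article's:

```python
# M/M/1-style sketch (illustrative assumption) of why latency is flat at low
# load and explodes near saturation -- the "hockey-stick" shape.

MU = 1000.0  # assumed maximum throughput (tx/s)

def expected_latency_s(offered_load_tps: float) -> float:
    """Mean time in system W = 1 / (mu - lambda) for an M/M/1 queue."""
    if offered_load_tps >= MU:
        return float("inf")  # past saturation the queue grows without bound
    return 1.0 / (MU - offered_load_tps)

for load in (100, 500, 900, 990, 999):
    print(f"load={load:4d} tx/s -> latency={expected_latency_s(load) * 1000:8.1f} ms")
```

At 100 TPS the latency is essentially the fixed service overhead; at 999 TPS the queue dominates. The hot zone worth measuring is the bend between the two.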



Measurement methodology


When conducting experiments, experimenters have three main design options:


Open loop vs. closed loop


There are two main ways to control the flow of requests to a target. An open-loop system is modeled on an unbounded number of clients (n = ∞) that send requests to the target at a given rate and inter-arrival distribution (for example, Poisson). A closed-loop system limits the number of outstanding requests at any given time. Whether a system is open- or closed-loop is a characteristic of a particular deployment; the same system can be deployed in different scenarios.


For example, a key-value store can serve thousands of application servers in an open-loop deployment, or only a few blocking clients in a closed-loop deployment.


Testing the right deployment scenario is essential: in a closed-loop system, latency is bounded by the number of potential outstanding requests, whereas an open-loop system can build up large waiting queues and, with them, much longer latencies. In general, blockchain protocols can be used by any number of clients, so it is more accurate to evaluate them in an open-loop environment.
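A minimal sketch of the two ways of driving load; `submit` is a hypothetical stand-in for whatever call sends a transaction to the system under test:

```python
import random
import threading
import time

def open_loop(submit, rate_tps: float, duration_s: float) -> None:
    """Open loop: effectively unbounded clients; requests arrive on a Poisson
    process at rate_tps regardless of how fast the system replies."""
    deadline = time.time() + duration_s
    while time.time() < deadline:
        time.sleep(random.expovariate(rate_tps))  # exponential inter-arrival gap
        threading.Thread(target=submit, daemon=True).start()

def closed_loop(submit, n_clients: int, duration_s: float) -> None:
    """Closed loop: exactly n_clients outstanding requests; each client waits
    for its reply before issuing the next request."""
    deadline = time.time() + duration_s

    def client() -> None:
        while time.time() < deadline:
            submit()  # blocks until the reply arrives

    workers = [threading.Thread(target=client) for _ in range(n_clients)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```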


Inter-arrival distributions for synthetic benchmarks


When creating a synthetic workload, we inevitably ask: how do we submit requests to the system? Many systems preload transactions before measuring, but this skews the measurements because the system starts from an artificial zero state. Additionally, preloaded requests are already in main memory, thereby bypassing the network stack.


A better approach is to send requests at a fixed rate (say, 1000 TPS). This results in an L-shaped graph (the orange line), because the system's capacity is used optimally.



However, open systems rarely behave so predictably. Instead, they see periods of high and low load. To model this, we can use a probabilistic inter-arrival distribution, generally based on the Poisson distribution. This produces the "hockey-stick" graph (the blue line), because Poisson bursts create some queuing latency even when the average rate stays below the optimum (maximum capacity). But this works to our advantage: we can see how the system handles high load and how quickly it recovers when the load returns to normal.
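The two submission schedules can be generated as follows; a minimal sketch, where `rate_tps` is the assumed average rate:

```python
import random

def fixed_rate_schedule(rate_tps: float, n: int) -> list[float]:
    """Send times at a constant rate (e.g., exactly 1000 TPS): the L-shape."""
    return [i / rate_tps for i in range(n)]

def poisson_schedule(rate_tps: float, n: int) -> list[float]:
    """Send times for a Poisson process with the same average rate: bursts of
    closely spaced arrivals create queuing, hence the hockey-stick."""
    t, times = 0.0, []
    for _ in range(n):
        t += random.expovariate(rate_tps)  # exponential inter-arrival gaps
        times.append(t)
    return times
```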


Warm-up phase


The final point to consider is when to start measuring. We want the pipeline to be full of transactions before measurement begins; otherwise, we will be measuring warm-up latency. Ideally, this is handled by measuring latency during the warm-up phase until the measurements follow the expected distribution.
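One possible heuristic for deciding when warm-up has ended (my own sketch; the article only prescribes measuring until the latencies follow the expected distribution):

```python
from statistics import mean

def warmup_cutoff(latencies: list[float], window: int = 100,
                  tolerance: float = 0.05) -> int:
    """Return the index after which the rolling mean latency changes by less
    than `tolerance` between consecutive windows -- treat that as warmed up."""
    for i in range(window, len(latencies) - window, window):
        prev = mean(latencies[i - window:i])
        curr = mean(latencies[i:i + window])
        if prev > 0 and abs(curr - prev) / prev < tolerance:
            return i
    return len(latencies)  # never stabilized: run the experiment longer
```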


How to compare


The final challenge is comparing the various deployments of a system fairly. Again, the difficulty is that latency and throughput are interdependent, so it is hard to generate a fair throughput/node-count graph.


The best approach is to define a service level objective (SLO) and measure the throughput the system achieves while meeting it, rather than simply pushing each system to its maximum throughput (where latency is meaningless). A good way to visualize this is to draw a horizontal line on the throughput/latency graph where the SLO intersects the latency axis, and sample the intersection points.
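Reading the intersection off measured data is straightforward; a minimal sketch, where the sample curve is entirely hypothetical:

```python
def throughput_under_slo(samples: list[tuple[float, float]],
                         slo_latency_s: float) -> float:
    """samples: (throughput_tps, latency_s) pairs measured at increasing load.
    Returns the highest throughput whose latency still meets the SLO."""
    feasible = [tps for tps, lat in samples if lat <= slo_latency_s]
    if not feasible:
        raise ValueError("the system never meets the SLO at any measured load")
    return max(feasible)

# Hypothetical measurements: latency is flat, then spikes past saturation.
curve = [(500, 0.8), (1000, 1.0), (1500, 1.6), (1800, 4.5), (1900, 30.0)]
print(throughput_under_slo(curve, slo_latency_s=5.0))  # -> 1800
```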



But what if I set the SLO at 5 seconds and it only takes 2 seconds?


One might be tempted to increase the load here to exploit the slightly higher throughput available past the saturation point. But that is dangerous. If the system is under-provisioned, an unexpected burst of requests can push it to full saturation, making latency spike and quickly violating the SLO. In essence, running past the saturation point is an unstable equilibrium.


So there are two things to consider:


1. Over-provision the system. Essentially, the system should run below the saturation point so that it can absorb bursts in the inter-arrival distribution without queuing latency growing.


2. If there is room below the SLO, increase the batch size. This increases the load on the system's critical path without increasing queuing latency, giving you the higher-throughput-for-higher-latency tradeoff you were looking for (see the toy sketch after this list).
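Here is the sketch referenced above: a toy model of the batching tradeoff, in which the fixed per-batch cost and per-transaction cost are invented for illustration:

```python
# Toy model (assumed costs): a larger batch amortizes the fixed per-batch
# consensus cost over more transactions -- more throughput, more latency.

BATCH_OVERHEAD_S = 0.5   # assumed fixed cost to commit one batch
PER_TX_COST_S = 0.0002   # assumed marginal cost per transaction

def batch_stats(batch_size: int) -> tuple[float, float]:
    latency_s = BATCH_OVERHEAD_S + batch_size * PER_TX_COST_S
    throughput_tps = batch_size / latency_s
    return throughput_tps, latency_s

for size in (100, 1000, 10000):
    tps, lat = batch_stats(size)
    print(f"batch={size:5d}: {tps:7.0f} TPS at {lat:.2f} s latency")
```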


I'm generating a huge load. How do I measure latency?


When the load on the system is high, trying to access the local clock and attach a timestamp to every transaction that reaches the system can skew the results.

Instead, there are two more viable options. The first and simplest is to sample transactions: for example, some transactions could contain a magic number for which the client keeps a timer. After the commit, anyone can inspect the blockchain to determine when these transactions were committed, and thus compute their latency. The main advantage of this approach is that it does not disturb the inter-arrival distribution. However, it may be considered "hacky", because some transactions must be modified.
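A minimal sketch of the sampling approach; `send_tx` and `commit_time_of` are hypothetical stand-ins for the client's submission call and a blockchain lookup, and the magic number is an arbitrary marker:

```python
import time

MAGIC = b"\xbe\xef\xca\xfe"  # hypothetical marker embedded in sampled txs

def send_sampled(send_tx, payload: bytes) -> tuple[bytes, float]:
    """Tag a sampled transaction and remember when it was sent."""
    tx = MAGIC + payload
    start = time.time()
    send_tx(tx)
    return tx, start

def sampled_latency(commit_time_of, tx: bytes, start: float) -> float:
    """After the run, look up on chain when the tagged tx was committed."""
    return commit_time_of(tx) - start
```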


A more systematic approach is to use two load generators. The first is the primary load generator, which follows the Poisson distribution. The second is a request generator used to measure latency, with a much lower load; think of it as a single client relative to the rest of the system. Even if the system sends a reply to every request (as some systems, such as key-value stores, do), we can simply drop all the replies at the load generator and measure latency only at the request generator.


The only tricky part is that the actual inter-arrival distribution is the sum of two random variables; however, the sum of two Poisson distributions is still a Poisson distribution, so the math is not hard :).
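For reference, the fact being used here is the superposition property of Poisson processes:

```latex
N_1(t) \sim \mathrm{Poisson}(\lambda_1 t), \quad
N_2(t) \sim \mathrm{Poisson}(\lambda_2 t)
\;\Longrightarrow\;
N_1(t) + N_2(t) \sim \mathrm{Poisson}\big((\lambda_1 + \lambda_2)\, t\big)
```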


Conclusion


Measuring large-scale distributed systems is critical to identifying bottlenecks and analyzing expected behavior under stress. Hopefully, by using the above methods, we can all take the first steps toward a common language that will ultimately make blockchain systems more applicable to the work they do and the promises they make to their end users.

In future work, we plan to apply this methodology to existing consensus mechanisms, so if you're interested, follow along on Twitter!


Thanks: all of this reflects lessons learned while designing and implementing Narwhal & Tusk (Best Paper Award @ EuroSys 2022) with my co-authors, as well as comments on earlier drafts from Marios Kogias, Joachim Neu, Georgios Konstantopoulos, and Dan Robinson.


Link to original article


