Blockchain reward sharing - a comparative systematization from first principles
Navigating the diverse landscape of reward-sharing schemes and the choices we have made in the design of Cardano’s reward-sharing scheme
30 November 2020 Prof Aggelos Kiayias 10 mins read
In the previous article, we identified the objectives of the reward scheme in Cardano, and we gave general guidelines regarding engaging with the system.
Taking a more high-level view, we will examine from first principles, the general problem of reward sharing in blockchain systems. To recall, the two overarching objectives of any resource-based consensus system is to incentivize the following.
High engagement. Resource-based consensus protocols are more secure the more resources are engaged with protocol maintenance. The problem, of course, is that the underlying resources are useful for a wide variety of other things too (e.g., electricity and computational power in the case of proof of work, or stake for engaging in decentralized apps in the case of proof of stake), so resource holders should be incentivized to commit resources for protocol maintenance.
Low leverage: leverage relates to decentralization. Take a group of 10 people; if there is a leader and the group follows the leader’s wishes all the time, the leader’s leverage is 10 while everyone else’s is zero. If, on the other hand, everyone’s opinion matters the same, everyone’s leverage is 1. These are two extremes, but it should be fairly obvious what types of leverage align better with decentralization. From an economic viewpoint, however, a “benevolent dictatorship” is always more efficient; as a result, decentralization will come at a cost (exactly as democracy does), and hence it has to be also properly incentivized.
Given the above objectives, let us now examine some approaches that have been considered in consensus systems and systematize them in terms of how they address the above objectives. An important first categorization we will introduce is between unimodal and multimodal reward schemes.
In a unimodal scheme, there is only one way to engage in the consensus protocol with your resources. We examine two sub-categories of unimodal schemes.
- Linear Unimodal
This is the simplest approach and is followed by many systems, notably Bitcoin; the original proof-of-work based Ethereum, as well as Algorand. The idea is simple: if an entity commands x% resources, then the system will attempt to provide x% of the rewards – at least in expectation. This might seem fair—until one observes the serious downsides that come with it.
First, consider that someone has x% of resources and that x% of the rewards in expectation are below their individual cost to operate as a node. Then, they will either not engage (lowering the engagement rate of the system), or, more likely, actively seek others to combine resources and create a node. Even if there are two resource holders with x% of resources each and a viable individual cost c when running as separate nodes, they will fare better by combining resources into a single node of 2x% resources because the resulting cost will be typically less than 2c. This can result in a strong trend to centralize, and lead to high leverage since the combined pool of resources will be (typically) run by one entity.
In practice, a single dictatorially operated node is unlikely to emerge. This is due to various reasons such as friction in coordination between parties, fear of the potential drop in the exchange rate of the system’s underlying token if the centralization trend becomes noticeable, as well as the occasional use of complex protocols to jointly run pools. Even so, it is clear that unimodal linear rewards can hurt decentralization.
One does not need to go much further than looking at Bitcoin and its current, fairly centralized, mining pool lineup. It is worth noting that if stake (rather than hashing power) is used as a resource, the centralization pressure will be less – since the expenditure to operate a node is smaller. But the same problems apply in principle.
An additional disadvantage of the above setting is that the ensuing “off-chain” resource pooling that occurs will be completely opaque from the ledger perspective, and hence more difficult for the community to monitor and react to. In summary, the linear unimodal approach has the advantage of being simple, but is precarious, both in terms of increasing engagement and for keeping leverage low.
- Quantized Linear Unimodal
This approach is the same as the linear rewards approach, but it quantizes the underlying resource. I.e., if your resources are below a certain threshold, you may be completely unable to participate; you can only participate in fixed quanta. Notably, this approach is taken in ETH2.0, where 32 Ether should be pledged in order to acquire a validator identity. It should be clear that this quantized approach shares the same problems with the linear unimodal approach in terms of participation and leverage. Despite this, it has been considered for two primary reasons. First, using the quantized approach enables one to retrofit traditional BFT-style protocol design elements (e.g. that require counting identities) in a resource-based consensus setting. The resulting system is less elegant than true resource-based consensus but this is unavoidable since traditional BFT-style protocols do not work very well when there are more than a few hundred nodes involved. The second reason, specific to the proof-of-stake setting, is seeking to impose penalties on participants as a means of ensuring compliance with the protocol. Imposing quantized collateral pledges makes penalties for protocol infractions more substantial and painful.
We next turn to multimodal schemes. This broad category includes Cosmos, Tezos, Polkadot & EOS. It also includes Cardano. In a multimodal scheme, a resource holder may take different roles in the protocol; being a fully active node in the consensus protocol is just one of the options. The advantage of a multimodal scheme is that offering multiple ways to engage (with correspondingly different rates of return) within the protocol itself can accommodate a higher engagement, as well as limit off-chain resource pooling. For instance, if the potential rewards received by an individual when they engage with all their resources sit below their operational cost of running a node, they can still choose to engage by a different mode in the protocol. In this way, the tendency to combine resources off-chain is eased and the system – if designed properly – may translate this higher engagement to increased resilience.
We will distinguish between a number of different multimodal schemes.
- Representative bimodal without leverage control. The representative approach is inspired by representative democracy: the system is run by a number of elected operators. The approach is bimodal as it enables parties to (1) advertise themselves as operators in the ledger and/or (2) “vote” for operators with their resources. The set of representative operators has a fixed size and is updated on a rolling basis typically with fixed terms using some election function that selects representatives based on the votes they received. Rewards are distributed evenly between representatives, possibly taking into account performance data and adjusting accordingly. Allowing rewards to flow to voters using a smart contract can incentivize higher engagement in voting since resource holders get paid for voting for good representatives (note that this is not necessarily followed by all schemes in this category). The disadvantage of this approach is the lack of leverage control, beyond, possibly, the existence of a very large upper bound, which suggests that the system may end up with a set of very highly leveraged operators. This is the approach that is broadly followed by Cosmos, EOS, and Polkadot.
A different approach to the representative approach is the delegative approach. In general, this approach is closer to direct democracy as it allows resource holders the option to engage directly with the protocol with the resources they have. However, they are free to also delegate their resources to others as in liquid (or delegative) democracy (where the term delegative is derived from). This results in a community-selected operator configuration that does not have a predetermined number of representatives. As in the representative approach, user engagement is bimodal. Resource holders can advertise themselves as operators and/or delegate their resources to existing operators. The rewards provided are proportional to the amount of delegated resources and delegates can be paid via an on-chain smart contract, perhaps at various different rates. Within the delegative approach we will further distinguish two subcategories.
- Delegative bimodal with pledge-based capped rewards. What typifies this particular delegative approach is that the resource pool’s rewards have a bound that is determined by the amount of pledge that is committed to the pool by its operator. In this way, the total leverage of an operator can be controlled and fixed to a constant. Unfortunately, this leverage control feature has the negative side effect of implicitly imposing the same bound to all, small and large resource holders. So, on the one hand, in a population of small resource holders, engagement will be constrained by the little pledge that operators are able to commit. On the other hand, a few large whale resource holders may end up influencing the consensus protocol in a very significant manner, possibly even beyond its security threshold bound. In terms of leverage control, it should be clear that one size does not fit all! From existing systems, this is the approach that is (in essence) followed by Tezos.
It is worth noting that all the specific approaches we have seen so far come with downsides – either in terms of maximizing engagement, controlling leverage, or both. With this in mind, let us now fit into our systematization, the approach of the reward-sharing scheme that we are using in Cardano.
- Delegative bimodal with capped rewards and incentivized pledging. In this delegative system (introduced in our reward-sharing scheme paper), the rewards that are provided to each pool follow a piecewise function on the pool’s size. The function is initially monotonically increasing and then becomes constant at a certain “cap” level which is a configurable system parameter (in Cardano this is determined by the parameter k). This cap limits the incentives to grow individual resource pools. At the same time, pledging resources to a pool is incentivized with higher pledged pools receiving more rewards. As a result, lowering one’s leverage becomes incentive-driven: resource pools have bounded size and operators have an incentive to pledge all the resources they can afford into the smallest number of pools possible. In particular, whale resource holders are incentivized to keep their leverage low. The benefit of the approach is that high engagement is reinforced, while leverage is kept in control by incentivizing the community to (i) pledge as much as possible, (ii) use all the remaining unpledged resources as part of a crowdsourced filtering mechanism. This translates stake to voting power and supports exactly those operators that materially contribute to the system’s goals the most.
The above systematization puts into perspective the choices that we have made in the design of the reward-sharing scheme used in Cardano vis-a-vis other systems. In summary, what the Cardano reward system achieves is to materially promote with incentives and community stake-based voting the best possible outcome: low leverage and high engagement. And this is accomplished, while still allowing for a very high degree of heterogeneity in terms of input behavior from the stakeholders.
As a final point, it is important to stress that while considerable progress has been made since the introduction of the Bitcoin blockchain, research in reward sharing for collaborative projects is still an extremely active and growing domain. Our team continuously evaluates various aspects of reward-sharing schemes and actively explores the whole design space in a first-principles manner. In this way, we can ensure that any research advances will be disseminated widely for the benefit of the whole community.
I am grateful to Christian Badertscher, Sandro Coretti-Drayton, Matthias Fitzi, and Peter Gaži, for their help in the review of other systems and their placement in the systematization of this article.
As a project, decentralization remains arguably our most important and fundamental goal for Cardano. Protocols and parameters provide the foundations for any blockchain. Last week, we outlined some of the planned changes around Cardano parameters and how these will impact the staking ecosystem and thus accelerate our decentralization mission.
Yet the community itself – how it sees itself, how it behaves, and how it sets common standards – is a key factor in the pace of this success. Cardano has been very carefully engineered to provide “by design” all the necessary properties for a blockchain system to operate successfully. However, Cardano is also a social construct, and as such, observance, interpretation, and social norms play a crucial role in shaping its resilience and longevity.
So in anticipation of the k-parameter adjustment on December 6th, I would like to give a broader perspective on staking, highlighting some of the innovative features of the rewards sharing scheme used in Cardano.
Principles & practical intent
As well as outlining some of the key principles, this piece has a clear practical intent; to provide guidance and some recommendations to stakeholders so that they engage meaningfully with the mechanism, and support the project’s longer-term strategic goals through their actions.
Consensus based on a resource that is dispersed somehow across a population of users – as opposed to identity-based participation – has been the hallmark of the blockchain space since the launch of the Bitcoin blockchain. In this domain, proof-of-stake systems are distinguished in the sense that they use a virtual resource, stake, which is recorded in the blockchain itself.
Pooling resources for participation is something that is inevitable; some level of pooling is typically beneficial in the economic sense and hence resource holders will find a way to make it happen. Given this inevitability, the question arises: how does a system prevent a dictatorship or an oligarchy from emerging?
The objectives of the reward sharing scheme
Contrary to other blockchain systems, Cardano uses a reward sharing scheme that (1) facilitates staking with minimum friction as well as (2) it incentivizes pooling resources in a way that system-wide decentralization emerges naturally from the rational engagement of the resource holders.
The mechanism has the following two broad objectives:
- Engage all stakeholders - This is important since the more stakeholders are engaged in the system, the more secure the distributed ledger will be. This also means that the system should have no barriers for participation, nor should impose friction by requiring off-chain coordination between stakeholders to engage with the mechanism.
- Keep the leverage of individual stakeholders low -. Pooling resources leads to increased leverage for some stakeholders. Pool operators exert an influence in the system proportional to the resources controlled by their pool, not to their own resources. Without pooling, all resource holders have leverage of exactly 1; contrast this e.g., to a pool operator, owning, say 100K ada, who controls a pool of total delegated stake of 200M ada; that operator has leverage of 2,000. The higher the leverage of the system, the worse its security (to see this, consider that with leverage above 50, launching a 51% attack requires a mere 1% of the total resources!).
It should also be stressed that a disproportionately large pool size is not the only reason for increased leverage; stakeholders creating multiple pools, either openly or covertly (what is known as a Sybil attack) can also lead to increased leverage. The lower the leverage of a blockchain system, the higher its degree of decentralization.
Putting this into practice
So how does the reward sharing scheme used in Cardano meet the above objectives? Staking via our scheme facilitates two different paths: pledging and delegating. Pledging applies to stake pool operators; pledged stake is committed to a stake pool and is supposed to stay put for as long as the pool is operating. Think of pledge as a ‘commitment’ to the network – ‘locking up’ a certain amount of stake in order to help safeguard and secure the protocol. Delegating on the other hand, is for those who do not wish to be involved as operators. Instead, they are invited to assess the offerings the stake pool operators provide, and delegate their stake to one or more pools that, in their opinion, best serve their interests and the interest of the community at large. Given that delegation does not require locking up funds, there is no reason to abstain from staking in Cardano; all stakeholders can and are encouraged to engage in staking.
Central to the mechanism’s behavior are two parameters: k and a0. The k-parameter caps the rewards of pools to 1/k of the total available. The a0 parameter creates a benefit for pledging more stake into a single pool; adding X amount of pledge to a pool increases its rewards additively by up to a0*X. This is not to the detriment of other pools; any rewards left unclaimed due to insufficient pledging will be returned to the Cardano’s reserves and allocated in the future.
Beyond deciding on an amount to pledge, creating a stake pool requires that operators declare their profit margin and operational costs. When the pool rewards are allocated at the end of each epoch, the operational costs are withheld first, ensuring that stake pools remain viable. Subsequently, operator profit is calculated, and all pool delegators are rewarded in ada proportional to their stake afterwards.
Paired with the assessment of stake pools performed by the delegates, this mechanism provides the right set of constraints for the system to converge to a configuration of k equal size pools with the maximum amount of pledge possible. The equilibrium point has the property that delegator rewards are equalized (so it doesn’t matter what pool they delegate to!), while stake pool operators are rewarded appropriately for their performance, their cost efficiency, and their general contributions to the ecosystem.
For the above to happen, it is necessary to engage with the mechanism in a meaningful and rational manner. To assist stakeholders in understanding the mechanism, here are some points of advice.
Guidance for delegators
- Know your pool(s) - Investigate the pools’ available data and information. What is the operators’ web-presence? What kind of information do they provide about their operation? Are they descriptive about their costs? Are the costs reasonably based on geographic location and other aspects of their operation? Do they update their costs regularly to account for the fluctuation of ada? Do they include the costs for their personal time? Remember that maintaining a high-performance pool requires commitment and effort, so those committed operators deserve compensation.
- Think bigger - Consider your choice holistically, not based on just a single dimension. Consider the longer term value your choices bring to the network. Think of your delegation as a ‘vote of confidence’, or a way to show your support to a pool's mission or goals. Opt for professionalism and demonstrated long-term commitment to the system’s goals. Recognize community members who have been helping to lay down the foundations for the ecosystem, either with their community presence or by helping to build things. The long-term wellbeing of the ecosystem is crucially affected by your delegation choice. A more decentralized network is a more resilient and long-lived network.
- Be wary of ‘pool splitters’ - Pool operators that run multiple pools with small pledge hurt delegators and smaller operators. They hurt their delegators because they could have provided a higher amount of rewards by concentrating their pledge into a single pool; by not doing that, there are rewards that remain unclaimed. They hurt smaller and new operators, because they are forcing them to remain without delegates and hence making their operation unviable – without delegates a pool may be forced to close. So avoid pool operators that run multiple pools with pledge below saturation level. Note there are legitimate reasons for large stakeholders to accept delegators and run a public pool (e.g., they are delegating some of their stake to other pools to support the community); consult any public statements such operators make about their delegation strategy and their leverage. It is ok to delegate to them, assuming they keep their leverage low and they support the community.
- Be wary of highly leveraged operators - Be mindful of the stake pool operators’ leverage (see below for more details on how to calculate leverage). A higher pledge is correlated to less leverage when comparing pools of the same size; a high leverage is indicative of a stake pool operator with very little “skin in the game.” Stake pool operators may prove to have skin in the game in other ways than pledging stake of course; e.g., they can be very professional and contribute to the community in different ways. You should be the judge of this: high leverage in itself is not a reason to avoid delegating to a particular pool, but it is a strong indication that you should proceed with caution and carefully evaluate the people behind the operation.
- Shop around - Do take into account the information provided from your wallet software (or from recognized community resources such as adapools or pooltool) in terms of the pool’s ranking and its performance factor. Remember though, while the ranking is important, it should not be the sole factor behind your delegation choice. Think holistically – you may want to consider pools fulfilling a mission you agree with, or trying to add value to the wider community through podcasts or social activity, even if they do not offer the highest possible returns.
- Be involved - A pool with no performance data on display may have attractive characteristics; it could be providing better rewards in the best case scenario, but also high risk as a delegation choice since its performance may turn out to be suboptimal. Delegate according to your ‘risk profile’, and the frequency you are willing to re-delegate your stake. Do check the pool’s performance and updates regularly to ensure that your choice and assessment remains the best possible.
Guidance for pool operators
- Be transparent - Choose your pool’s operational cost as accurately as possible. Do include the personal effort (priced at a reasonable rate) that you and your partners put into the pool operation! You are a pillar of Cardano and so you have every right to be compensated by the community. Be upfront about your costs and include them in your pool’s website. Educate your prospective delegates about where the pool costs are going. Always remember that it is important to charge for the time you invest in maintaining your pool. In the short term, you may be prepared to invest your time and energy ‘for free’ (or after hosting costs, at an effective loss) but remember that this is not a sustainable model for the network over the medium and longer term.
Don’t split your pool - With the coming changes in k (commencing with the move to k=500 on 6th December), we are already seeing pool operators splitting their pools in order to retain delegators without becoming saturated. Do not engage in pool splitting unless you can saturate a pool completely with your stake. If you are a whale (relative stake > 1/k) you can create multiple pools – but you should keep your leverage as close to 1 as possible or less. Pool splitting that increases your leverage hurts the delegators’ rewards, and more importantly, it hurts the decentralization of the Cardano ecosystem, which is detrimental to everyone. If you run and control multiple pools under different tickers, make a public statement about it. Explain the steps you take to control your leverage. Creating multiple pools while trying to conceal the fact that you control them is akin to a Sybil attack against Cardano. This behavior should be condemned by the community. You can calculate and publicize your leverage using the following formula:
Exchanges are a special kind of whale stakeholder, since they collectively manage other people’s stake. One strategy for an exchange is to avoid leverage altogether and delegate the stake they control to community pools. If an exchange becomes a pool operator, they can maintain their leverage below 1 by using a mixed pledging and delegation strategy.
- Set your profit margin wisely - Select the margin to make your pool competitive. Remember that if everyone delegates their stake and is rational, you only have to beat the (k+1)-th pool in the rankings offered by the Daedalus wallet. If your pool offers other advantages that can attract delegation (e.g., you are contributing to a charitable cause you feel others may wish to support), or you have acquired exceptional equipment that promises notable uptime/performance, make sure you promote this widely. When you offer such benefits, you should consider setting a higher profit margin.
- Keep your pool data updated - Regularly update the cost and margin to accommodate fluctuations in ada price. Give assurances to your delegators and update them about the stake pool operational details. In case of mishaps and downtimes, be upfront and inform your delegators via your website and/or other communication channels you maintain with them.
- Pledge as much as you are able to - Increase the amount of pledge as much as you comfortably can and not more. Beyond using your own stake, you can also partner with other stakeholders to increase the pledge of your pool. A high pledge signals long-term commitment and reduced leverage, and it unlocks additional rewards every epoch as dictated by the a0 term in the rewards sharing scheme calculation. As a result, it does make your pool more desirable to prospective delegators. On the other hand, remember that pledge is not the only factor that makes a pool attractive. Spend time on your web and social media presence and be sure to advertise all the ways that you contribute to the Cardano ecosystem.
If you are a Cardano stakeholder, we hope that you find the above advice informative and helpful in your efforts to engage in staking. As in many other respects, Cardano brings a novel and heavily researched mechanism to its blockchain design. The rewards scheme is mathematically proven to offer an equilibrium that meets the set of objectives set out in the beginning of this document. Ultimately though, the math is not enough; it is only the people that can make it happen.
Cardano’s future is in the hands of the community.
The opinions expressed in the blogpost are for educational purposes only and are not intended to provide any form of financial advice.
Designing and deploying a distributed ledger is a technically challenging task. What is expected of a ledger is the promise of a consistent view to all participants as well as a guarantee of responsiveness to the continuous flow of events that result from their actions. These two properties, sometimes referred to as persistence and liveness, are the hallmark of distributed ledger systems.
Achieving persistence and liveness in a centralized system is a well-studied and fairly straightforward task; unfortunately, the ledger that emerges is precariously brittle because the server that supports the ledger becomes a single point of failure. As a result, hacking the server can lead to the instant violation of both properties. Even if the server is not hacked, the interests of the server’s operators may not align with the continuous assurance of these properties. For this reason, decentralization has been advanced as an essential remedy.
Informally, decentralization refers to a system architecture that calls for many entities to act individually in such a way that the ledger’s properties emerge from the convergence of their actions. In exchange for this increase in complexity, a well-designed system can continue to function even if some parties deviate from proper operation. Moreover, in the case of more significant deviations, even if some disruption is unavoidable, the system should still be capable of returning to normal operation and contain the damage.
How does one design a robust decentralized system? The world is a complicated place and decentralization is not a characteristic that can be hard-coded or demonstrated via testing – the potential configurations that might arise are infinite. To counter this, one must develop models that systematically encompass all the different threats the system may encounter and demonstrate rigorously that the two basic properties of persistence and liveness are upheld.
The strongest arguments for the reliability of a decentralized system combine formal guarantees against a broad portfolio of different classes of failure and attack models. The first important class is that of powerful Byzantine models. In this setting, it should be guaranteed that even if a subset of participants arbitrarily deviate from the rules, the two fundamental properties are retained. The second important class is models of rationality. Here, participants are assumed to be rational utility maximizers and the objective is to show that the ledger properties arise from their efforts to pursue their self interest.
Ouroboros is a decentralized ledger protocol that is analyzed in the context of both Byzantine and rational behavior. What makes the protocol unique is the combination of the following design elements.
- It uses stake as the fundamental resource to identify the participants’ leverage in the system. No physical resource is wasted in the process of ledger maintenance, which is shown to be robust despite ‘costless simulation’ and ‘nothing at stake’ attacks that were previously thought to be fundamental barriers to stake-based ledgers. This makes Ouroboros distinctly more appealing than proof-of-work protocols, which require prodigious energy expenditure to maintain consensus.
- It is proven to be resilient even if arbitrarily large subsets of participants, in terms of stake, abstain from ledger maintenance. This guarantee of dynamic availability ensures liveness even under arbitrary, and unpredictable, levels of engagement. At the same time, of those participants who are active, barely more than half need to follow the protocol – the rest can arbitrarily deviate; in fact, even temporary spikes above the 50% threshold can be tolerated. Thus Ouroboros is distinctly more resilient and adaptable than classical Byzantine fault tolerance protocols (as well as all their modern adaptations), which have to predict with relative certainty the level of expected participation and may stop operating when the prediction is false.
- The process of joining and participating in the protocol execution is trustless in the sense that it does not require the availability of any special shared resource such as a recent checkpoint or a common clock. Engaging in the protocol requires merely the public genesis block of the chain, and access to the network. This makes Ouroboros free of the trust assumptions common in other consensus protocols whose security collapses when trusted shared resources are subverted or unavailable.
- Ouroboros incorporates a reward-sharing mechanism to incentivize participants to organize themselves in operational nodes, known as stake pools, that can offer a good quality of service independently of how stake is distributed among the user population. In this way, all stakeholders contribute to the system’s operation – ensuring robustness and democratic representation – while the cost of ledger maintenance is efficiently distributed across the user population. At the same time, the mechanism comes with countermeasures that de-incentivize centralization. This makes Ouroboros fundamentally more inclusive and decentralized compared with other protocols that either end up with just a handful of actors responsible for ledger maintenance or provide no incentives to stakeholders to participate and offer a good quality of service.
These design elements of Ouroboros are not supposed to be self-evident appeals to the common sense of the protocol user. Instead, they were delivered with meticulous documentation in papers that have undergone peer review and appeared in top-tier conferences and publications in the area of cybersecurity and cryptography. Indeed, it is fair to say that no other consensus research effort is represented so comprehensively in these circles. Each paper is explicit about the specific type of model that is used to analyze the protocol and the results derived are laid out in concrete terms. The papers are open-access, patent-free, and include all technical details to allow anyone, with the relevant technical expertise, to convince themselves of the veracity of the claims made about performance, security, and functionality.
Building an inclusive, fair and resilient infrastructure for financial and social applications on a global scale is the grand challenge of information technology today. Ouroboros contributes, not just as a protocol with unique characteristics, but also in presenting a design methodology that highlights first principles, careful modeling and rigorous analysis. Its modular and adaptable architecture also lends itself to continuous improvement, adaptation and enrichment with additional elements (such as parallelization to improve scalability or zero-knowledge proofs to improve privacy, to name two examples), which is a befitting characteristic to meet the ever-evolving needs and complexities of the real world.
To delve deeper into the Ouroboros protocol, from its inception to recent new features, follow these links:
- Ouroboros (Classic): the first provably secure proof-of-stake blockchain protocol.
- Ouroboros Praos: removes the need for a rigid round structure and improves resilience against ‘adaptive’ attackers.
- Ouroboros Genesis: how to avoid the need for a recent checkpoint and prove the protocol is secure under dynamic availability for trustless joining and participating.
- Ouroboros Chronos: removes the need for a common clock.
- Reward sharing schemes for stake pools.
- Account management and maximizing participation in stake pools.
- Optimizing transaction throughput with proof-of-stake protocols.
- Fast settlement using ledger combiners.
- Ouroboros Crypsinous: a privacy-preserving proof-of-stake protocol.
- Kachina: a unified security model for private smart contracts.
- Hydra: an off-chain scalability architecture for high transaction throughput with low latency, and minimal storage per node.
Enter the Hydra: scaling distributed ledgers, the evidence-based way
Learn about Hydra: the multi-headed ledger protocol
26 March 2020 Prof Aggelos Kiayias 10 mins read
Scalability is the greatest challenge to blockchain adoption. By applying a principled, evidence-based approach, we have arrived at a solution for Cardano and networks similar to it: Hydra. Hydra is the culmination of extensive research, and a decisive step in enabling decentralized networks to securely scale to global requirements.
What is scalability and how do we measure it?
Scaling a distributed ledger system refers to the capability of providing high transaction throughput, low latency, and minimal storage per node. These properties have been repeatedly touted as critical for the successful deployment of blockchain protocols as part of real-world systems. In terms of throughput, the VISA network reportedly handles an average of 1,736 payment transactions per second (TPS) with the capability of handling up to 24,000 TPS and is frequently used as a baseline comparison. Transaction latency is clearly desired to be as low as possible, with the ultimate goal of appearing instantaneous to the end-user. Other applications of distributed ledgers have a wide range of different requirements in terms of these metrics. When designing a general purpose distributed ledger, it is natural to strive to excel on all three counts.
Deploying a system that provides satisfactory scaling for a certain use case requires an appropriate combination of two independent aspects: adopting a proper algorithmic design and deploying it over a suitable underlying hardware and network infrastructure.
When evaluating a particular algorithmic design, considering absolute numbers in terms of specific metrics can be misleading. The reason is that such absolute quantities must refer to a particular underlying hardware and network configuration which can blur the advantages and disadvantages of particular algorithms. Indeed, a poorly designed protocol may still perform well enough when deployed over superior hardware and networking.
For this reason, it is more insightful to evaluate the ability of a protocol to reach the physical limits of the underlying network and hardware. This can be achieved by comparing the protocol with simple strawman protocols, in which all the design elements have been stripped away. For instance, if we want to evaluate the overhead of an encryption algorithm, we can compare the communication performance of two end-points using encryption against their performance when they simply exchange unencrypted messages. In such an experiment, the absolute message-per-second rate is unimportant. The important conclusion is the relative overhead that is added by the encryption algorithm. Moreover, in case the overhead approximates 0 for some configuration of the experimental setup, we can conclude that the algorithm approximates the physical limits of the underlying network’s message-passing ability for that particular configuration, and is hence optimal in this sense.
Hydra – 30,000-feet view
Hydra is an off-chain scalability architecture for distributed ledgers, which addresses all three of the scalability challenges mentioned above: high transaction throughput, low latency, and minimal storage per node. While Hydra is being designed in conjunction with the Ouroboros protocol and the Cardano ledger, it may be employed over other systems as well, provided they share the necessary salient characteristics with Cardano.
Despite being an integrated system aimed at solving one problem – scalability – Hydra consists of several subprotocols. This is necessary as the Cardano ecosystem itself is heterogenous and consists of multiple entities with differing technical capabilities: the system supports block producers with associated stake pools, high-throughput wallets as used by exchanges, but also end-users with a wide variety of computational performance and availability characteristics. It is unrealistic to expect that a one-shoe-fits-all, single-protocol approach is sufficient to provide overall scalability for such a diverse set of network participants.
The Hydra scalability architecture can be divided into four components: the head protocol, the tail protocol, the cross-head-and-tail communication protocol, as well as a set of supporting protocols for routing, reconfiguration, and virtualization. The centerpiece is the 'head' protocol, which enables a set of high-performance and high-availability participants (such as stake pools) to very quickly process large numbers of transactions with minimal storage requirements by way of a multiparty state channel – a concept that generalizes two-party payment channels as implemented in the context of the Lightning network. It is complemented by the 'tail' protocol, which enables those high-performance participants to provide scalability for large numbers of end-users who may use the system from low-power devices, such as mobile phones, and who may be offline for extended periods of time. While heads and tails can already communicate via the Cardano mainchain, the cross-head-and-tail communication protocol provides an efficient off–chain variant of this functionality. All this is tied together by routing and configuration management, while virtualisation facilitates faster communication generalizing head and tail communication.
The Hydra head protocol
The Hydra head protocol is the first component of the Hydra architecture to be publicly released. It allows a set of participants to create an off-chain state channel (called a head) wherein they can run smart contracts (or process simpler transactions) among each other without interaction with the underlying blockchain in the optimistic case where all head participants adhere to the protocol. The state channel offers very fast settlement and high transaction throughput; furthermore, it requires very little storage, as the off-chain transaction history can be deleted as soon as its resulting state has been secured via an off–chain 'snapshot' operation.
Even in the pessimistic case where any number of participants misbehave, full safety is rigorously guaranteed. At any time, any participant can initiate the head's 'closure' with the effect that the head's state is transferred back to the (less efficient) blockchain. We emphasize that the execution of any smart contracts can be seamlessly continued on-chain. No funds can be generated off-chain, nor can any single, responsive head participant lose any funds.
The state channels implemented by Hydra are isomorphic in the sense that they make use of the same transaction format and contract code as the underlying blockchain: contracts can be directly moved back and forth between channels and the blockchain. Thus, state channels effectively yield parallel, off-chain ledger siblings. In other words, the ledger becomes multi-headed.
Transaction confirmation in the head is achieved in full concurrency by an asynchronous off-chain certification process using multi-signatures. This high level of parallelism is enabled by use of the extended UTxO model (EUTxO). Transaction dependencies in the EUTxO model are explicit, which allows for state updates without unnecessary sequentialization of transactions that are independent of each other.
Experimental validation of the Hydra head protocol
As a first step towards experimentally validating the performance of the Hydra head protocol, we implemented a simulation. The simulation is parameterized by the time required by individual actions (validating transactions, verifying signatures, etc.), and carries out a realistic and timing-correct simulation of a cluster of distributed nodes forming a head. This results in realistic transaction confirmation time and throughput calculations.
We see that a single Hydra head achieves up to roughly 1,000 TPS, so by running 1,000 heads in parallel (for example, one for each stake pool of the Shelley release), we should achieve a million TPS. That’s impressive and puts us miles ahead of the competition, but why should we stop there? 2,000 heads will give us 2 million TPS – and if someone demands a billion TPS, then we can tell them to just run a million heads. Furthermore, various performance improvements in the implementation can improve the 1,000 TPS single head measurement, further adding to the protocol’s hypothetical performance.
So, can we just reach any TPS number that we want? In theory the answer is a solid yes, and that points to a problem with the dominant usage of TPS as a metric to compare systems. While it is tempting to reduce the complexity of assessing protocol performance to a single number, in practice this leads to an oversimplification. Without further context, a TPS number is close to meaningless. In order to properly interpret it, and make comparisons, you should at least ask for the size of the cluster (which influences the communication overhead); its geographic distribution (which determines how much time it takes for information to transit through the system); how the quality of service (transaction confirmation times, providing data to end users) is impacted by a high rate of transactions; how large and complicated the transactions are (which has an impact on transaction validation times, message propagation time, requirements on the local storage system, and composition of the head participants); and what kind of hardware and network connections were used in the experiments. Changing the complexity of transactions alone can change the TPS by a factor of three, as can be seen in the figures in the paper (refer to Section 7 – Simulations).
Clearly, we need a better standard. Is the Hydra head protocol a good protocol design? What we need to ask is whether it reaches the physical limits of the network, not a mere TPS number. Thus, for this first iteration of the evaluation of the Hydra head protocol, we used the following approach to ensure that the data we provide is properly meaningful:
- We clearly list all the parameters that influence the simulation: transaction size, time to validate a single transaction, time needed for cryptographic operations, allocated bandwidth per node, cluster size and geographical distribution, and limits on the parallelism in which transactions can be issued. Without this controlled environment, it would be impossible to reproduce our numbers.
- We compare the protocol’s performance to baselines that provide precise and absolute limits of the underlying network and hardware infrastructure. How well we approach those limits tells us how much room there would be for further improvements. This follows the methodology explained above using the example of an encryption algorithm.
We use two baselines for Hydra. The first, Full Trust, is universal: it applies to any protocol that distributes transactions amongst nodes and insists that each node validate transactions one after the other – without even ensuring consensus. This yields a limit on TPS by simply adding the message delivery and validation times. How well we approach this limit tells us what price we are paying for consensus, without relying on comparison with other protocols. The second baseline, Hydra Unlimited, yields a TPS limit specifically for the head protocol and also provides the ideal latency and storage for any protocol. We achieve that by assuming that we can send enough transactions in parallel to completely amortize network round-trip times and that all actions can be carried out when needed, without resource contention. The baseline helps us answer the question of what can be achieved under ideal circumstances with the general design of Hydra (for a given set of values of the input parameters) as well as evaluate confirmation latency and storage overhead against any possible protocol. More details and graphs for those interested can be found in our paper (again, Section 7 – Simulations).
What comes next?
Solving the scalability question is the holy grail for the whole blockchain space. The time has come to apply a principled, evidence-based approach in designing and engineering blockchain scalability solutions. Comparing scalability proposals against well-defined baselines can be a significant aide in the design of such protocols. It provides solid evidence for the appropriateness of the design choices and ultimately leads to the engineering of effective and performant distributed ledger protocols that will provide the best possible absolute metrics for use cases of interest. While the Hydra head protocol is implemented and tested, we will, in time, release the rest of the Hydra components following the same principled approach.
As a last note, Hydra is the joint effort of a number of researchers, whom I'd like to thank. These include Manuel Chakravarty, Sandro Coretti, Matthias Fitzi, Peter Gaži, Philipp Kant, and Alexander Russel. The research was also supported, in part, by EU Project No.780477, PRIVILEDGE, which we gratefully acknowledge.
In a proof of stake (PoS) blockchain protocol, the ledger is maintained by the stakeholders that hold assets in that ledger. This allows PoS blockchains to use less energy compared with proof of work (PoW) or other types of blockchain protocols. Nevertheless, this requirement imposes a burden on stakeholders. It requires a good number of them to be online and maintain sufficiently good network connectivity that they can collect transactions and have their PoS blocks reach the others without substantial network delays. It follows that any PoS ledger would benefit from reliable server nodes that hold stake and focus on maintenance.
The argument for stake pools
Wealth is typically distributed according to a power-law such as the Pareto distribution, so running reliable nodes executing the PoS protocol may be an option only for a small, wealthy, subset of stakeholders, leaving most without the ability to run such services. This is undesirable; it would be better if everyone had the ability to contribute to ledger maintenance. An approach to rectify this problem is by allowing the creation of stake pools. Specifically, this refers to the ability of stakeholders to combine their stake and form a single entity, the stake pool, which can engage in the PoS protocol using the total stake of its members. A pool will have a manager who will be responsible for running the service that processes transactions. At the same time, the pool manager should not be able to spend the stake that their pool represents, while members who are represented by the pool should be free to change their mind and reallocate their stake if they wish to another pool. Finally, and most importantly, any stakeholder should be able to aspire to become a stake pool manager.
Participating in PoS ledger maintenance incurs costs. Certainly not as high as in the case of a PoW protocol but, nevertheless, still significant. As a result, it is sensible that the community of all stakeholders incentivizes in some way those who support the ledger by setting up servers and processing transactions. This can be achieved by a combination of contributions from those that use the ledger (in the form of transaction fees) and inflation of the circulating supply of coins (by introducing new coins in circulation to be claimed by those engaged in the protocol).
In the case of Bitcoin, we have both the above mechanisms, incentivization and pools. On the one hand, mining is rewarded by transaction fees as well as a block reward that is fixed and diminishes over time following a geometric series. On the other hand, pools can be facilitated by dividing the work required for producing blocks among many participants and using ‘partial’ PoWs (which are PoWs that are of smaller difficulty than the one indicated by the current state of the ledger) as evidence of pool participation.
It is straightforward to apply a similar type of incentivization mechanism in the PoS setting. However, one should ask first whether a Bitcoin-like mechanism (or any mechanism for that matter) would converge to a desirable system configuration. Which brings us to the important question: what are the desirable system configurations? If the only consideration is to minimize transaction processing costs, in a failure-free environment, the economically optimal configuration is a dictatorial one. One of the parties maintains the ledger as a service while all the others participate in the pool created by this party. This is clearly an undesirable outcome because the single pool leader becomes also a single point of failure in the system, which is exactly the type of outcome that a distributed ledger is supposed to avoid. It follows that the coexistence of many pools, in other words decentralization, should be a desirable characteristic of the ledger incentivization mechanism.
Reward-sharing schemes for PoS
So what would a reward-sharing scheme look like in a PoS setting? Rewards should be provided at regular intervals and pool maintenance costs should be retained by the pool manager before distributing the remaining rewards among the members. Given that it is possible to keep track of pool membership in the ledger itself using the staking keys of the participants, reward splits within each pool can be encoded in a smart contract and become part of the ledger maintenance service. First things first, pool managers should be rewarded for their entrepreneurship. A pool creation certificate posted on the ledger will declare a profit margin to be shaved off the pool’s rewards after subtracting operational costs, which should also be declared as part of the pool creation certificate. The cost declaration should be updated frequently to absorb any volatility that the native token of the system has with respect to the currency that denominates the actual costs of the pool manager. At the same time, the pool creation certificate, backed up by one or more staking keys provided by stakeholders, can declare a certain amount of stake that “stands behind” the pool and can be used either as an indication that the pool represents the genuine enterprise of one or more stakeholders or as collateral guaranteeing compliance with correct protocol behavior.
Given the above setup, how do Bitcoin-like mechanisms fare with respect to the decentralization objective? In Bitcoin, assuming everyone follows the protocol, pool rewards are split in proportion to the size of each pool. For example, a mining pool with 20% of the total hashing power is expected to reap 20% of the rewards. This is because rewards are proportional to the number of blocks obtained by the pool and the number of blocks is in turn proportional to the pool’s mining power. Does this lead to a decentralized system? Empirical evidence seems to suggest otherwise: in Bitcoin, mining pools came close (and occasionally even exceeded) the 50% threshold that is the upper boundary for ensuring the resilience of the ledger. A simple argument can validate this empirical observation in the framework of our reward-sharing schemes: if pools are rewarded proportionally to their size and pool members proportionally to their stake in the pool, the rational thing to do would be to centralize to one pool. To see this consider the following. At first, it is reasonable to expect that all players who are sufficiently wealthy to afford creating a pool will do so by setting up or renting server equipment and promoting it with the objective to attract members so that their share of rewards grows. The other stakeholders that are not pool managers will join the pool that maximizes their payoff, which will be the one with the lowest cost and profit margin. Pool competition for gaining these members will compress profit margins to very small values. But even with zero profit margin, all other pools will lose to the pool with the lowest cost. Assuming that there are no ties, this single pool will attract all stakeholders. Finally, other pool managers will realize that they will be better off joining that pool as opposed to maintaining their own because they will receive more for the stake they possess. Eventually, the system will converge to a dictatorial single pool.
Figure 1 shows a graphical representation of this. It comes from one of the numerous simulations our team has conducted in the process of distilling effective reward sharing schemes. In the experiment, a number of stakeholders follow a reactive process where they attempt to maximize their payoff based on the current system configuration. The experiment leads to a centralized single pool, validating our theoretical observations above for Bitcoin-like schemes. From a decentralization perspective, this is a tragedy of the commons: even though the participants value decentralization as an abstract concept, none of them individually wants to bear the burden of it.
A better reward sharing scheme
Clearly we have to do better than a dictatorship! A first observation is that if we are to achieve decentralization, linearity between rewards and size should taper off after a certain level. This is because, while linearity is attractive when the pool is small and wants to attract stakeholders, after a certain level it should be diminished if we want to give an opportunity for smaller pools to be more competitive. Thus, we will divide the behavior of the reward-sharing scheme depending on the size of the pool to two stages: a growth stage, when linearity is to be respected, and a stabilization stage when the pool is large enough. The point where the transition happens will be called the saturation point and the pool that has passed this point will be saturated. We can fix rewards to be constant after the saturation point, so that if the saturation point is 1%, two pools, with total stakes of 1% and 1.5%, will receive the same rewards.
To appreciate how the dynamics work from the perspective of a single stakeholder, consider the following example. Suppose there are two pools, A and B managed by Alice and Bob, with operational costs of 25 and 30 coins respectively, each one with a profit margin of 4%. Suppose further that the total rewards to be distributed are 1,000 coins and the saturation point of the reward-sharing mechanism is 20%. At a given point in time, Alice’s pool has 20% of the stake, so it is at the saturation point, while Bob’s pool is at 19%. A prospective pool member, Charlie, holds 1% of the stake and considers which pool to join. Joining Alice’s pool will bring its total stake to 21%, and because it has exceeded the saturation point the reward will be 200 coins (20% of the total rewards). Deducting operational costs will leave 175 coins to be distributed between Alice and the pool members. After removing Alice’s profit margin and considering Charlie’s relative stake in the pool, he will receive 8 coins as a reward. If Charlie joins Bob’s pool, the total rewards will be 200 coins, or 170 coins after removing the operational costs. However, given that Charlie’s stake is 5% (1/20) of the pool, it turns out that he will receive 2% more coins than if he had joined Alice’s pool. So Charlie will join Bob’s pool if he wants to maximize his rewards.
Now, let us see what happens in the case that Charlie is facing the same decision at a hypothetical earlier stage of the whole process when Alice’s pool was already at 20% of the total stake, while Bob’s pool was only at 3%. In this case, Bob has a very small pool and the total rewards available for its members are much less compared with the previous case. As a result, if Charlie did the same calculation for Bob’s pool, his 1% stake would result in a 4% total stake for the pool but, if one does the calculations, he would receive a mere 30% of the rewards that he would have obtained had he joined Alice’s pool. In such a case, the rational decision is to join Alice’s pool despite the fact that his membership will make Alice’s pool exceed the saturation point. Refer to Table 1 below for the exact figures.
Being far-sighted matters
The above appears to be contradictory. To understand what Charlie needs to do we have to appreciate the following fact. The choice of Charlie to join Alice’s pool in the second scenario is only rational in a very near-sighted (aka myopic) sense. In fact, Charlie is better off with Bob’s pool, as is demonstrated by the first scenario, as long as Bob’s pool reaches the saturation point. Thus, if Charlie believes that Bob’s pool will reach the saturation point, the rational choice should be to support it. Other stakeholders will do the same and thus Bob’s pool will rapidly reach the saturation point making everyone that participated in it better off, while also supporting the ideal of decentralization: Alice’s pool instead of constantly growing larger will stop at the saturation point and other pools will be given the ability to grow to the same size. This type of strategic thinking on behalf of the stakeholders is more far-sighted (aka non-myopic) and, as we will see, has the ability to help parties converge to desirable decentralized configurations for the system.
It is worth noting that it is unavoidable that the system in its evolution will reach pivotal moments where it will be crucial for stakeholders to exercise far-sighted thinking, as in the scenario above where Alice’s pool reaches the saturation point while other pools are still quite small. The reason is that due to the particular circumstances of each stake pool manager, the operational costs will be variable across the stakeholder population. As a result, it is to be expected that starting from a point zero where no stake pools exist, the pool with the lowest operational cost will be also the one that will be the first to grow. This is natural since low operational costs leave a higher level of rewards to be split among the pool members. It is to be expected that the system will reach moments like the second scenario above where the most competitive pool (the one of Alice with operational cost 25) has reached saturation point while the second-most competitive (the one of Bob with operational cost 30) is still at a small membership level.
One might be tempted to consider long-term thinking in the setting of a Bitcoin-like reward sharing schemes and believe that it can also help to converge to decentralization. Unfortunately, this is not the case. In a Bitcoin-like scheme, contrary to our reward-sharing scheme with a saturation point, there is no point in the development of Alice’s and Bob’s pools when Bob’s pool will become more attractive in Charlie’s view. Indeed, without a saturation point, Alice’s bigger pool will always offer more rewards to Charlie: this stems from the fact that the operational costs of Alice are smaller and hence leave more rewards for all the stakeholders. This will leave Bob’s pool without any members, and eventually, as discussed above, it will be the rational choice for Bob also to dissolve his pool and join Alice’s, making Alice the system’s dictator.
Going back to our reward-sharing scheme, we have established that non-myopic strategic thinking promotes decentralization; nevertheless, there is an important point still open. At a pivotal moment, when the non-myopic stakeholder Charlie rationally decides to forgo the option to join Alice’s saturated pool, he may have a number of aspiring pools to choose from. For instance, together with Bob’s pool that has operational costs of 30 and profit margin 4%, there could be a pool by Brenda with operational cost of 33 and profit margin 2%, and a pool by Ben with operational cost of 36 and profit margin 1%. The rational choice would be to go with the one that will reach the saturation point; is there a way to tell which one would be the best choice? In our full analysis paper we provide an explicit mechanism that orders the pools according to their desirability and, using the information recorded in the ledger about each stake pool, it can assist stakeholders in making the best possible choice at any given moment. In our example, it is Brenda’s pool that Charlie should join if he wants to maximize his rewards (see Table 1). To aid Cardano users, the pool-sorting mechanism will be built into Daedalus (and other Cardano-compatible wallets) and will provide a visual representation of the best choices available to stakeholders using the information in the ledger regarding pool registrations.
So how does our reward scheme fare with respect to decentralization? In the full analysis paper we prove that there is a class of decentralized system configurations that are “non-myopic Nash equilibria.” An equilibrium strategy here means that stakeholders have a specific way to create pools, set their profit margins and/or delegate to other pools, so that no stakeholder, thinking for the long term, is better off following a different strategy. Moreover, we demonstrate experimentally that reactive play between stakeholders with non-myopic thinking converges to this equilibrium in a small number of iterations, as shown in Figure 2.
A characteristic of our approach is that the number of pools is only part of the description of the reward-sharing scheme and thus is in no way enforced by the system on the stakeholders. This means stakeholders are free to experiment with pool creation and delegation of stake without having to conform to any predetermined system architecture. This is in contrast to other approaches taken in PoS systems such as EOS where the number of participants is a hardcoded parameter of the consensus system (specifically, 21 pools). At the same time, our approach allows the whole stakeholder set to to express its will, by freely joining and leaving pools, receiving guaranteed rewards for their participation while witnessing how their actions have a quantifiable impact on the management of the PoS distributed ledger no matter the size of their stake. This is contrast to other approaches taken in PoS systems such as Ethereum 2.0 where ledger maintenance is performed by registered validators on the basis of a collateral deposit without a built-in process of vetting by the stakeholder set.
So what would be a sensible choice for the number of pools that should be favored by the reward scheme for Cardano? Given that decentralization is our main objective, it is sensible to set this parameter to be as high as possible. Our network experiments showed that the system can still operate effectively with as many as 1,000 running pools. Choosing a saturation threshold for our reward-sharing scheme based on this number will make having a stake pool profitable even if the total stake delegated in them is as little as 0.1% of the total circulation of Ada.
Looking ahead – Sybil attacks
Given that decentralization can be achieved by a large number of independent stake pools, it is also important to see whether some decentralized system configurations are more preferable than others. As described so far in this post, our reward-sharing scheme will lead rational stakeholders towards promoting the stake pools that will incur the smallest total cost. Even though this maximizes rewards and minimizes costs, it may not be necessarily the most desirable outcome. The reason is that in the equilibrium point one may see a set of stakeholders promoted as stake pool managers who possess collectively a very small stake themselves. This imbalance, in which a small total stake represents the total stake of the system, can be detrimental in many ways: stake pool managers may be prone to corruption or bribery, or, perhaps even worse, a large stake holder may register many stake pools in the hope of controlling the whole ecosystem, performing in this way a Sybil attack that would hurt decentralization. For this reason, the reward-sharing scheme as presented in our full analysis paper is suitably modified to be sensitive to the stake backing the pool so that this type of behaviour is mitigated. We will delve deeper into this aspect of Cardano reward-sharing in the next blog post.
Artwork, Mike Beeple
Prof Aggelos Kiayias