Monday, February 18, 2019

The Problem with Avalanche (BCH & Ava)

[

update #3.  Here's my rant in a comment to their Sept 26, 2019 dev meeting

Avalanche is not a consensus mechanism for two related reasons: it does not quantify the voting population or detect network partitions. Not having Sybil or eclipse protection is not as big of a problem.   It proves consensus only among its peers without knowing what the wider network thinks, even if it has Sybil & eclipse protection.  It does not meet the "agreement" requirement mentioned in Wikipedia to be called a consensus mechanism. See Leslie Lamport's requirements for consensus and Coda Hale's "You Can’t Sacrifice Partition Tolerance" as an example of a researcher getting exasperated with people calling algorithms like Avalanche a consensus mechanism.  Nakamoto consensus was Earth-shattering in its ability to get consensus in a distributed permissionless setting with Sybil, Eclipse, and partition resolution (not just detection via slow solvetimes). VDF-POS is the only alternative (POS alone requires more excessive bandwidth as centralization & permission are increased). If you find something better like centralized staking for post-consensus, then you do not need Nakamoto consensus because you're unconsciously doing POS where Avalanche gets fast "consensus" at the cost of ruining partition detection which means you must let a real consensus mechanism override it.  I bugged deadalnix & Emin about this 9 months ago and their position is "partitions are rare". That's true, but if you get Sybil protection, you can still have an eclipse problem and more importantly it means it must not be the final say in consensus. Even if you let POW override it, when you get close to something working you'll realize a simpler semi-centralized technique using classical consensus will work better because Avalanche is only useful for quickly resolving the opinion of a large set of voters. If you have a large set, your Sybil protection is going to require a lot of communication. A Sybil solution may also contain  partition & eclipse protection, but keep in mind this conjecture: you can't carry Sybil etc protection over to the speed of Avalanche in a way that maintains the combination of protection, speed, and a level of decentralization that exceed POS + classical methods. Maybe there is a reason the Avalanche researchers want to be anonymous. They are clearly well-published, so why hide when publishing this?  If you use a smaller set of voters, I think you'll find a better solution such as semi-centralized mempools using classical consensus to prevent double spends. So merchants would trust the mempools for small-valued txns but realize they are not a guarantee like the actual blocks. If all nodes agreeing on individual txns are used, then Avalanche can be used, but it's only a suggestion for merchants and miners. To avoid full-blown POS that makes the consensus part of POW pointless, you would just not worry about Sybil protection, so POW would retain the right to over-rule the centralized mempool or node-based Avalanche.  The potential for preventing 51% attacks can only be achieved if you are basically subtly switching to a POS coin, not using the POW as consensus.  It makes no sense to keep Nakamoto consensus if you're going to overrule it with pre- and post- consensus.  You can just use POW in self-hashing txs to generate and distribute coin and just throw Nakamoto consensus out the window. VDF-POS as I've described is the only other option.

If you want fast consensus that maintains Nakamoto consensus, use a DAG.  See the issue in my github for how to do a DAG.


]

[
 Update #2. I recently learned BCH may allow past 100 block winners to be a committee to participate in Avalanche to confirm txs. This is to provide Sybil protection which was my main complaint below.  However, Avalanche's main benefit in speed with little communication by sampling only your peers, hoping they are connected to a much larger network. With only 100 blocks (and maybe only 20 actual distinct mines or pools in the committee), there seems to be little to no advantage over classical consensus methods which will have the advantage of proving consensus instead of hoping for it. Avalanche is not a consensus mechanism because it does not prove agreement between all non-faulty nodes (see Wikipedia). It can't know if a majority consensus has been reached because it does not quantify membership participation. It does not know if the network is split (as in a DoS or eclipse attack that can be combined with a double spend) with each side giving different results. All it knows is if your immediate peers agree. See this tweet thread for more of my more recent comments on Avalanche and 0-conf
https://twitter.com/zawy3/status/1174006755925417986

]



[ update #1:  I believe BCH and Ava are using Avalanche advantageously in this way: if the recipient is confident there is not a 51% attack or network partition in progress, then he can be a sure double spend will not be allowed.  But it invites 33% Sybil attacks on nodes (either locally or globally) to trick nodes into pre-approving txns that POW will have to overturn. The difference between a DAG and Avalanche is that a DAG measures network hashrate integrity by having lots of blocks solved quickly. Neither avoids POW's 51% problem.  ]

The problem with Avalanche is that it assumes a high level of node participation (the "membership" must not change too much, section 3.7).  So there's no protection against network partitions. It assumes the network remains largely intact and does not say what happens when the minority side comes to a different conclusion. There's no mechanism to tell which is the larger side. The authors said they would address this in a later paper, which is harder to do than Avalanche itself. It achieves Consistency and Availability but assumes there is no network Partition. Ava and BCH said partitions are not part of the real world, but if they were not a big issue,  Nakamoto (POW) consensus did not need inventing.

POW's magic is in selecting chain history with the least sum of partitions (via highest cumulative work) with only one voting (hashrate) member (miner) per election (block) needing to communicate that he won, and everyone immediately agreeing without even communicating an acknowledgment. The next vote begins without any other communication. The size of the voting membership (hashrate) is also determined from those single winning announcements, which set a variable for the next election to get an accurate average block time.  It's an amazing achievement. An enormous amount of communication overhead is avoided by making the voters work. No membership list is needed because POW does not prove there was no partition. It only proves the chain had the route of least partitions, assuming a 51% attack has not or will not occur.

If there is a network partition with Avalanche that coincides with conflicting spends on each side of the partition, the network is permanently forked. There's no mechanism to tell nodes which fork is correct unless it defaults back to POW. But if it defaults back to POW, the hidden chain's double-spends will overwrite the Avalanche-approved txns.  BCH said their implementation will allow miners to include txns that Avalanche has not voted on (and not required to include Avalanche-approved txns...both as a way to claim POS is not superseding POW).  This means there is no protection against a double spend because the attacker only needs to get one block on the public chain to include txns that did not receive Avalanche approval, paving the way for double-spends on the hidden chain.

I've tried to come up with ways to "repair" Avalanche with membership metrics that will enable it to detect network partitions. Fast finality, if not all basic POS, requires proof of sufficient network integrity. If centralization is to be avoided, the nodes must independently conclude the necessary percentage of voting members are known to be participating. This is not trivial. I assume Casper and Dfinity are solving this problem in complicated (suspect) ways.  I'm attempting my own design in a future post.





1 comment:

  1. Really great post.

    Do you have any more writings about avalanche consensus?

    ReplyDelete