This is an explanation of the key function on Proof-of-Work in the Bitcoin blockchain. It focuses on the one feature of Proof-of-Work that is essential and shows that other features often talked about such as security are secondary side-effects, useful, but not essential.
This explanation rests on illustrating a few interesting properties of how Proof-of-Work is used in the blockchain that are not immediately obvious and sometimes are rather counter-intuitive, for example how participants collectively solve a problem without ever communicating.
Having understood each of these properties, one should conclude that Proof-of-Work is primarily a mechanism which accomplishes a distributed and decentralized system of timing, i.e. a clock.
Note that this write up isn’t about Proof-of-Work per se, it explains how the blockchain takes advantage of it. If you do not know anything about Proof-of-Work, then this link might be a good start.
The Decentralized Ledger Time Ordering Problem
Before describing the solution, let us focus on the problem. Much of the literature around Proof-of-Work is so confusing because it attempts to explain the solution without first identifying the problem.
Any ledger absolutely needs order. One cannot spend money that has not been received, nor can one spend money that is already spent. Blockchain transactions (or blocks containing them) must be ordered, unambiguously, and without the need for a trusted third party.
Even if the blockchain was not a ledger but just data like a log of some sort, for every node to have an identical copy of the blockchain, order is required. A blockchain in a different order is a different blockchain.
But if transactions are generated by anonymous participants all over the world, and no central party is responsible for organizing the list, how can it be done? For example transactions (or blocks) could include timestamps, but how could these timestamps be trusted?
Time is but a human concept, and any source of it, such as an atomic clock, is a “trusted third party”. Which, on top of everything, is slightly wrong most of time due to network delays as well as the effects of Relativity. Paradoxically, relying on a timestamp to determine event order is not possible in a decentralized system.
The “time” we are interested in is not the year, month, day, etc. that we are used to. What we need is a mechanism by which we can verify that one event took place before another or perhaps concurrently.
First though, for the notions of before and after to be applicable, a point in time needs to be established. Establishing a point in time may seem theoretically impossible at first because there is no technology accurate enough to measure a Planck. But as you’ll see, Bitcoin works around this by creating its own notion of time where precise points in time are in fact possible.
This problem is well described in Leslie Lamport’s 1978 paper “Time, Clocks, and the Ordering of Events in a Distributed System” which doesn’t actually provide a comprehensive solution other than “properly synchronized physical clocks”. In 1982 Lamport also described the “Byzantine Generals Problem”, and Satoshi in one of his first emails explains, how Proof-of-Work is a solution, though the Bitcoin paper states “To implement a distributed timestamp server on a peer-to-peer basis, we will need to use a proof-of-work system”, suggesting that it primarily solves the issue of timestamping.
Timing is the Root Problem
It must be stressed that the impossibility of associating events with points in time in distributed systems was the unsolved problem that precluded a decentralized ledger from ever being possible until Satoshi Nakamoto invented a solution. There are many other technical details that play into the blockchain, but timing is fundamental and paramount. Without timing there is no blockchain.
Very briefly, the Bitcoin Proof-of-Work is a value whose SHA-2 hash conforms to a certain requirement which makes such a value difficult to find. The difficulty is established by requiring that the hash is less than a specific number, the smaller the number, the more rare the input value and the higher the difficulty of finding it.
It is called “Proof Of Work” because it is known that a value with such a hash is extremely rare, which means that finding such a value requires a lot of trial and error, i.e. “work”. Work in turn implies time.
By varying the requirement, we can vary the difficulty and thus the probability of such a hash being found. The Bitcoin Difficulty adjusts dynamically so that a proper hash is found on average once every ten minutes.
Nothing Happens Between Blocks
The state of the chain is reflected by its blocks, and each new block produces a new state. The blockchain state moves forward one block at a time, and the average 10 minutes of a block is the smallest measure of blockchain time.
SHA is Memoryless and Progress-Free
The Secure Hash Algorithm is what is known in statistics and probability as memoryless. This is a property that is particularly counter-intuitive for us humans.
The best example of memoryless-ness is a coin toss. If a coin comes up heads 10 times in a row, does it mean that the next toss is more likely to be tails? Our intuition says yes, but in reality each toss has a 50/50 chance of either outcome regardless of what happened immediately prior.
Memorylessness is required for the problem to be progress-free. Progress-free means that as miners try to solve blocks iterating over nonces, each attempt is a stand-alone event and the probability of finding a solution is constant at each attempt, regardless of how much work has been done in the past. In other words at each attempt the participant is not getting any “closer” to a solution or is making no progress. And a miner who’s been looking for a solution for a year isn’t more likely to solve a block at the next attempt than a miner who started a second ago.
The probability of finding the solution given a specific difficulty in a given period of time is therefore determined solely by the speed at which all participants can iterate through the hashes. Not the prior history, not the data, just the hashrate.
The hashrate in turn is a function of the number of participants and the speed of the equipment used to calculate the hash.
The SHA Input is Irrelevant
In the Bitcoin blockchain the input is a block header. But if we just fed it random values, the probability of finding a conforming hash would still be the same. Regardless of whether the input is a valid block header or bytes from /dev/random, it is going to take 10 minutes on average to find a solution.
Of course if you find a conforming hash but your input wasn’t a valid block, such a solution cannot be added to the blockchain, but it is still Proof-of-Work (albeit useless).
The Difficulty is Intergalactic
Curiously, the difficulty is universal, meaning it spans the entire universe. We could have miners on Mars helping out, they do not need to know, or communicate with the Earth miners, the problem would still be solved every 10 minutes. (Ok, they’ll need to somehow tell the Earth people that they solved it if they do, or else we’ll never know about it.)
Remarkably, the distant participants are communicating without actually communicating, because they are collectively solving the same statistical problem and yet they’re not even aware of each other’s existence.
This “universal property” while at first seemingly magical is actually easy to explain. I used the term “universal” because it describes it well in one word, but really it means “known by every participant”.
The input to SHA-256 can be thought of as an integer between 0 and 2256 (because the output is 32 bytes, i.e. also between 0 and 2256, anything larger guarantees a collision, i.e. becomes redundant). Even though it is extremely large (exponentially larger than the number of atoms in the perceivable universe), it is a set of numbers that is known by every participant and the participants can only pick from this set.
If the input set is universally known, the function (SHA-256) is universally known, as well as the difficulty requirement is universally known, then the probability of finding a solution is also indeed “universal”.
Trying a SHA Makes You a Participant
If the stated problem is to find a conforming hash, all you have to do is to try it once, and bingo, you’ve affected the global hash rate, and for that one attempt you were a participant helping others solve the problem. You did not need to tell others that you did it (unless you actually found a solution), others didn’t need to know about it, but your attempt did affect the outcome. For the whole universe, no less.
If the above still seems suspicious, a good analogy might be the problem of finding large prime numbers. Finding the largest prime number is hard and once one is found, it becomes “discovered” or “known”. There is an infinite number of prime numbers, but only one instance of each number in the universe. Therefore whoever attempts to find the largest prime is working on the same problem, not a separate instance of it. You do not need to tell anyone you decided to look for the largest prime, you only need to announce when you find one. If no one ever looks for the largest prime, then it is never going to be found. Thus, participation (i.e. an attempt to find one), even if it’s in total secrecy, still affects the outcome, as long as the final discovery (if found at all) is publicized.
Taking advantage of this mind-boggling statistical phenomenon whereby any participation affects the outcome even if in complete secrecy and without success, is what makes Satoshi’s invention so remarkably brilliant.
It is noteworthy that since SHA is progress-free, each attempt could be thought of as a participant joining the effort and immediately leaving. Thus miners join and leave, quintillions of times per second.
The Participation is Revealed in Statistics
The magical secret participation property also works in reverse. The global hashrate listed on many sites is known not because every miner registered at some “miners registration office” where they report their hash rate periodically. No such thing exists.
The hash rate is known because for a solution of a specific difficulty to be found in 10 minutes, on average this many attempts (~1021 as of this writing) had to have been made by someone somewhere.
We do not know who these participants are, they never announced that they are working, those who did not find a solution (which is practically all of them) never told anyone they were working, their location could have been anywhere in the universe, and yet we know with absolute certainty that they exist. Simply because the problem continues to be solved.
Work is a Clock
And there is the crux of it: The difficulty in finding a conforming hash acts as a clock. A universal clock, if you will, because there is only one such clock in the universe, and thus there is nothing to sync and anyone can “look” at it.
It doesn’t matter that this clock is imprecise. What matters is that the this is the same clock for everyone and that the state of the chain can be tied unambiguously to the ticks of this clock.
This clock is operated by the multi-exahash rate of an unknown number of collective participants spread across the planet, completely independent of one another.
Last Piece of the Puzzle
The solution must be the hash of a block (the block header, to be precise). As we mentioned, the input doesn’t matter, but if it is an actual block, then whenever a solution is found, it happened at the tick of our Proof-of-Work clock. Not before, not after, but exactly at. We know this unambiguosly because the block was part of that mechanism.
To put it another way, if blocks weren’t the input to the SHA256 function, we’d still have a distributed clock, but we couldn’t tie blocks to the ticks of this clock. Using blocks as input addresses this issue.
Noteworthy, our Proof-of-Work clock only provides us with ticks. There is no way tell order from the ticks, this is what the Merkle tree is for.
What About the Distributed Consensus?
Consensus means agreement. What all participants have no choice but to agree on is that the clock has ticked. Also that everyone knows the tick and the data attached to it. And this, in fact, does solve the Byzantine Generals Problem, as Satoshi explained in an email referenced earlier.
There is a separate consensus in a rare but common case of two consecutive ticks being associated with conflicting blocks. The conflict is resolved by what block will be associated with the next tick, rendering one of the disputed blocks “orphan”. How the chain will continue is a matter of chance, and so this too could probably be indirectly attributed to the Proof-of-Work clock.
And that is it
This is what Proof-of-Work does for the blockchain. It is not a “lottery” where miners win the right to solve a block, nor is it some peculiar conversion of real energy into a valuable concept, those are all red herrings.
For example the lottery and the miner’s reward aspect is what encourages miners to participate, but it isn’t what makes the blockchain possible. Blocks are a Merkle tree, but again, that has nothing to do with Proof-of-Work, it cryptographically reinforces recording of the block ordering. The Merkle tree also makes the previous ticks “more certain”, “less deniable” or simply more secure.
Proof-of-Work is also the mechanism by which blocks become effectively immutable, and that’s a nice side-effect which makes Segregated Witness possible, but it could just as well be done by preserving the signatures (witness), so this too is secondary.
The Bitcoin blockchain Proof-of-Work is simply a distributed, decentralized clock.
If you understand this explanation, then you should have a much better grasp of how Proof-of-Work compares to Proof-of-Stake, and it should be apparent that the two are not comparable: Proof-Of-Stake is about (randomly distributed) authority, while Proof-of-Work is a clock.
In the context of the blockchain, Proof-of-Work is probably a misnomer. The term is a legacy from the Hashcash project, where it indeed served to prove work. In the blockchain it is primarily about verifiably taking time. When one sees a hash that satisfies the difficulty, one knows it must have taken time. The method by which the delay is accomplished is “work”, but the hash is primarily interesting because it is a proof of time.
The fact that Proof-of-Work is all about time rather than work also suggests that there may be other similar statistical challenges that are time-consuming but require less energy. It may also mean that the Bitcoin hashrate is excessive and that the Bitcoin clock we described above could operate as reliably on a fraction of the hashrate, but it is the incentive structure that drives up the energy consumption.
Figuring out a way to pace ticks with less work is a trillion dollar problem, if you find one, please do let me know!