This is an explanation of the key function on Proof-of-Work in the Bitcoin blockchain. It focuses on the one feature of Proof-of-Work that is essential and shows that other features often talked about such as security are secondary side-effects, useful, but not essential.

This explanation rests on illustrating a few interesting properties of
how Proof-of-Work is used in the blockchain that are not immediately
obvious and sometimes are rather counter-intuitive, for example how
participants collectively solve a problem without *ever
communicating*.

Having understood each of these properties, one should conclude that Proof-of-Work is primarily a mechanism which accomplishes a distributed and decentralized system of timing, i.e. a clock.

Note that this write up isn’t about Proof-of-Work *per se*, it explains how the
blockchain takes advantage of it. If you do not know anything about Proof-of-Work,
then this link might be a
good start.

## The Decentralized Ledger Time Ordering Problem

Before describing the solution, let us focus on the problem. Much of the literature around Proof-of-Work is so confusing because it attempts to explain the solution without first identifying the problem.

Any ledger absolutely needs order. One cannot spend money that has not been received, nor can one spend money that is already spent. Blockchain transactions (or blocks containing them) must be ordered, unambiguously, and without the need for a trusted third party.

Even if the blockchain was not a ledger but just data like a log of some sort, for every node to have an identical copy of the blockchain, order is required. A blockchain in a different order is a different blockchain.

But if transactions are generated by anonymous participants all over the world, and no central party is responsible for organizing the list, how can it be done? For example transactions (or blocks) could include timestamps, but how could these timestamps be trusted?

Time is but a human concept, and any source of it, such as an atomic clock, is a “trusted third party”. Which, on top of everything, is slightly wrong most of time due to network delays as well as the effects of Relativity. Even time dilation between someone in an airplane vs the ground, though minute, is sufficient to make ordering impossible. Paradoxically, relying on a timestamp to determine event order is not possible in a decentralized geographically dispersed system.

The “time” we are interested in is not the year, month, day, etc. that we are used to. What we need is a mechanism by which we can verify that one event took place before another or perhaps concurrently.

First though, for the notions of before and after to be applicable, a
*point in time* needs to be established. Establishing a point in time
may seem theoretically impossible at first because there is no
technology accurate enough to measure a
Planck. But as you’ll
see, Bitcoin works around this by creating its own notion of time
where precise points in time are in fact possible.

This problem is well described in
Leslie Lamport’s 1978 paper
“Time, Clocks, and the Ordering of Events in a Distributed System”
which doesn’t actually provide a comprehensive solution other than
“properly synchronized physical clocks”. In 1982 Lamport also
described the “Byzantine Generals Problem”,
and Satoshi in one of his first emails explains,
how Proof-of-Work is a solution, though the Bitcoin paper states “To
implement a distributed *timestamp server* on a peer-to-peer basis, we
will need to use a proof-of-work system”, suggesting that it
primarily solves the issue of timestamping.

## Timing is the Root Problem

It must be stressed that the *impossibility of associating events with points in time*
in distributed systems was the unsolved problem that
precluded a decentralized ledger from ever being possible until
Satoshi Nakamoto invented a solution. There are many other technical details that
play into the blockchain, but timing is fundamental and paramount.
Without timing there is no blockchain.

## Proof-of-Work Recap

Very briefly, the Bitcoin Proof-of-Work is a value whose SHA-2 hash conforms to a certain requirement which makes such a value difficult to find. The difficulty is established by requiring that the hash is less than a specific number, the smaller the number, the more rare the input value and the higher the difficulty of finding it.

It is called “Proof Of Work” because it is known that a value with
such a hash is extremely rare, which means that finding such a value
requires a lot of trial and error, i.e. “work”. Work in turn implies
*time*.

By varying the requirement, we can vary the difficulty and thus the probability of such a hash being found. The Bitcoin Difficulty adjusts dynamically so that a proper hash is found on average once every ten minutes.

## Nothing Happens Between Blocks

The state of the chain is reflected by its blocks, and each new block produces a new state. The blockchain state moves forward one block at a time, and the average 10 minutes of a block is the smallest measure of blockchain time.

## SHA is Memoryless and Progress-Free

The Secure Hash Algorithm is what is known in statistics and
probability as *memoryless*.
This is a property that is particularly counter-intuitive for us humans.

The best example of memoryless-ness is a coin toss. If a coin comes up heads 10 times in a row, does it mean that the next toss is more likely to be tails? Our intuition says yes, but in reality each toss has a 50/50 chance of either outcome regardless of what happened immediately prior.

Memorylessness is required for the problem to be *progress-free*.
Progress-free means that as miners try to solve blocks iterating over
nonces, each attempt is a
stand-alone event and the probability of finding a solution is
constant at each attempt, regardless of how much work has been done in
the past. In other words at each attempt the participant is not
getting any “closer” to a solution or is making no progress. And a miner
who’s been looking for a solution for a year isn’t more likely to
solve a block at the next attempt than a miner who started a second ago.

The probability of finding the solution given a specific difficulty in
a given period of time is therefore determined *solely by the speed at
which all participants can iterate through the hashes*. Not the prior
history, not the data, just the hashrate.

The hashrate in turn is a function of the number of participants and the speed of the equipment used to calculate the hash.

(NB: Though strictly speaking SHA is not progress-free because there is a finite number of hashes, the range of a 256-bit integer is so vast that it is practically progress-free.)

## The SHA Input is Irrelevant

In the Bitcoin blockchain the input is a block header.
But if we just fed it random values, the probability of finding a
conforming hash would *still be the same*. Regardless of whether the
input is a valid block header or bytes from /dev/random, it is going
to take 10 minutes on average to find a solution.

Of course if you find a conforming hash but your input wasn’t a valid block, such a solution cannot be added to the blockchain, but it is still Proof-of-Work (albeit useless).

## The Difficulty is Intergalactic

Curiously, the difficulty is *universal*, meaning it spans the entire
universe. We could have miners on Mars helping out, they do not need
to know, or communicate with the Earth miners, the problem would still be
solved every 10 minutes. (Ok, they’ll need to somehow tell the Earth
people that they solved it if they do, or else we’ll never know about
it.)

Remarkably, the distant participants are communicating without actually communicating, because they are collectively solving the same statistical problem and yet they’re not even aware of each other’s existence.

This “universal property” while at first seemingly magical is actually easy to explain. I used the term “universal” because it describes it well in one word, but really it means “known by every participant”.

The input to SHA-256 can be thought of as an integer between 0 and
2^{256} (because the output is 32 bytes, i.e. also between 0
and 2^{256}, anything larger guarantees a collision,
i.e. becomes redundant). Even though it is extremely large
(exponentially larger
than the number of atoms in the perceivable universe), it is a set of numbers that is known by every participant
and the participants can only pick from this set.

If the input set is universally known, the function (SHA-256) is universally known, as well as the difficulty requirement is universally known, then the probability of finding a solution is also indeed “universal”.

## Trying a SHA Makes You a Participant

If the stated problem is to find a conforming hash, all you have to do
is to try it once, and bingo, you’ve affected the global hash rate,
and for that one attempt you were a participant helping others solve
the problem. You did not need to tell others that you did it (unless
you actually found a solution), others didn’t need to know about it,
but your attempt *did* affect the outcome. For the whole universe, no
less.

If the above still seems suspicious, a good analogy might be the problem of finding large prime numbers. Finding the largest prime number is hard and once one is found, it becomes “discovered” or “known”. There is an infinite number of prime numbers, but only one instance of each number in the universe. Therefore whoever attempts to find the largest prime is working on the same problem, not a separate instance of it. You do not need to tell anyone you decided to look for the largest prime, you only need to announce when you find one. If no one ever looks for the largest prime, then it is never going to be found. Thus, participation (i.e. an attempt to find one), even if it’s in total secrecy, still affects the outcome, as long as the final discovery (if found at all) is publicized.

Taking advantage of this mind-boggling probabilistic phenomenon whereby
any participation affects the outcome even if in complete secrecy and
without success, *is* what makes Satoshi’s invention so remarkably
brilliant.

It is noteworthy that since SHA is progress-free, each attempt could be thought of as a participant joining the effort and immediately leaving. Thus miners join and leave, quintillions of times per second.

## The Participation is Revealed in Statistics

The magical secret participation property also works in reverse. The global hashrate listed on many sites is known not because every miner registered at some “miners registration office” where they report their hash rate periodically. No such thing exists.

The hash rate is known because for a solution of a specific difficulty
to be found in 10 minutes, on average this many attempts
(~10^{21} as of this writing) had to have been made by someone
somewhere.

We do not know who these participants are, they never announced that they are working, those who did not find a solution (which is practically all of them) never told anyone they were working, their location could have been anywhere in the universe, and yet we know with absolute certainty that they exist. Simply because the problem continues to be solved.

## Work is a Clock

And there is the crux of it: The difficulty in finding a conforming
hash acts as *a clock*. A universal clock, if you will, because there
is only one such clock in the universe, and thus there is nothing to
sync and anyone can “look” at it.

It doesn’t matter that this clock is imprecise. What matters is that it is the same clock for everyone and that the state of the chain can be tied unambiguously to the ticks of this clock.

This clock is operated by the multi-exahash rate of an unknown number of collective participants spread across the planet, completely independent of one another.

## Last Piece of the Puzzle

The solution must be the hash of a block (the block header, to be precise). As
we mentioned, the input doesn’t matter, but if it is an actual block,
then whenever a solution is found, it happened at the tick of our
Proof-of-Work clock. Not before, not after, but *exactly at*. We know
this unambiguosly because the block was part of that mechanism.

To put it another way, if blocks weren’t the input to the SHA256 function, we’d still have a distributed clock, but we couldn’t tie blocks to the ticks of this clock. Using blocks as input addresses this issue.

Noteworthy, our Proof-of-Work clock only provides us with ticks. There is no way tell order from the ticks, this is what the hash chain is for.

## What About the Distributed Consensus?

Consensus means agreement. What all participants have no choice but to
agree on is that *the clock has ticked*. Also that everyone knows the
tick and the data attached to it. And this, in fact, does solve the
Byzantine Generals Problem, as Satoshi explained in an email
referenced earlier.

There is a separate consensus in a rare but common case of two consecutive ticks being associated with conflicting blocks. The conflict is resolved by what block will be associated with the next tick, rendering one of the disputed blocks “orphan”. How the chain will continue is a matter of chance, and so this too could probably be indirectly attributed to the Proof-of-Work clock.

## And that is it

This is what Proof-of-Work does for the blockchain. It is not a “lottery” where miners win the right to solve a block, nor is it some peculiar conversion of real energy into a valuable concept, those are all red herrings.

For example the lottery and the miner’s reward aspect is what encourages miners to participate, but it isn’t what makes the blockchain possible. Blocks hashes form a chain, but again, that has nothing to do with Proof-of-Work, it cryptographically reinforces recording of the block ordering. The hash chain also makes the previous ticks “more certain”, “less deniable” or simply more secure.

Proof-of-Work is also the mechanism by which blocks become effectively immutable, and that’s a nice side-effect which makes Segregated Witness possible, but it could just as well be done by preserving the signatures (witness), so this too is secondary.

## Conclusion

The Bitcoin blockchain Proof-of-Work is simply a distributed, decentralized clock.

If you understand this explanation, then you should have a much better grasp of how Proof-of-Work compares to Proof-of-Stake, and it should be apparent that the two are not comparable: Proof-Of-Stake is about (randomly distributed) authority, while Proof-of-Work is a clock.

In the context of the blockchain, Proof-of-Work is probably a
misnomer. The term is a legacy from the
Hashcash project, where it
indeed served to prove work. In the blockchain it is primarily about
verifiably taking time. When one sees a hash that satisfies the
difficulty, one knows it must have taken time. The method by which the
delay is accomplished is “work”, but the hash is primarily interesting
because it is a proof of *time*.

The fact that Proof-of-Work is all about time rather than work also suggests that there may be other similar statistical challenges that are time-consuming but require less energy. It may also mean that the Bitcoin hashrate is excessive and that the Bitcoin clock we described above could operate as reliably on a fraction of the hashrate, but it is the incentive structure that drives up the energy consumption.

Figuring out a way to pace ticks with less work is a trillion dollar problem, if you find one, please do let me know!

P.S. Special thanks to Sasha Trubetskoy of UChicago Statistics for the review and suggestions for the above text.