A gentle introduction to bitcoin mining
Recently over dinner, I was asked to explain bitcoin mining, and I struggled as it is entangled with a number of other concepts. Here’s my attempt at breaking it down into bite-sized pieces.
What is bitcoin mining?
Mining is the process of writing pages (blocks) of bitcoin transactions into the bitcoin ledger, called ‘The Bitcoin Blockchain’, and getting rewarded with newly created bitcoins.
To understand this in more detail, the rest of the post describes:
- How do bitcoin transactions work?
- Why is mining needed in bitcoin?
- Why do miners mine?
- What is this ‘computationally expensive’ guessing game?
- Why pay rewards in BTC instead of USD?
- Who mines?
- What can and can’t miscreants do?
If you are new to bitcoin, it might be worth having a quick read of “A gentle introduction to bitcoin”.
How do bitcoin transactions work?
The process is:
- Make a payment (a bitcoin transaction)
- Wait for it to be mined in a block (average 10 mins)
- Wait for more blocks to be mined on top (average 10 mins per block)
1. Make a payment. When you make a bitcoin payment, the transaction message is sent to the network and passed around all the network participants (called ‘nodes’), and remains in an ‘unconfirmed’ state. This means the nodes have seen that the payment has been initiated, and they have validated it according to certain technical and business logic rules, but it isn’t yet written into anyone’s bitcoin blockchain ledger.
Unconfirmed transaction = valid, known transaction, but not yet included in the ledger.
2. Wait for it to be mined in a block (average 10 mins). Miners take the list of unconfirmed transactions (specifically, those that they know about), and they bundle them into a block, which is just a list of transactions plus some other data.
They then get to work ‘mining’ the block which means playing a guessing game to find a random number (more later).
If they guess right, then the block is published to the rest of the network. The computers on the network check that the block meets the criteria, and then either store it in their blockchains or, if it fails the checks, ignore it. The competition then starts again with the unconfirmed transactions that have accumulated since.
The network adjusts the difficulty of the guessing game to target a block being created every 10 mins or so, irrespective of the amount of computing power in the network.
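The adjustment itself can be sketched in a few lines of Python. This is a simplified illustration, not the real client code: it assumes the commonly documented rule of retargeting once every 2016 blocks and limiting any single adjustment to a factor of 4, and the function name `retarget` is made up for this example.

```python
# Simplified sketch of difficulty retargeting (not the real client code).
TARGET_BLOCK_SECONDS = 10 * 60      # aim for one block every 10 minutes
BLOCKS_PER_PERIOD = 2016            # assumed retarget interval (about two weeks)
EXPECTED_SECONDS = TARGET_BLOCK_SECONDS * BLOCKS_PER_PERIOD
MAX_ADJUSTMENT = 4                  # assumed cap on any single adjustment


def retarget(old_difficulty, actual_seconds):
    """Return the new difficulty after a period that took actual_seconds."""
    # Clamp the measured time so difficulty changes by at most a factor of 4.
    clamped = min(max(actual_seconds, EXPECTED_SECONDS / MAX_ADJUSTMENT),
                  EXPECTED_SECONDS * MAX_ADJUSTMENT)
    # Blocks arrived too quickly -> difficulty goes up, and vice versa.
    return old_difficulty * EXPECTED_SECONDS / clamped


if __name__ == "__main__":
    one_week = 7 * 24 * 60 * 60
    # Blocks came twice as fast as intended, so difficulty doubles.
    print(retarget(1000.0, one_week))
```

If more computing power joins the network, blocks arrive faster than every 10 minutes, the measured time shrinks, and the difficulty rises to compensate.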
3. Wait for more blocks to be mined on top (average 10 mins per block). The next block that is mined on top of the one with your transaction will refer to the previous block (hence, ‘blockchain’). The more blocks that have been built on top of the one with your transaction, the more ‘baked’ into the blockchain it is, and so the harder it is to unwind through block-reorganisation attacks (more later).
Unconfirmed transaction -> Confirmed transaction (1 block) -> Confirmed transaction (many blocks)
The current advice suggests that after 6 blocks, the chance of the transaction being unwound due to a competing longer chain replacing your blocks is very small. If you are receiving a payment, then the higher the value of your payment, the longer you may want to wait to reduce the chance of your payment being unwound.
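To put rough numbers on ‘very small’, here is a sketch of the calculation from Satoshi Nakamoto’s bitcoin white paper, which estimates the chance that an attacker holding a fraction q of the hashing power ever overtakes the honest chain once z blocks have been mined on top of a transaction. The function name is my own, and the model is the paper’s simplified one, so treat the output as illustrative rather than as a guarantee.

```python
import math


def attacker_success_probability(q, z):
    """Chance an attacker with hash-power share q ever catches up from z blocks behind."""
    p = 1.0 - q                      # honest share of the hashing power
    if q >= p:
        return 1.0                   # a majority attacker always catches up eventually
    lam = z * (q / p)                # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


if __name__ == "__main__":
    # Hypothetical attacker with 10% of the network's hashing power.
    for z in (0, 1, 2, 6):
        print(z, attacker_success_probability(0.10, z))
```

The probability falls off quickly as z grows, which is the reasoning behind waiting for several blocks before treating a large payment as final.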
Why is mining needed in bitcoin?
There are two parts to this. First, you need a way to get transactions into the ledger. Second, you need a way to make it expensive for miscreants to add dishonest blocks.
Ledger addition. Transactions are added to the ledger in blocks so as to create some sort of time order to the transactions. In bitcoin you can’t trust the timestamp of any particular participant, and there is no ‘master clock’ to trust, so block order is the equivalent of time order.
Financial deterrent. This is about the guessing game, called “Proof of work”. You don’t actually need the guessing game to add blocks to a blockchain. However, the guessing game makes it computationally expensive (therefore financially expensive) to add blocks. This cost acts as a deterrent to miscreants who would otherwise want to add their dishonest blocks. So long as most of the network is ‘honest’, then the dishonest parties will have a tough time creating rogue blocks.
“Why proof of work?”, in three acts:
- Anyone can create blocks on an “open” network.
- As you can’t trust anyone specifically, each individual node has to assume that the ‘majority’ of the rest of the network is right.
- So to dominate the network, you just need to create many aliases who are all under your control and all agree with each other. This kind of domination-by-numbers is called a ‘Sybil attack’.
- It is cheap and easy to spawn validators who all agree with each other.
- Therefore it is very cheap to bully the network.
- So for a network to be secure against this, you need to have a more expensive way to bully the network.
- Computational power is more expensive and requires investment and upkeep.
- Therefore use majority-by-computational-power instead of majority-by-numbers.
- So miscreants will need to spend a lot more money to dominate the network.
- A challenge that is computationally expensive for the sake of it is called a “Proof of work” challenge.
Why do miners mine?
Mining reward = Voluntary transaction fees + Block reward (currently 25 BTC per block)
When you mine a block, you get to collect any voluntary transaction fees from the transactions you have included. You also get to write one transaction paying yourself some BTC (currently 25 BTC, and reducing to 12.5 BTC in the middle of 2016). This is called a ‘block reward’ or ‘coinbase transaction’ (not to be confused with the American company called “Coinbase” which operates under a UK legal entity “Coinbase UK, Ltd”).
This is the ‘minting process’ i.e. how bitcoins are created. The reward decreases with time, and in theory, transaction fees will replace the block reward.
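The shrinking block reward follows a fixed schedule. The sketch below assumes the widely documented rule that the reward started at 50 BTC and halves every 210,000 blocks; the function and constant names are illustrative and not taken from any particular client.

```python
COIN = 100_000_000            # satoshis (the smallest unit) per bitcoin
INITIAL_SUBSIDY = 50 * COIN   # assumed reward for the earliest blocks
HALVING_INTERVAL = 210_000    # assumed number of blocks between halvings


def block_subsidy(height):
    """Return the block reward, in satoshis, for a block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0                         # the reward has shrunk to nothing
    return INITIAL_SUBSIDY >> halvings   # halve once per completed interval


if __name__ == "__main__":
    for height in (0, 372_910, 420_000):
        print(height, block_subsidy(height) / COIN, "BTC")
```

Working in whole satoshis avoids rounding problems once the reward drops below one bitcoin.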
Transaction fees are not mandatory (hence the “bitcoin transactions are free” mantra) but miners will seek out transactions containing fees, and preferentially add them to blocks that they are creating. If there are more unconfirmed transactions than can fit in a block, rational miners will mine the ones with the highest transaction fees first.
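A rational miner’s choice of transactions might look something like the sketch below. It is a simplification under stated assumptions: transactions are plain dictionaries with made-up `fee` and `size` fields, the block limit is taken to be 1,000,000 bytes, and transactions are ranked by fee per byte rather than by absolute fee, since space in the block is the scarce resource.

```python
MAX_BLOCK_BYTES = 1_000_000   # assumed block size limit


def select_transactions(mempool, max_bytes=MAX_BLOCK_BYTES):
    """Greedily pick transactions with the highest fee per byte until the block is full."""
    chosen, used = [], 0
    by_fee_rate = sorted(mempool, key=lambda tx: tx["fee"] / tx["size"], reverse=True)
    for tx in by_fee_rate:
        if used + tx["size"] <= max_bytes:   # skip anything that no longer fits
            chosen.append(tx)
            used += tx["size"]
    return chosen


if __name__ == "__main__":
    # Hypothetical unconfirmed transactions (fees in satoshis, sizes in bytes).
    mempool = [
        {"id": "a", "fee": 10_000, "size": 250},
        {"id": "b", "fee": 0, "size": 250},
        {"id": "c", "fee": 50_000, "size": 500},
    ]
    picked = select_transactions(mempool, max_bytes=750)
    print([tx["id"] for tx in picked], sum(tx["fee"] for tx in picked))
```

In this toy example the zero-fee transaction is the one left waiting, which is exactly what happens to free transactions when blocks are full.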
What is this ‘computationally expensive’ guessing game?
Miners spend a lot of computing power trying to guess a number, which when added to a block and put through an algorithm, outputs a ‘hash’ that meets certain criteria.
A hash is a fingerprint of data. It’s easy to make a hash from some data but computationally infeasible to recreate the data from the hash. Hashes look random compared with the data put in.
You can play with hashing here: Go to http://www.xorbin.com/tools/sha256-hash-calculator and type some data into the big box. You’ll see the hash in the smaller box. I typed “What does the hash of this look like?”:
If you change just one part of the data, the hash looks entirely different. I added a question mark:
By changing the data slightly, try to find a hash starting with 0000000. Tricky eh?
By adding “-17” to the sentence, I found something that gave a hash starting with one zero:
What does the hash of this look like?-17 = 0fd82107e6e73b6f369853da3b53d4a93e8be1e5b3a4dd7da2b4ea644774bc80
I kept going, and to find something that gave a hash starting with a double zero, it took 272 attempts:
What does the hash of this look like?-272 = 00629a604a7ec6b1f05e7703c57197ed6119a6282e9b5f750e14a1500578d3fd
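If you would rather let a computer do the guessing, the same game can be sketched in Python using the standard `hashlib` module. The helper name `find_suffix` is my own. The attempt numbers you get may differ from the ones above if the sentence is not typed in exactly the same way (for example, straight versus curly quotes or a trailing space).

```python
import hashlib


def find_suffix(text, prefix, max_attempts=1_000_000):
    """Try text-1, text-2, ... until the SHA-256 hash starts with prefix."""
    for attempt in range(1, max_attempts + 1):
        candidate = f"{text}-{attempt}"
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return attempt, candidate, digest
    return None   # gave up without finding a match


if __name__ == "__main__":
    sentence = "What does the hash of this look like?"
    for zeros in ("0", "00", "000"):
        print(find_suffix(sentence, zeros))
```

Each extra zero makes the search about 16 times longer on average, because each hexadecimal character has 16 possible values.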
Bitcoin block mining. Bitcoin mining is essentially the same game, where you tweak the input data (the block header) so that you get an output hash that matches what is required by the network at that point in time.
A recent bitcoin block #372910 was ‘solved’ because the hash was 000000000000000000b037a61e47df14b035199b5a2d464691b9456394bc07da – this had enough zeroes to satisfy the network at this time*.
* More accurately (for pedants) the block header containing the nonce is hashed twice using the SHA-256 hashing algorithm, and the resulting hash, read as a number, had to be smaller than the target number determined by the network difficulty of 54,256,630,327.89 (at block #372910).
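For the pedants, here is a rough sketch of that check in Python. It assumes the usual convention that the target for difficulty 1 is 0xFFFF × 2^208 and that the target shrinks in proportion to the difficulty. The function names are made up, the conversion uses floating point so it is only approximate, and the all-zero header is a placeholder rather than a real block header.

```python
import hashlib

# Target for difficulty 1 (assumed constant from the bitcoin protocol).
MAX_TARGET = 0xFFFF * 2 ** 208


def double_sha256(data):
    """Hash the data with SHA-256, then hash the result again."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def target_from_difficulty(difficulty):
    """Approximate target: higher difficulty means a smaller target."""
    return int(MAX_TARGET / difficulty)


def meets_target(block_hash_hex, difficulty):
    """Check whether a block hash, read as a number, is below the target."""
    return int(block_hash_hex, 16) < target_from_difficulty(difficulty)


if __name__ == "__main__":
    block_hash = "000000000000000000b037a61e47df14b035199b5a2d464691b9456394bc07da"
    print(meets_target(block_hash, 54_256_630_327.89))
    # Placeholder 80-byte header (all zeros), only to show the hashing step.
    # Block hashes are conventionally displayed with the bytes reversed.
    print(double_sha256(bytes(80))[::-1].hex())
```

‘Enough zeroes’ is therefore shorthand: what matters is that the hash is a small enough number, and small numbers happen to start with many zeroes.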
Further fun. If you are up for some light programming, there is an excellent guide to playing the guessing game in Python on Alex Gorale’s blog.
Why pay rewards in BTC instead of USD?
Satoshi Nakamoto, the proposer of bitcoin, recognised that if you want lots of people to spend hardware and energy creating this network, you need to incentivise them: i.e. you need to pay them. The white paper is here, and well worth a read.
How do you pay anonymous participants, without creating some sort of power structure? Any source of funding provided by some entity (e.g. if a company or government paid miners) would give that entity censorship rights and some control over who mined, and what gets mined.
Satoshi realised that an intrinsic source of funding, where a payment is paid by the system rather than by any external party, would be the answer. This is why miners are paid by the system, in tokens which have a value that is related to the size and security of the system. Theoretically, the more valuable the tokens become, the more money can be spent mining, leading to an increase in security and an increase in the value of the network.
Who mines?
Anyone can “participate” in the mining activity. You just need to download some software and run it. Your computer will then start taking transactions that it receives through the bitcoin network, and it will bundle them into blocks, and start mining the block.
Your chance of mining a block is somewhat proportional to the amount of computing power you throw at it, because mining is a guessing game, and faster computers guess more quickly. It is also related to how fast your internet connection is, because once you have created a valid block, you want to make sure that everyone else incorporates it before someone else with a faster internet connection mines his own block and distributes his block more quickly.
In practice, successful miners form groups, or pools, and combine their processing power. If they win a block, the reward gets shared between participants. This is similar to forming a lottery syndicate: you win less each time, but more often, so your income becomes less lumpy.
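Some back-of-the-envelope arithmetic shows why pooling is attractive. The sketch below assumes that your chance of winning any given block equals your share of the network’s hashing power and that about 144 blocks are mined per day (one every 10 minutes). The shares used are hypothetical.

```python
BLOCKS_PER_DAY = 24 * 6   # one block roughly every 10 minutes


def expected_days_per_block(share):
    """Average number of days between blocks for a miner with this share of hash power."""
    return 1.0 / (share * BLOCKS_PER_DAY)


def chance_of_block_within(share, days):
    """Probability of winning at least one block in the given number of days."""
    blocks = round(days * BLOCKS_PER_DAY)
    return 1.0 - (1.0 - share) ** blocks


if __name__ == "__main__":
    solo = 0.00001    # hypothetical solo miner: 0.001% of network hash power
    pool = 0.10       # hypothetical pool: 10% of network hash power
    print(expected_days_per_block(solo))      # close to two years between blocks
    print(expected_days_per_block(pool))      # a small fraction of a day
    print(chance_of_block_within(solo, 30))   # solo miner's chance in a month
```

The solo miner’s expected income is the same either way, but alone they might wait years for a single payout, whereas a share of a pool’s frequent wins arrives steadily.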
Currently, the top 10 mining pools consistently create about 90% of the blocks, and China-based pools create more than 60% of the blocks. Pools are generally controlled by the “pool operator” which is a person or a few people. So despite the rhetoric of bitcoin being decentralised, it is controlled by a handful of people in China. See this Financial Times article for further reading: Bitcoin OPEC
The decentralisation of bitcoin, although romantic in theory, doesn’t seem to be working properly in practice.
A very brief history of mining
In 2009, people could at first mine successfully on their laptops and home computers, using the CPU (Central Processing Unit) to do the calculations. There seemed to be a gentleman’s agreement not to use more powerful GPUs (graphics cards, the chips that make screens work) that were more efficient and faster at running this specific calculation, but harder to set up. However, that gentleman’s agreement seems to have broken down, and GPU mining made CPU mining obsolete and drove a large increase in mining difficulty between 2010 and 2012.
Then as the price of bitcoin, and so the value of the reward, increased, people started investing in mining equipment, and began manufacturing chips called ASICs (Application-Specific Integrated Circuits) that were good for nothing except hashing / mining (so take popular comparisons with the world’s supercomputers with a pinch of salt). This was the next revolution in hashing power, starting in 2013.
I recommend this article which describes the history of mining better than I can: A guide to bitcoin mining by VICE Motherboard.
What can and can’t miscreants do?
A dishonest miner can:
- Refuse to relay valid transactions to other nodes.
- Attempt to create blocks that include or exclude specific transactions of his choosing.
- Attempt to create a ‘longer chain’ of blocks that make previously accepted blocks become ‘orphans’ and not part of the main chain.
A dishonest miner can’t:
- Create bitcoins out of thin air.*
- Steal bitcoins from your account.
- Make payments on your behalf or pretend to be you.
That’s a relief.
*Well, he can, but only his version of the ledger will have these transactions. Other nodes will reject them, which is why it is important to confirm a transaction across a number of nodes.
With transactions, the effect a dishonest miner can have is very limited. If the rest of the network is honest, they will reject any invalid transactions coming from the baddie, and they will hear about valid transactions from other honest nodes, even if the miscreant is refusing to pass them on.
With blocks, if the miscreant has sufficient block creation power (and this is what it all hinges on), he can delay your transaction by refusing to include it in his blocks. However, your transaction will still be known by other honest nodes as an ‘unconfirmed transaction’, and it will eventually be included in one of their blocks.
Worse though, is if the miscreant can create a longer chain of blocks than the rest of the network, and can invoke the “longest chain rule” to kick out the shorter chains. This lets him unwind a transaction. Here’s how:
- Create two payments with the same bitcoins: one to an online retailer, the other to yourself (another address you control).
- Only broadcast the payment to the retailer.
- When the payment gets added in an honest block, the retailer sends you goods.
- Secretly create a longer chain of blocks which swaps out the payment to the retailer, and swaps in the payment to yourself.
- Publish the longer chain. If the other nodes are playing by the “longest chain rule”, then they will ignore the honest block with the retailer payment, and continue to build on your longer chain. The honest block is said to be ‘orphaned’ and does not exist to all intents and purposes.
- The original payment to the retailer will be deemed invalid by the honest nodes because those bitcoins have already been spent (in your longer chain).
This is called a “double spend” because the same bitcoins were spent twice – but the second one was the one that became part of the eventual blockchain.
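The steps above can be mimicked with a toy model. In this sketch a chain is just a list of blocks, a block is a list of (coin, recipient) payments, and nodes simply prefer the chain with the most blocks. Real nodes compare the total proof of work behind each chain rather than the raw block count, and all the names here are invented for illustration.

```python
def best_chain(chains):
    """Pick the longest chain (a simplification of the 'longest chain rule')."""
    return max(chains, key=len)


def confirmed_payments(chain):
    """Replay a chain, accepting only the first spend of each coin."""
    spent, accepted = set(), []
    for block in chain:
        for coin, recipient in block:
            if coin not in spent:          # later spends of the same coin are invalid
                spent.add(coin)
                accepted.append((coin, recipient))
    return accepted


if __name__ == "__main__":
    shared = [[("coin-0", "alice")]]                       # history both sides agree on
    honest = shared + [[("coin-1", "retailer")]]           # block the retailer relied on
    secret = shared + [[("coin-1", "attacker")], [], []]   # longer chain mined in private
    print(confirmed_payments(best_chain([honest])))          # before the secret chain appears
    print(confirmed_payments(best_chain([honest, secret])))  # after it is published
```

Once the longer chain is published, the payment to the retailer no longer appears in the winning history, even though the retailer saw it confirmed in a block.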
To conclude, bitcoin mining is the theoretically decentralised process where anyone can add a block of transactions to the bitcoin blockchain, without needing permission from any authority, and get paid in bitcoins for it. It is made deliberately difficult, using proof of work as a defence against Sybil attacks. The mining difficulty increases with the network hashing power, so the more processing power the network has as a whole, the more power someone needs to assert control over it.
It works well until any entity or coordinated group controls too much of the hashing power, at which point they can control various aspects of the system. Currently 90% of blocks are mined by known ‘pools’ or syndicates of miners, and if a few pools join together, they could effect changes and assert control over the network.