The pros and cons of internal blockchains
I am often forwarded news articles about blockchain experiments run by banks or large companies, along with the question “Why are they using a blockchain for this internal use case?”.
Given that a blockchain is meant to replace a trusted external third party, or is meant to create trust between entities who don’t fully trust each other, an internal blockchain seems a contradiction in terms.
However, many of the publicly declared experiments, pilots and proofs of concept have focused on blockchains for internal use cases, ie a blockchain where there may be one or more nodes, but all under control of the same organisation, often within one department.
Although there has been much recent discussion about public (permissionless) vs private (permissioned) consortium blockchains, there has not been much debate on the virtues of internal blockchains.
By public (permissionless) I mean anyone can validate transactions and add blocks, and anyone can read data, eg The Bitcoin Blockchain, or Ethereum.
By private (permissioned) I mean that the entities who can add blocks are known and “allowed” by the rest of the network. These come in two broad categories: first, consortium blockchains, where the participants are a group of entities, such as an industry blockchain; second, internal blockchains, where the block adders are all under the control of one organisation.
INTERNAL BLOCKCHAIN EXPERIMENTS
The primary reasons for setting up internal blockchain experiments seem to be:
(1) Pressure and budget to do something related to blockchains
(2) The relative ease of setting something up internally vs collaborating with external organisations (often competitors)
(3) Buying some experience with the trendy technology by getting hands dirty
Here are some thoughts on the pros and cons of an internal blockchain solution vs a traditional database. I welcome comments and clarifications from technical readers such as database administrators as I am not a technical expert!
Thinking about data security
Currently read-access to non-blockchain databases tends to be recorded in log files. You can read a blockchain’s data (which is stored in a regular database) directly, without going through the node that maintains it. A blockchain database in itself doesn’t have any inbuilt mechanisms that improve on this. You would still use log files to record who is reading your blockchain, so there is little difference between a blockchain and a normal database solution.
Writing (adding) data
Non-blockchain databases commonly use username and password based authentication and user entitlements to determine who can write data, and log files to record writing new data.
Blockchains additionally commonly use digital signatures when adding data, firstly at the transaction level, and secondly at the block-adding level (for private chains).
At the transactional level, for example in a bitcoin transaction, you digitally sign a payment to prove that you are indeed the owner of the coins that you are trying to spend. Many blockchains used for transferring digital assets use this mechanism, though blockchains used for non-transactional data don’t need to use this. Node software could in theory accept any data from anyone and add it to a block, without requiring a digital signature from the originator of the data.
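To make transaction-level signing concrete, here is a minimal Python sketch of signing a payment and having someone else verify it. It uses a Lamport one-time signature purely because that scheme needs nothing beyond the standard library; Bitcoin itself uses ECDSA, and every function name and the payment message below are hypothetical, chosen only for illustration.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes):
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> (255 - i)) & 1 for i in range(256)]


def sign(private, message: bytes):
    """Reveal one secret per digest bit. A key pair must sign only ONE message."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )


private_key, public_key = generate_keypair()
payment = b"pay 5 coins from alice to bob"  # hypothetical transaction
signature = sign(private_key, payment)

signature_ok = verify(public_key, payment, signature)
tampered_ok = verify(public_key, b"pay 500 coins from alice to bob", signature)
```

Only the holder of the private key can produce a signature that verifies, and changing the payment after the fact makes verification fail. That is the property a node relies on when it checks that a spender really owns the coins.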
At the block-adding level, for private chains, one permissions mechanism is for block adders to digitally sign blocks with a signature which proves who they are, so that the other validators will accept the block. Note that in public chains, eg Bitcoin, this isn’t the case. In Bitcoin you can mine a block without explicitly stating who you are, though your IP address and your block reward bitcoin address both leak information.
Digital signatures can add an extra layer of security and non-repudiation over who wrote which changes. So a blockchain can add value in this respect.
Amending or deleting data
Non-blockchain databases commonly use username and password based authentication and user entitlements to determine who can amend data, and log files to record amend events.
Blockchains, running over several nodes, reject changes to data once written, except by consensus of a majority or all of the nodes or by abusing the “longest chain rule” whose purpose is to resolve near-simultaneous block creation.
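A rough sketch of why a change to written data is noticed: each block stores the hash of the previous block, so editing an old block breaks the links that follow it. The block layout and the function names below are assumptions made for illustration, not the format of any particular blockchain.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64


def block_hash(index, data, prev_hash):
    """Hash the block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def first_invalid_block(chain):
    """Return the index of the first block that fails validation, or None."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return block["index"]
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return block["index"]
        prev_hash = block["hash"]
    return None


chain = []
for entry in ["open account 17", "credit 100", "debit 30"]:
    append_block(chain, entry)

intact = first_invalid_block(chain)    # None: nothing has been altered

chain[1]["data"] = "credit 1000000"    # a rogue edit to historical data
tampered = first_invalid_block(chain)  # 1: the edited block no longer matches its hash
```

On a single copy, an administrator could also recompute every later hash to hide the edit. It is the other nodes, holding their own copies and rejecting the altered chain, that make the change visible.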
This means you can reduce the risk of rogue administrators changing historical data by having the blockchain run over nodes in different data centres, each with different teams of database administrators. To change historical data (assuming the private blockchain’s rules don’t allow a single node to add multiple blocks in a row), you would need a number of colluding teams to work together.
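The rule that no single node may add multiple blocks in a row can be sketched as a simple check over the sequence of block signers. The function name and the `window` parameter are hypothetical: `window=2` is the rule described above, and larger values are a generalisation added here to show how the number of teams that must collude grows.

```python
from typing import Sequence


def respects_spacing_rule(signers: Sequence[str], window: int = 2) -> bool:
    """True if no signer appears twice within any `window` consecutive blocks."""
    for i, signer in enumerate(signers):
        # The signers of the (window - 1) blocks immediately before block i.
        recent = signers[max(0, i - window + 1):i]
        if signer in recent:
            return False
    return True


alternating = respects_spacing_rule(["A", "B", "A", "B"])                  # allowed
repeated = respects_spacing_rule(["A", "A", "B"])                          # A signs twice in a row
too_soon = respects_spacing_rule(["A", "B", "A"], window=3)                # A returns too early
three_teams = respects_spacing_rule(["A", "B", "C", "A", "B"], window=3)   # allowed
```

Under such a rule, rewriting a long stretch of history needs at least `window` distinct signers working together, since any `window` consecutive blocks must all come from different nodes.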
Blockchains are more secure than traditional databases in this respect. For non-regulated entities who do not have archiving or business continuity requirements, a blockchain could be an elegant solution. Financial institutions, however, are required to perform frequent backups and retain them for long periods of time, so I can imagine that for a bank it could be relatively straightforward to do a diff between backups to detect if any historical data has changed. That said, for each database you would need to build the diff logic, whereas for blockchains you get this immutability for free.
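For comparison, the backup diff mentioned above might look something like this minimal sketch. It assumes each backup has already been loaded into a dictionary mapping primary key to row; the table contents and the function name are made up for illustration.

```python
def diff_backups(older: dict, newer: dict):
    """Compare two backups, each mapping primary key -> row (a dict).

    Returns (changed, removed). Rows that exist only in the newer backup are
    ordinary new inserts and are not reported.
    """
    changed = sorted(k for k in older if k in newer and older[k] != newer[k])
    removed = sorted(k for k in older if k not in newer)
    return changed, removed


# Hypothetical snapshots of the same table on two consecutive days.
monday = {
    1: {"account": "A", "amount": 100},
    2: {"account": "B", "amount": 250},
}
tuesday = {
    1: {"account": "A", "amount": 100},
    2: {"account": "B", "amount": 999},  # historical row was altered
    3: {"account": "C", "amount": 40},   # new row, which is expected
}

changed, removed = diff_backups(monday, tuesday)  # changed == [2], removed == []
```

The logic is short, but it has to know each table's primary key and which changes are legitimate, which is the per-database effort referred to above.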
Archiving and backups
The above notwithstanding, I can imagine in the future a blockchain could be used as an alternative to archiving. If your blockchain has a number of nodes replicating immutable data around the world, do you need to do additional periodic backups?
Adding a node and getting it to synchronise to an existing blockchain is very easy. Install software, tell it about the other computers on the blockchain, and let it start downloading blocks and validating new transactions and blocks. I can imagine that this is easier and cheaper to do than with traditional enterprise database solutions. However I could be wrong.
Data segregation across borders
Blockchains replicate data between nodes. If some of your data needs to remain onshore in specific jurisdictions (eg client data in Singapore) then you will need to find a solution that fits. It’s relatively straightforward to do; it just needs to be thought through. Something like client data onshore in a regular database, transaction data on an internal blockchain.
Are there Chinese Walls to consider? If so, a blockchain may not be a good solution given that the purpose is to replicate data. Privacy issues can be overcome with encryption, with decryption keys held only where they are needed. However, there is a tension between encrypting data and allowing nodes the visibility they need to validate it, especially for transactional data.
Think about this: You are on a blockchain network and add some data representing a transaction between parties A (you) and B (a counterparty). C is also on the blockchain. By putting the transaction on the blockchain, C can see that A is communicating with B. Also C can, in their own time, attempt to decrypt or analyse all the A <-> B messages, without penetrating the systems of A or B. Is that acceptable from a commercially sensitive perspective?
Access by third parties
I sometimes hear this justification for internal blockchains: it is easy to give access to a regulator or auditor – they can just tap into your blockchain and start from there. Yes, but is it really that much harder to give them read-access to your regular databases?
Another, probably better argument is around interoperability – if you have an internal database that is created as a blockchain (ie rows are added containing block hashes and there is some server software that can interact with the outside world in a peer-to-peer way) then it makes it easier to connect with other parties, should you want them to be able to write to this database. Care must be taken about what data and meta-data you are sharing with these external parties (see privacy section above).
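As an illustration of a database “created as a blockchain”, here is a small sketch using Python’s built-in sqlite3 module. The table layout, column names and payloads are assumptions made for this example; real node software may well store its data differently, and the peer-to-peer server part is left out entirely.

```python
import hashlib
import sqlite3

GENESIS = "0" * 64


def compute_hash(prev_hash: str, payload: str) -> str:
    """Hash the previous block's hash together with this row's payload."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE blocks (
        height     INTEGER PRIMARY KEY,
        prev_hash  TEXT NOT NULL,
        payload    TEXT NOT NULL,
        block_hash TEXT NOT NULL
    )
    """
)


def add_block(payload: str) -> None:
    """Append a row whose hash commits to the previous row's hash."""
    row = conn.execute(
        "SELECT height, block_hash FROM blocks ORDER BY height DESC LIMIT 1"
    ).fetchone()
    height, prev_hash = (row[0] + 1, row[1]) if row else (0, GENESIS)
    conn.execute(
        "INSERT INTO blocks VALUES (?, ?, ?, ?)",
        (height, prev_hash, payload, compute_hash(prev_hash, payload)),
    )


for payload in ["trade 1001", "trade 1002", "trade 1003"]:
    add_block(payload)

# Reading is an ordinary query, exactly as with any other table.
rows = conn.execute("SELECT height, payload FROM blocks ORDER BY height").fetchall()

# Each row's prev_hash should equal the previous row's block_hash.
links = conn.execute("SELECT prev_hash, block_hash FROM blocks ORDER BY height").fetchall()
links_ok = all(cur[0] == prev[1] for prev, cur in zip(links, links[1:]))
```

An external party who is later allowed to write would need to follow the same hashing rule, which is what gives both sides a common format to connect on.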
Read-speed is fast in a blockchain. Given that the replicated data is stored in a traditional database format, you can read this just as you would a normal database.
If write-speed is important, or you are dealing with large volumes of data then blockchains don’t perform as well as regular databases.
However, I often hear concerns about blockchains based on Bitcoin’s constraints. Bitcoin’s constraints don’t necessarily map to internal, private, non-proof-of-work blockchains: an internal blockchain hosted in data centres with good interconnectivity can run much faster than the 3 transactions per second of Bitcoin.
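A back-of-the-envelope calculation shows where a figure like 3 transactions per second comes from. The sketch below uses Bitcoin’s 1 MB block size limit and roughly ten-minute block interval; the 500-byte average transaction size and the 2-second interval for an internal chain are assumptions picked for illustration, not measurements.

```python
def max_tps(block_size_bytes: int, avg_tx_bytes: int, block_interval_seconds: float) -> float:
    """Upper bound on throughput: transactions per block / seconds per block."""
    transactions_per_block = block_size_bytes / avg_tx_bytes
    return transactions_per_block / block_interval_seconds


# Bitcoin-like parameters: 1 MB blocks, one block about every 600 seconds.
bitcoin_like = max_tps(1_000_000, 500, 600)

# Hypothetical internal chain: same block size, one block every 2 seconds.
internal = max_tps(1_000_000, 500, 2)
```

With these inputs the Bitcoin-like figure comes out at about 3.3 per second and the internal one at 1,000 per second. This is only a ceiling: it ignores network, validation and storage costs, so real figures would be lower.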
Net-net it is positive for technology that blockchains are being thrown at internal problems, even if initially there are no clear compelling reasons why a blockchain should be used. After all, it’s the tinkering, experimenting, creating and adapting that evolves technology to create great solutions. Given that private blockchains don’t have mining overhead and complexity, at this stage of the development of the technology if someone asks “Why a blockchain?”, a legitimate answer may actually be “Why not?”.
Please do comment with good reasons why blockchains should be used over traditional databases for internal use cases!
Nice overview Antony. Great to see some complex topical discussion simplified for a wider audience. Hope to see more coming out of your blog soon!
If you are a large institution you could potentially control processing costs by distributing work normally done in batches, as it comes in on the chain, lowering your average daily MIPS… processor time on a mainframe is very expensive.
Good argument to get off mainframes but unsure why distributed ledgers help here. Thoughts?
Processors are expensive and the decision on how many to pay for is based largely on peak processing. If you can control the transaction flow then you can lower a major expense by spreading out the work and lowering the peaks. The saving would be at least 10% and could be as high as 30% for year-end processing. An internal chain may provide that ability.
It is not viable for a large mainframe system to just move to another tech at this time: the expense of changing hundreds of thousands of lines of code prevents it. Note that some of this code is twenty years old and still performs reliably and efficiently.
If the purpose of the blockchain structure is to provide the means to securely share a database, why not? It works in real time and you can eliminate third parties from being involved.
Someone asked me today, why use a chain and not some other structure? I asked them back, what other structure is there that two major corporations would use to share their data? Isn’t one of the benefits of using a chain that the parties don’t have to trust each other?