A gentle introduction to self-sovereign identity
In May 2017, the Indian Centre for Internet and Society think tank published a report detailing the ways in which India’s national identity database (Aadhaar) is leaking potentially compromising personal information. The information relates to over 130 million Indian nationals. The leaks create a great opportunity for financial fraud, and cause irreversible harm to the privacy of the individuals concerned.
It is clear that the central identity repository model has deficiencies. This post describes a new paradigm for managing our digital identities: self-sovereign identity.
Self-sovereign identity is the concept that people and businesses can store their own identity data on their own devices, and provide it efficiently to those who need to validate it, without relying on a central repository of identity data. It’s a digital way of doing what we do today with bits of paper. This has benefits compared with both current manual processes and central repositories such as India’s Aadhaar.
Efficient identification processes promote financial inclusion. By lowering the cost to banks of opening accounts for small businesses, financing becomes profitable for the banks and therefore accessible for the small businesses.
What are important concepts in identity?
There are three parts to identity: claims, proofs, and attestations.
An identity claim is an assertion made by the person or business:
“My name is Antony and my date of birth is 1 Jan 1901”
A proof is some form of document that provides evidence for the claim. Proofs come in all sorts of formats. Usually for individuals it’s photocopies of passports, birth certificates, and utility bills. For companies it’s a bundle of incorporation and ownership structure documents.
An attestation is when a third party validates that according to their records, the claims are true. For example a University may attest to the fact that someone studied there and earned a degree. An attestation from the right authority is more robust than a proof, which may be forged. However, attestations are a burden on the authority as the information can be sensitive. This means that the information needs to be maintained so that only specific people can access it.
What’s the identity problem?
Banks need to understand their new customers and business clients to check eligibility, and to prove to regulators that they (the banks) are not banking baddies. They also need to keep the information they have on their clients up to date.
The problems are:
- Proofs are usually unstructured data, taking the form of images and photocopies. This means that someone in the bank has to manually read and scan the documents to extract the relevant data to type into a system for storage and processing.
- When the data changes in real life (such as a change of address, or a change in a company’s ownership structure), the customer is obliged to tell the various financial service providers they have relationships with.
- Some forms of proof (eg photocopies of original documents) can be easily faked, meaning extra steps to prove authenticity need to be taken, such as having photocopies notarised, leading to extra friction and expense.
This results in expensive, time-consuming and troublesome processes that annoy everyone.
What are the technical improvements?
Whatever style of overall solution is used, the three problems outlined above need to be solved technically. A combination of standards and digital signatures works well.
The technical solution for improving on unstructured data is for data to be stored and transported in a machine-readable structured format, ie text in boxes with standard labels.
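To make “text in boxes with standard labels” concrete, here is a minimal sketch of a claim stored as labelled fields and serialised to JSON. The class name, field names and the choice of JSON are illustrative assumptions, not part of any particular identity standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class IdentityClaim:
    """One identity claim as labelled fields (labels are hypothetical)."""
    full_name: str
    date_of_birth: str  # ISO 8601 date, e.g. "1901-01-01"


def to_wire_format(claim: IdentityClaim) -> str:
    """Serialise the claim to JSON with sorted keys so the output is stable."""
    return json.dumps(asdict(claim), sort_keys=True)


def from_wire_format(text: str) -> IdentityClaim:
    """Parse JSON back into a claim; raises if a label is missing or unexpected."""
    return IdentityClaim(**json.loads(text))


claim = IdentityClaim(full_name="Antony", date_of_birth="1901-01-01")
wire = to_wire_format(claim)
print(wire)
```

Because every field has an agreed label, a receiving system can read the data directly rather than having someone retype it from a scan.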
The technical solution for managing changes in data is a common method used to update all the necessary entities. This means using APIs to connect, authenticate yourself (proving it’s your account), and update details.
The technical solution for proving authenticity of identity proofs is digitally signed attestations, possibly time-bound. A digitally signed proof is as good as an attestation because the digital signature cannot feasibly be forged without the signer’s private key. Digital signatures have two properties that make them inherently better than paper documents:
- Digital signatures become invalid if there are any changes to the signed document. In other words, they guarantee the integrity of the document.
- Digital signatures cannot be ‘lifted’ and copied from one document to another.
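The two properties above can be seen in a small sketch. It uses a Lamport one-time signature, chosen only because it can be written with Python’s standard library; this is a swapped-in teaching scheme, not what the post or any specific identity system prescribes. Production systems typically use schemes such as Ed25519 or ECDSA, and a Lamport key must only ever sign one message. All function names here are hypothetical.

```python
import hashlib
import secrets

HASH_BITS = 256  # SHA-256 digest size in bits


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Return (private_key, public_key). The key is safe for ONE signature only."""
    private_key = [(secrets.token_bytes(32), secrets.token_bytes(32))
                   for _ in range(HASH_BITS)]
    # The public key is the hash of every private value.
    public_key = [(_h(a), _h(b)) for a, b in private_key]
    return private_key, public_key


def _digest_bits(message: bytes):
    """The 256 bits of the message's SHA-256 digest, as a list of 0/1."""
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> i) & 1 for i in range(HASH_BITS)]


def sign(private_key, message: bytes):
    """Reveal one private value per digest bit, selected by that bit."""
    return [private_key[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    """Check that each revealed value hashes to the matching public value."""
    if len(signature) != HASH_BITS:
        return False
    bits = _digest_bits(message)
    return all(_h(sig) == public_key[i][bit]
               for i, (sig, bit) in enumerate(zip(signature, bits)))


private_key, public_key = generate_keypair()
document = b"Name: Antony; Date of birth: 1901-01-01"
signature = sign(private_key, document)

print(verify(public_key, document, signature))
print(verify(public_key, document + b" (edited)", signature))
```

The first check should pass and the second should fail: the signature is tied to the exact bytes of the document, so it neither survives an edit nor transfers to a different document.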
What’s the centralised solution?
A common solution for identity management is a central repository. A third party owns and controls a repository of many people’s identities. The customer enters their facts into the system, and uploads supporting evidence. Whoever needs this can access this data (with permission from the client of course), and can systematically suck this data into their own systems. If details change, the customer updates it once, and can push the change to the connected banks.
This sounds wonderful, and it certainly offers some benefits. But there are problems with this model.
What are the problems with centralised solutions?
1. Toxic data
Being in charge of this identity repository is a double-edged sword. On the one hand, an operator can make money, by charging for a convenient utility. On the other hand, this data is a liability to the operator: A central-identity system is a goldmine for hackers, and a cybersecurity headache for the operator.
If a hacker can get into the systems and copy the data, they can sell the digital identities and their documentary evidence to other baddies. These baddies can then steal the identities and commit fraud and crimes while using the names of the innocent. This can and does wreck the lives of the innocent, and creates a significant liability for the operator.
2. Jurisdictional politics
Regulators want personal data to be stored within the geographical boundaries of the jurisdiction under their control. So it can be difficult to create international identity repositories because there is always the argument about which country to warehouse the data and who can access it, from where.
3. Monopolistic tendencies
This isn’t a problem for the central repository operators, but it’s a problem for the users. If a utility operator gains enough traction, network effects lead to more users. The utility operator can become a quasi-monopoly. Operators of monopolistic functions tend to become resistant to change; they overcharge and don’t innovate due to a lack of competitive pressure. This is wonderful for the operator, but is at the expense of the users.
What’s the decentralised answer?
Is it a blockchain?
A blockchain is a type of distributed ledger where all data is replicated to all participants in real time. Should identity data be stored on a blockchain that is managed by a number of participating entities (say, the bigger banks)? No:
- Replicating all identity data to all parties breaks all kinds of regulations about keeping personal data onshore within a jurisdiction; only storing personal data that is relevant to the business; and only storing data that the customer has consented to.
- The cybersecurity risk is increased. If one central data store is difficult enough to secure, now you’re replicating this data to multiple parties, each with their own cybersecurity practices and gaps. This makes it easier for an attacker to steal the data.
What if the identity data were encrypted?
- Encrypted personal data can still fall foul of personal data regulations.
- Why would the parties (eg banks) store and manage a bunch of identity data that they can’t see or use? What’s the upside?
So what’s the answer?
The emerging answer is “self-sovereign identity”. This digital concept is very similar to the way we keep our non-digital identities today.
Today, we keep passports, birth certificates, utility bills at home under our own control, maybe in an “important drawer”, and we share them when needed. We don’t store these bits of paper with a third party. Self-sovereign identity is the digital equivalent of what we do with bits of paper now.
How would self-sovereign identity work for the user?
You would have an app on a smartphone or computer, some sort of “identity wallet” where identity data would be stored on the hard drive of your device, maybe backed up on another device or on a personal backup solution, but crucially not stored in a central repository.
Your identity wallet would start off empty, with only a self-generated identification number derived from a public key, and a corresponding private key (like a password, used to create digital signatures). This keypair differs from a username and password because it is created by the user by “rolling dice and doing some maths” rather than by requesting a username/password combination from a third party.
At this stage, no one else in the world knows about this identification number. No one issued it to you. You created it yourself. It is self-sovereign. The laws of big numbers and randomness ensure that no one else will generate the same identification number as you.
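As a rough illustration of the “laws of big numbers”, the sketch below computes an upper bound on the chance that any two identifiers collide. It assumes identifiers are 256 bits long and uniformly random, and that ten billion of them exist; both figures are assumptions for the example, not numbers from any real system.

```python
from fractions import Fraction


def collision_probability_bound(num_identifiers: int, id_bits: int) -> Fraction:
    """Upper bound on P(any two identifiers match): n(n-1)/2 pairs / 2**bits."""
    pairs = num_identifiers * (num_identifiers - 1) // 2
    return Fraction(pairs, 2 ** id_bits)


# Assumed figures: ten billion identifiers, 256 bits each.
bound = collision_probability_bound(10**10, 256)
print(f"{float(bound):.3e}")
```

Under these assumptions the bound is far smaller than one in 10^50, which is why nobody needs to coordinate with a central issuer to avoid duplicates.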
You then use this identification number, along with your identity claims, and get attestations from relevant authorities.
You can then use these attested claims as your identity information.
Claims would be stored by typing text into standardised text fields, and saving photos or scans of documents.
Proofs would be stored by saving scans or photos of proof documents. However this would be for backward compatibility, because digitally signed attestations remove the need for proofs as we know them today.
Attestations – and here’s the neat bit – would be stored in this wallet too. These would be machine readable, digitally signed pieces of information, valid within certain time windows. The relevant authority would need to sign these with digital signatures – for example, passport agencies, hospitals, driving licence authorities, police, etc.
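A machine-readable, time-bound attestation might look something like the sketch below. The field names, the JSON encoding and the example issuer are all hypothetical, and the signature is left as an opaque placeholder: the sketch only shows which bytes an issuer could sign and how a validity window could be checked.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Attestation:
    subject_id: str        # the wallet's self-generated identifier
    issuer: str            # e.g. a passport agency
    claim: dict            # the attested facts, machine readable
    valid_from: datetime
    valid_until: datetime
    signature: bytes       # produced by the issuer; opaque in this sketch

    def signed_payload(self) -> bytes:
        """Canonical bytes the issuer would sign (everything except the signature)."""
        body = {
            "subject_id": self.subject_id,
            "issuer": self.issuer,
            "claim": self.claim,
            "valid_from": self.valid_from.isoformat(),
            "valid_until": self.valid_until.isoformat(),
        }
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")

    def is_current(self, now: datetime) -> bool:
        """True if `now` is inside the window (start inclusive, end exclusive)."""
        return self.valid_from <= now < self.valid_until


attestation = Attestation(
    subject_id="example-wallet-id",
    issuer="Example Passport Agency",
    claim={"over_18": True},
    valid_from=datetime(2017, 1, 1, tzinfo=timezone.utc),
    valid_until=datetime(2018, 1, 1, tzinfo=timezone.utc),
    signature=b"",  # placeholder: no real signature is computed here
)
print(attestation.is_current(datetime(2017, 6, 1, tzinfo=timezone.utc)))
print(attestation.is_current(datetime(2018, 6, 1, tzinfo=timezone.utc)))
```

A verifier would check both things: that the issuer’s signature matches the payload, and that the current time falls inside the window.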
Need to know, but not more: Authorities could provide “bundles” of attested claims, such as “over 18”, “over 21”, “accredited investor”, “can drive cars” etc, for the user to use as they see fit. The identity owner would be able to choose which piece of information to pass to any requester. For example, if you need to prove you are over 18, you don’t need to share your date of birth, you just need a statement saying you are over 18, signed by the relevant authority.
Sharing this kind of data is safer both for the identity provider and the recipient. The provider doesn’t need to overshare, and the recipient doesn’t need to store unnecessarily sensitive data – for example, if the recipient gets hacked, they are only storing “Over 18” flags, not dates of birth.
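The “need to know, but not more” idea can be sketched as a simple filter in the wallet. The claim names and the function are hypothetical, and for brevity the wallet holds plain values here, where a real one would hold signed attestations.

```python
def select_claims(wallet: dict, requested: list, approved: set) -> dict:
    """Return only the claims that were requested AND approved by the owner."""
    return {name: wallet[name]
            for name in requested
            if name in approved and name in wallet}


wallet = {
    "date_of_birth": "1901-01-01",
    "over_18": True,
    "over_21": True,
    "can_drive_cars": True,
}

# The requester asks for two items; the owner approves only the flag.
shared = select_claims(
    wallet,
    requested=["over_18", "date_of_birth"],
    approved={"over_18"},
)
print(shared)
```

The requester learns that the owner is over 18 and nothing else, so a later breach on the requester’s side exposes a flag and not a date of birth.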
Even banks themselves could attest to the person having an account with them. We would first need to understand what liability they take on when they create these attestations. I would assume it would be no more than the liability they currently take on when they send you a bank statement, which you use as a proof of address elsewhere.
Data would be stored on the person’s device (as pieces of paper are currently stored at home today), and then when requested, the person would approve a third party to collect specific data, by tapping a notification on their device. We already have something similar: if you have ever used a service by “linking” your Facebook or LinkedIn account, the flow is much the same. The difference is that instead of going to Facebook’s servers to collect your personal data, the service requests it from your phone, and you have granular control over what data is shared.
Conclusion – and distributed ledgers
Who would orchestrate this? Well perhaps this is where a distributed ledger may come in. The software, the network, and the workflow dance would need to be built, run, and maintained. Digital signatures require public and private keys that need to be managed, and certificates need to be issued, revoked, refreshed. Identity data isn’t static, it needs to evolve, according to some business logic.
A non-blockchain distributed ledger would be an ideal platform for this. R3’s Corda (Note: I work at R3) already has many of the necessary elements – coordinated workflow, digital signatures, rules about data evolution, and a consortium of over 80 financial institutions experimenting with this exact self-sovereign identity concept.
This is an excellently written and well thought out solution to the problem of personal sovereignty in the digital age! Thank you for sharing it, Antony. Question: It is proper that authorities can revoke attestations, but shouldn’t individuals have the ability to revoke access to and use of provided data? For example: an individual contracts with another entity for a transactional service that may require access to some sensitive information, but only for and during the provision of that service. Should there be mechanisms to (a) allow access to that information in a time-bound manner, (b) revoke access to that information, and (c) enforce (a) and (b)? It seems that sovereign digital identity should treat identity information as personal property and allow management of it accordingly. Thoughts?
You asked who is working on this... well, the folks gathering at the Internet Identity Workshop have been working on the ideas underlying this for the last 12 years. http://www.internetidentityworkshop.com Our next one is in October. If you are coming all the way from India I can give you a free ticket.
There is also the work being done at Rebooting the Web of Trust – http://www.weboftrust.info – and the next one is happening in Boston. The DID https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-spring2017/blob/master/topics-and-advance-readings/did-family-of-specifications.md work in particular is relevant to what you are talking about, as is the work of the Distributed Identity Foundation http://www.identity.foundation
Great work, Antony. Do you have a solution for people who are too poor to own a device for storing their identity wallet (this applies to about 1 billion people)?
“1. Replicating all identity data to all parties breaks all kinds of regulations about keeping personal data onshore within a jurisdiction; only storing personal data that is relevant to the business; and only storing data that the customer has consented to.
2. The cybersecurity risk is increased. If one central data store is difficult enough to secure, now you’re replicating this data to multiple parties, each with their own cybersecurity practices and gaps. This makes it easier for an attacker to steal the data.”
Blockchain is a distributed ledger, but that does not mean that ledger data will be stored in nodes outside the jurisdiction. Also, if data in distributed nodes is a security flaw, I don’t see how your **distributed** ledger would be safer.
I’ve just re-read this excellent article.
But if, in principle, everyone can know my Public Key, and officials only see my Public Key, how do these officials know that anyone quoting my Public Key, and therefore claiming to be me and issuing requests/instructions to them, is really me?
On the other hand, if authorities can see my Private Key, what prevents any employee of that authority from misusing their knowledge of my Private Key?
If, as I assume is the case here, the Private Key and Public Key are dependent on each other, so that the Public Key somehow (how?!) cannot be “triggered”/released to another party without the Private Key (known only to me) being used, who “organises” that dependency, and what prevents the “organisers” from misusing their power?
At the end of describing and discussing any and every system of identification, you come back to the question: who is in charge of the system, who operates the system, and how do you prevent these individuals from misusing their knowledge? Most frauds are committed by insiders….
So who watches over? That’s always the question. Of course there needs to be oversight, but the same system of anonymisation could be used to manage that. In that way, no one is an insider.
This system can work. Making it work is, however, a substantial political enterprise; this wonderful presentation makes it clear that it is no longer a technological challenge. It is political, and I would suggest it is more important than climate, since only with the democratic freedom that this would give (and which we currently lack; imagine voting with your SSI) can we shift the inertia on climate.
But it has to be for all.
Are there any examples of non-blockchain solutions for SSI? It’s mentioned at the end of the article, but R3 Corda seems to be a blockchain company. With the carbon footprint, I don’t think blockchain is a viable solution for a global SSI network.