If you’ve been reading my previous posts, you already know that blockchains are simply a new type of database. That is, they are databases that can be shared directly, for writing as well as reading, by a group of mutually untrusting parties, without the need for a central administrator. This contrasts with traditional databases (SQL or NoSQL) that are controlled by a single entity, even if some type of distributed architecture is used within its walls.
I recently gave a talk on blockchains from an information security perspective, in which I concluded that blockchains are more secure than regular databases in some ways and less secure in others. Considering the role that centralized databases play in today’s technology stack, this made me think more broadly about the trade-offs between these two technologies. In fact, whenever someone asks me if MultiChain can be used for a particular purpose, my first response is always: “Could you do that with a regular database?” In more cases than you think, the answer is yes, for the following simple reason:
If trust and robustness are not an issue, there is nothing a blockchain can do that a regular database can’t do.
This is a key point on which there is a lot of misunderstanding. In terms of the types of data that can be stored and the transactions that can be performed on that data, blockchains do nothing new. And to be clear, this observation extends to “smart contracts” as well, despite their sexy name and image. A smart contract is nothing more than a piece of computer code that runs on each node of a blockchain. A decades-old technology called stored procedures does the same for centralized databases. (You also can’t use a blockchain if this code needs to initiate interactions with the outside world.)
The truth about blockchains is that while they have some advantages, they also have their disadvantages. In other words, like most technology decisions, the choice between a blockchain and a regular database comes down to a series of trade-offs. If you are blinded by the hype and deafened by the noise, you are unlikely to make that decision objectively. So I hope the following guide can help.
Disintermediation: advantages of blockchains
The core value of a blockchain is to allow a database to be shared directly across trust boundaries, without the need for a central administrator. This is possible because blockchain transactions contain their own proof of validity and their own proof of authorization, rather than requiring some centralized application logic to enforce those restrictions. Transactions can therefore be verified and processed independently by multiple “nodes”, and the blockchain acts as a consensus mechanism to ensure those nodes remain synchronized.
Why is this disintermediation valuable? Because, although a database is only bits and bytes, it is also something tangible. The contents of a database are stored in the memory and disk of a specific computer system, and anyone with sufficient access to that system can destroy or corrupt the data. Therefore, the moment you entrust your data to a conventional database, you also become dependent on the human organization in which said database resides.
Now, the world is full of organizations that have earned this trust: governments and banks (mostly), universities, trade associations, and even private companies like Google and Facebook. In most cases, especially in the developed world, these work extremely well. I believe my vote has always been counted, no bank has ever stolen my money and I have yet to find a way to pay for better grades. So what’s the problem? If an organization controls an important database, it also needs a group of people and processes to prevent that database from being manipulated. People need to be hired, processes need to be designed, and all of this takes a lot of time and money.
So blockchains offer a way to replace these organizations with a distributed database, locked down by clever cryptography. Like so much that has come before, they take advantage of the ever-increasing capabilities of computer systems to provide a new way of replacing humans with code. And once it has been written and debugged, code tends to be much cheaper.
Confidentiality: advantage of centralized databases
As I mentioned, each node in a blockchain verifies and processes each transaction independently. A node can do this because it has complete visibility of: (a) the current state of the database, (b) the modification requested by a transaction, and (c) a digital signature that proves the origin of the transaction. This is certainly a clever new way to design a database, and it really works. So where’s the catch? For many applications, especially financial ones, the complete transparency enjoyed by all nodes is an absolute deal-breaker.
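To make points (a) to (c) concrete, here is a minimal sketch of the check a single node might run. Everything in it is an assumption made for illustration: the names (`node_accepts`, `apply_tx`, `make_tx`) are hypothetical, the "database" is just a dictionary of balances, and a toy Lamport one-time signature built from SHA-256 stands in for a real scheme such as ECDSA so that the example needs only the standard library. Replay protection is deliberately left out.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Toy Lamport one-time signature: an illustrative stand-in for ECDSA.
# Each key pair must sign only ONE message; do not use this in production.
def keygen():
    secret = [[secrets.token_bytes(32), secrets.token_bytes(32)] for _ in range(256)]
    public = [[h(a), h(b)] for a, b in secret]
    return secret, public

def message_bits(message: bytes):
    digest = h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(secret, message: bytes):
    # Reveal one of the two secret values per bit of the message hash.
    return [secret[i][bit] for i, bit in enumerate(message_bits(message))]

def verify(public, message: bytes, signature) -> bool:
    bits = message_bits(message)
    return len(signature) == 256 and all(
        h(signature[i]) == public[i][bits[i]] for i in range(256)
    )

def address_of(public) -> str:
    # Hypothetical address format: truncated hash of the public key.
    return h(b"".join(half for pair in public for half in pair)).hex()[:16]

def payload(sender: str, recipient: str, amount: int) -> bytes:
    return f"{sender}|{recipient}|{amount}".encode()

def make_tx(secret, public, recipient: str, amount: int) -> dict:
    sender = address_of(public)
    return {
        "public": public,
        "sender": sender,
        "recipient": recipient,
        "amount": amount,
        "signature": sign(secret, payload(sender, recipient, amount)),
    }

def node_accepts(state: dict, tx: dict) -> bool:
    # (c) Proof of authorization: the signature must match the sender's key.
    if address_of(tx["public"]) != tx["sender"]:
        return False
    message = payload(tx["sender"], tx["recipient"], tx["amount"])
    if not verify(tx["public"], message, tx["signature"]):
        return False
    # (a) + (b) Proof of validity: the requested change must fit the current state.
    return tx["amount"] > 0 and state.get(tx["sender"], 0) >= tx["amount"]

def apply_tx(state: dict, tx: dict) -> bool:
    if not node_accepts(state, tx):
        return False
    state[tx["sender"]] -= tx["amount"]
    state[tx["recipient"]] = state.get(tx["recipient"], 0) + tx["amount"]
    return True
```

Note that `node_accepts` needs the full `state` and the full transaction to reach its verdict. Two nodes that start from the same state and see the same transactions will reach the same result without consulting anyone, which is exactly why every node must be able to see everything.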
How do traditional database-based systems avoid this problem? Like blockchains, they restrict the transactions that certain users can make, but these restrictions are enforced from a central location. This way, the entire contents of the database only need to be visible in that location, rather than across multiple nodes. Data read requests also pass through this central authority, which can accept or reject them as it sees fit. In other words, a traditional database can control both reads and writes, whereas a blockchain can only control writes.
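As a rough illustration of central enforcement, the sketch below puts both kinds of control in one place. The `CentralLedger` class and its methods are hypothetical, an in-memory SQLite database stands in for any conventional database, and the `requester` argument is assumed to have been authenticated already by the connection layer.

```python
import sqlite3

class CentralLedger:
    """Hypothetical central authority holding the only full copy of the data."""

    def __init__(self, balances: dict):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        self.db.executemany("INSERT INTO accounts VALUES (?, ?)", balances.items())
        self.db.commit()

    def _balance(self, owner: str):
        row = self.db.execute(
            "SELECT balance FROM accounts WHERE owner = ?", (owner,)
        ).fetchone()
        return row[0] if row else None

    def read_balance(self, requester: str, owner: str) -> int:
        # Read control: the authority decides who may see which rows.
        if requester != owner:
            raise PermissionError("requester may only read their own balance")
        return self._balance(owner)

    def transfer(self, requester: str, recipient: str, amount: int) -> None:
        # Write control: the requester can only spend funds they own.
        if amount <= 0 or self._balance(recipient) is None:
            raise ValueError("invalid amount or unknown recipient")
        with self.db:  # commits on success, rolls back if an exception is raised
            cur = self.db.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE owner = ? AND balance >= ?",
                (amount, requester, amount),
            )
            if cur.rowcount != 1:
                raise ValueError("insufficient funds or unknown account")
            self.db.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, recipient),
            )
```

With this design, `ledger.read_balance("bob", "alice")` is refused even though the data sits in the same table as Bob's own row. A blockchain node has no equivalent of that refusal, because it must see every row in order to verify transactions for itself.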
To be fair, there are many strategies available to mitigate this problem. These range from simple ideas such as conducting transactions under multiple blockchain addresses, to advanced cryptographic techniques such as confidential transactions and zero-knowledge proofs (now in development). However, the more information you want to hide in a blockchain, the greater the computational burden you will pay to generate and verify transactions. And no matter how these techniques develop, they will never surpass the simple and direct method of hiding data completely.
Robustness: blockchain advantages
A second benefit of blockchain-powered databases is extreme fault tolerance, which stems from their built-in redundancy. Each node processes each transaction, so no individual node is crucial to the database as a whole. Likewise, nodes are densely connected to each other on a peer-to-peer basis, so many communication links can fail before everything stops. The blockchain ensures that nodes which went down can always catch up on the transactions they missed.
While it is true that regular databases offer many techniques for replication, blockchains take this to a whole new level. To get started, there is no setup required – just connect a few blockchain nodes and they will sync automatically. Additionally, nodes can be freely added or removed from a network, without any preparation or consequences. Finally, external users can send their transactions to any node, or multiple nodes simultaneously, and these transactions automatically and seamlessly propagate to everyone else.
This robustness transforms the economics of database availability. With regular databases, high availability is achieved through a combination of expensive infrastructure and disaster recovery. A primary database runs on high-end hardware that is closely monitored for problems, with transactions replicated to a backup system in a different physical location. If the primary database fails (for example, due to a power outage or catastrophic hardware failure), activity is automatically moved to the backup, which becomes the new primary database. Once the failed system is repaired, it is resynchronized so that it can act as the new backup when needed. While all of this is doable, it is expensive and notoriously difficult to do well.
Instead, what if we had 10 blockchain nodes running in different parts of the world, all on commodity hardware? These nodes would be densely connected to each other, sharing transactions between peers and using a blockchain to ensure consensus. The end users generating the transactions connect to (say) 5 of these nodes, so it doesn’t matter if some communication links fail. And if one or two nodes fail completely on any given day, no one notices, because there are still more than enough copies to go around. As it happens, this combination of low-cost, high-redundancy systems is exactly how Google built its search engine so cheaply. Blockchains can do the same for databases.
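A back-of-the-envelope calculation shows why this works. The sketch below assumes that nodes fail independently and that each one is down 5% of the time; both are illustrative assumptions rather than measurements, and the function names are made up for this example.

```python
from math import comb

def prob_all_down(p_down: float, connected: int) -> float:
    """Chance that every node a user is connected to is down at once,
    assuming independent failures."""
    return p_down ** connected

def prob_at_least_up(total: int, needed: int, p_down: float) -> float:
    """Chance that at least `needed` of `total` nodes are up (binomial model)."""
    p_up = 1 - p_down
    return sum(
        comb(total, up) * p_up ** up * p_down ** (total - up)
        for up in range(needed, total + 1)
    )

if __name__ == "__main__":
    # A user connected to 5 nodes, each down 5% of the time.
    print(prob_all_down(0.05, 5))
    # At least 8 of the 10 nodes being up at any given moment.
    print(prob_at_least_up(10, 8, 0.05))
```

Under these assumptions, a user connected to 5 nodes is cut off roughly 3 times in 10 million, even though each individual node is fairly unreliable. Real failures are often correlated (a shared network provider, a bad software release), so this figure should be read as an optimistic bound.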
Performance: advantage of centralized databases
Blockchains will always be slower than centralized databases. This is not merely because today’s blockchains are new and unoptimized; it is a consequence of their very nature. When processing transactions, a blockchain must do everything a conventional database does, but it carries three additional burdens:
- Signature verification. Every blockchain transaction must be digitally signed using a public-key cryptography scheme such as ECDSA. This is necessary because transactions propagate between nodes on a peer-to-peer basis, so their origin cannot be proven any other way. The generation and verification of these signatures is computationally complex and constitutes the main bottleneck in products like ours. In contrast, in centralized databases, once a connection has been established, there is no need to individually verify each request that comes through it.
- Consensus mechanisms. In a distributed database like a blockchain, effort is needed to ensure that nodes in the network reach consensus. Depending on the consensus mechanism used, this could involve significant two-way communication or the management of forks and their subsequent rollbacks. While centralized databases must also deal with conflicting and aborted transactions, these problems are much less common when transactions are queued and processed in a single location.
- Redundancy. This does not refer to the performance of an individual node, but to the total amount of computing a blockchain requires. While centralized databases process transactions once or twice, in a blockchain each node in the network must process them independently. Therefore, much more work is done to obtain the same result.
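To give a feel for the fork handling mentioned under consensus mechanisms, here is a minimal sketch of a rollback. It assumes a simple "longest chain wins" rule, which is only one of several possible mechanisms, and it skips all validation of the incoming blocks. The names (`reorganize`, `apply_block`, `undo_block`) and the representation of a block as a tuple of `(sender, recipient, amount)` transfers are invented for this example.

```python
def apply_block(state: dict, block) -> None:
    for sender, recipient, amount in block:
        state[sender] = state.get(sender, 0) - amount
        state[recipient] = state.get(recipient, 0) + amount

def undo_block(state: dict, block) -> None:
    # Reverse the transfers in the opposite order to which they were applied.
    for sender, recipient, amount in reversed(block):
        state[sender] = state.get(sender, 0) + amount
        state[recipient] = state.get(recipient, 0) - amount

def common_prefix_len(a, b) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def reorganize(chain: list, state: dict, candidate: list):
    """Switch to `candidate` if it is strictly longer than `chain`.

    Returns the chain now in force and the number of blocks rolled back.
    `state` is updated in place.
    """
    if len(candidate) <= len(chain):
        return chain, 0
    fork = common_prefix_len(chain, candidate)
    # Roll back our blocks after the fork point, newest first...
    for block in reversed(chain[fork:]):
        undo_block(state, block)
    # ...then replay the competing branch on top of the shared history.
    for block in candidate[fork:]:
        apply_block(state, block)
    return list(candidate), len(chain) - fork
```

Every rolled-back block is work the node did and then threw away. A centralized database that queues transactions in one place never has to pay this cost, which is part of why the performance gap is structural rather than a matter of optimization.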
The final result
Of course, there are other ways to compare blockchains to traditional databases. We could talk about the maturity of the source code, the attractiveness for developers, the breadth of the ecosystem, etc. However, none of these aspects are inherent to the technology itself. Therefore, when making a long-term decision about using a blockchain, the key question is: What is most important for my use case? Disintermediation and robustness? Or confidentiality and performance?
Analyzed from this perspective, many of the use cases currently being discussed make no sense. The main problem is usually confidentiality. Participants in a highly competitive market will logically prefer the privacy of a centralized database to revealing their activities to each other. This is accentuated if there is already a central trusted entity that can provide the neutral environment in which said database resides. While this central provider may involve some cost, it is largely justified by the value of the privacy that is preserved. The only incentive to migrate to blockchain would be strict new regulation.
However, blockchains do have important use cases, where disintermediation and robustness matter more than confidentiality and performance. I’ll talk more about this in a later post, but the most promising areas we’ve seen so far are: (a) cross-company audit logs, (b) provenance tracking, and (c) lightweight financial systems. In all three cases, we have found people developing on MultiChain with a clear implementation vision, rather than just curiosity and experimentation. So if you’re looking for ways that blockchains can bring real value to your business, these could be a good starting point.
Dr. Gideon Greenspan is the founder and CEO of Coin Sciences, the company behind the popular MultiChain Platform for private blockchains.
