What is blockchain?
A blockchain is a special type of database. You may also have heard the term distributed ledger technology (or DLT) – in many cases, they're referring to the same thing.
A blockchain has certain unique properties. There are rules about how data can be added, and once the data has been stored, it's virtually impossible to modify or delete it.
Data is added over time in structures called blocks. Each block is built on top of the last and includes a piece of information that links back to the previous one. By looking at the most recent block, we can check that it was created after the one before it. If we follow these links all the way down the "chain," we'll reach our very first block – known as the genesis block.
To analogize, suppose that you have a spreadsheet with two columns. In the first cell of the first row, you put whatever data you want to hold.
The first cell's data is converted into a two-letter identifier, which will then be used as part of the next input. In this example, the two-letter identifier KP must be used to fill out the next cell in the second row (defKP). This means that if you change the first input data (abcAA), you'd get a different combination of letters in every subsequent cell.
How are blocks connected?
What we discussed above – with our two-letter identifiers – is a simplified analogy of how a blockchain uses hash functions. Hashing is the glue that holds blocks together. It consists of taking data of any size and passing it through a mathematical function to produce an output (a hash) that's always the same length.
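As a small illustration of the fixed-length property, the sketch below hashes inputs of very different sizes. It assumes SHA-256 (the hash function named later in this article) and uses Python's standard hashlib module; the helper name sha256_hex is made up for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Three inputs: a tiny one, a large one, and the tiny one with one letter changed.
short_digest = sha256_hex(b"abc")
long_digest = sha256_hex(b"abc" * 100_000)
tweaked_digest = sha256_hex(b"abd")

# Both digests have the same length regardless of input size.
print(len(short_digest), len(long_digest))

# A one-letter change in the input gives a completely different digest.
print(short_digest)
print(tweaked_digest)
```

Each digest is 64 hexadecimal characters (256 bits), whether the input is three bytes or three hundred thousand.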
The hash functions used in blockchains have a useful property: the odds of finding two pieces of data that give the exact same output are astronomically low. Any slight modification of our input data will give a totally different output. The fact that there aren't any known SHA-256 collisions (i.e., two different inputs that give us the same output) is incredibly valuable in the context of blockchains. It means that each block can point back to the previous one by including its hash, and any attempt to edit older blocks will immediately become apparent.
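The linking described above can be sketched in a few lines. This is a toy model, not how any particular blockchain serializes its blocks: the block layout (a dict with "data" and "prev_hash"), the JSON encoding, and the function names are all assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block by serializing it deterministically and applying SHA-256."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads):
    """Create a list of blocks where each block stores the hash of the previous one."""
    chain = []
    prev_hash = "0" * 64  # placeholder: the genesis block has no predecessor
    for data in payloads:
        block = {"data": data, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["genesis", "Alice pays Bob", "Bob pays Carol"])
print(first_broken_link(chain))  # None: every link is intact

chain[0]["data"] = "tampered"    # edit an old block
print(first_broken_link(chain))  # 1: the next block's stored hash no longer matches
```

Editing the first block changes its hash, so the value stored in the second block stops matching and the tampering is detected at that link.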
Blockchains and decentralization
When you hear people talking about blockchain technology, they’re likely not just talking about the database itself, but the ecosystems built around blockchains.
As standalone data structures, blockchains are only really useful in niche applications. Where things get interesting is when we use them as tools for strangers to coordinate amongst themselves. Combined with other technologies and some game theory, a blockchain can act as a distributed ledger that's controlled by no one.
What this means is that no one has the power to edit the entries outside of the rules of the system (more on the rules shortly). In that sense, you could argue that the ledger is simultaneously owned by everyone: participants reach an agreement on what it looks like at any given moment.
Why do blockchains need to be decentralized?
You could, of course, operate a blockchain by yourself. But you'd end up with a database that's clunky compared to the alternatives. Its real potential is realized in a decentralized environment – that is, one where all users are equal. That way, the blockchain can’t be deleted or maliciously taken over. It's a single source of truth that anyone can see.
What's the peer-to-peer network?
The peer-to-peer (P2P) network is our layer of users (or the generals in our previous example). There's no administrator, so instead of contacting a central server every time they want to exchange information with another user, users send it directly to their peers.
Normally, the server holds all the information that users need. When you access a website, you're asking its servers to feed you all the information. If the website goes offline, you won't be able to see it. However, if you downloaded all of the content, you could load it on your computer without querying the website.
In essence, that's what every peer does with the blockchain: the entire database is stored on their computer. If anyone leaves the network, the remaining users will still be able to access the blockchain, and share information with each other. When a new block is added to the chain, the data is propagated across the network so that everyone can update their own copy of the ledger.
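A toy simulation can make the propagation step concrete. The Peer class, its method names, and the simple flooding strategy below are assumptions for illustration; real networks use more elaborate relay protocols and validate blocks before accepting them.

```python
class Peer:
    """A participant that keeps its own full copy of the chain."""

    def __init__(self, name):
        self.name = name
        self.chain = []        # this peer's local copy of the ledger
        self.neighbours = []   # peers it is directly connected to

    def connect(self, other):
        """Create a two-way connection between two peers."""
        self.neighbours.append(other)
        other.neighbours.append(self)

    def receive(self, block):
        """Store a new block, then pass it on to every neighbour."""
        if block in self.chain:
            return  # already seen: do not relay it again
        self.chain.append(block)
        for neighbour in self.neighbours:
            neighbour.receive(block)


# Three peers in a line: alice <-> bob <-> carol. There is no central server.
alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.connect(bob)
bob.connect(carol)

alice.receive("block 1")   # alice learns of a block and relays it
carol.receive("block 2")   # a block can enter the network at any peer

for peer in (alice, bob, carol):
    print(peer.name, peer.chain)
```

After both blocks have spread, all three peers hold the same two entries. If carol then disconnects, alice and bob still have complete copies and can keep exchanging data.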
What are blockchain nodes?
Nodes are simply what we call the machines connected to the network – they're the ones that store copies of the blockchain, and share information with other machines. Users don't need to manually handle these processes. Generally, all they need to do is download and run the blockchain’s software, and the rest will be taken care of automatically.
The above describes what a node is in the purest sense, but the definition can also encompass other users that interact with the network in any way. In cryptocurrency, for instance, a simple wallet application on your phone is what's known as a light node.
Public vs. Private blockchains
As you may know, Bitcoin laid the foundation for the blockchain industry to grow into what it is today. Ever since Bitcoin started proving itself as a legitimate financial asset, innovators have been thinking about the potential of the underlying technology for other fields. This has resulted in an exploration of blockchain for countless use cases outside of finance.
Bitcoin is what we call a public blockchain. This means that anyone can view the transactions on it, and all it takes to join is an Internet connection and the necessary software. Since there aren't any other requirements for participation, we may refer to this as a permissionless environment.
In contrast, there are other types of blockchains out there called private blockchains. These systems establish rules regarding who can see and interact with the blockchain. As such, we refer to them as permissioned environments. While private blockchains may seem redundant at first, they do have some important applications – mainly in enterprise settings.
How do transactions work?
If Alice wants to pay Bob via bank transfer, she notifies her bank. Let’s assume that the two parties use the same bank for simplicity’s sake. The bank checks that Alice has the funds to perform the transaction, before updating its database (e.g., -$50 to Alice, +$50 to Bob).
This isn’t too dissimilar to what goes on with a blockchain. After all, it’s also a database. The key difference is that there isn’t a single party performing the checks and updating the balances. All of the nodes must do it.
If Alice wants to send five bitcoins to Bob, she broadcasts a message saying this to the network. It won’t be added to the blockchain straight away – nodes will see it, but other actions must be completed for the transaction to be confirmed.
Once that transaction is added to the blockchain, all of the nodes can see that it’s been made. They’ll update their copy of the blockchain to reflect it. Now, Alice can’t send those same five units to Carol (thus, double-spending), because the network knows that she’s already spent them in an earlier transaction.
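The double-spending check can be sketched with a deliberately simplified ledger. The model below, in which each coin has an identifier and a single current owner, is an assumption chosen for brevity; it is not how Bitcoin actually represents funds, and the class and coin names are hypothetical.

```python
class Ledger:
    """Toy ledger: maps each coin identifier to its current owner."""

    def __init__(self, coins):
        self.owners = dict(coins)

    def apply(self, coin_id, sender, recipient) -> bool:
        """Transfer a coin if, and only if, the sender currently owns it."""
        if self.owners.get(coin_id) != sender:
            return False  # sender no longer (or never) owned this coin
        self.owners[coin_id] = recipient
        return True


# Every node would hold a copy like this and apply the same confirmed transactions.
ledger = Ledger({"coin-1": "Alice"})  # "coin-1" stands in for Alice's five bitcoins

print(ledger.apply("coin-1", "Alice", "Bob"))    # True: Alice owns the coin
print(ledger.apply("coin-1", "Alice", "Carol"))  # False: it already went to Bob
```

Because every node applies the same confirmed transactions in the same order, each one independently rejects Alice's second attempt.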
There’s no concept of usernames and passwords – public-key cryptography is used to prove ownership of funds. To receive funds in the first place, Bob needs to generate a private key. That’s just a very long random number that would be virtually impossible for anyone to guess, even with hundreds of years at their disposal. But if he tells anyone his private key, they’ll be able to prove ownership over (and therefore spend) his funds. So it’s important that he keeps it secret.
What Bob can do, however, is derive a public key from his private one. He can then give the public key to anyone because it’s practically infeasible for them to work backward from it to the private key. In most cases, he’ll perform another operation (like hashing) on the public key to get a public address.
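To show the one-way path from private key to public key to address, here is a minimal pure-Python sketch. It assumes the secp256k1 elliptic curve (the one Bitcoin uses) and, as a simplification, takes a single SHA-256 hash of the public key as the "address"; real address formats involve additional hashing and encoding steps. The function names are made up for this example, and the code is for illustration only, not for protecting real funds.

```python
import hashlib
import secrets

# secp256k1 parameters: the curve y^2 = x^3 + 7 over the prime field of size P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)  # generator point


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its inverse
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_private_key() -> int:
    """Pick a random integer in the range 1..N-1."""
    return secrets.randbelow(N - 1) + 1


def derive_public_key(private_key: int):
    """The public key is the generator point multiplied by the private key."""
    return scalar_mult(private_key, G)


def derive_address(public_key) -> str:
    """Simplified address: SHA-256 of the public key's coordinates (an assumption)."""
    x, y = public_key
    return hashlib.sha256(x.to_bytes(32, "big") + y.to_bytes(32, "big")).hexdigest()


private_key = generate_private_key()          # Bob keeps this secret
public_key = derive_public_key(private_key)   # safe to share
address = derive_address(public_key)          # what Bob hands out to receive funds
print(address)
```

Going forward (private key to public key to address) is cheap. Going backward would require undoing the point multiplication or the hash, and no efficient method for either is known.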
Pros and cons of blockchain technology
Properly-engineered blockchains solve a problem that plagues stakeholders in a number of industries, ranging from finance to agriculture. A distributed network presents many advantages over the traditional client-server model, but it also comes with some trade-offs.
One of the immediate benefits noted in the Bitcoin white paper is that payments could be transmitted without involving an intermediary. Subsequent blockchains have taken this even further, allowing users to send all kinds of information. Eliminating intermediaries means that there’s less counterparty risk for the users involved, and results in lower fees as there is no middleman taking a cut.
As we mentioned earlier, a public blockchain network is also permissionless – there’s no barrier to entry since there’s no one in charge. If a prospective user can connect to the Internet, then they’re able to interact with other peers on the network.
Many would argue that the most important quality of blockchains is that they have a high degree of censorship-resistance. To cripple a centralized service, all that a malicious actor would need to do is target a server. But in a peer-to-peer network, every node acts as a server of its own.
A system like Bitcoin has over 10,000 visible nodes scattered around the world, making it virtually impossible for even a well-resourced attacker to compromise the network. It should be noted that there are many hidden nodes, too, which aren’t visible to the broader network.
Blockchains are not a silver bullet for every problem. In being optimized for the advantages in the previous section, they end up lacking in other areas. The most obvious obstacle to mass adoption of blockchains is that they don’t scale very well.
This is true of any distributed network. Since all participants must stay in sync, new information can’t be added too fast as nodes would be unable to keep up. Therefore, developers tend to intentionally limit the speed at which the blockchain can update to ensure that the system remains decentralized.
For users of a network, this can manifest itself in lengthy waiting periods if too many people are trying to make transactions. Blocks can only hold so much data, and they’re not added to the chain instantly. If there are more transactions than can fit in the block, then any additional ones must wait for the next block.
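The waiting effect can be illustrated with a small sketch. The capacity of three transactions per block and the first-come, first-served ordering are assumptions for the example; real networks measure capacity differently and typically let fees influence which transactions are included first.

```python
from collections import deque

MAX_TX_PER_BLOCK = 3  # hypothetical capacity, chosen only for illustration


def fill_blocks(pending, capacity=MAX_TX_PER_BLOCK):
    """Pack pending transactions into consecutive blocks of limited size."""
    queue = deque(pending)
    blocks = []
    while queue:
        take = min(capacity, len(queue))
        blocks.append([queue.popleft() for _ in range(take)])
    return blocks


pending = [f"tx-{i}" for i in range(1, 8)]  # seven transactions waiting
blocks = fill_blocks(pending)

for height, block in enumerate(blocks, start=1):
    print(height, block)
# 1 ['tx-1', 'tx-2', 'tx-3']
# 2 ['tx-4', 'tx-5', 'tx-6']
# 3 ['tx-7']
```

With seven pending transactions and room for three per block, the last transaction has to wait for the third block.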
Another possible con of decentralized blockchain systems is that they can’t easily be upgraded. If you’re building your own software, you can add new features as you please. You don’t need to work with others or ask for permission to make modifications.
In an environment with potentially millions of users, making changes is considerably more difficult. You could change some of the parameters of your node software, but you’d eventually find yourself separated from the network. If the modified software is incompatible with other nodes, they will recognize this and refuse to interact with your node.
Suppose you wanted to change a rule about how big blocks can be (from 1MB to 2MB). You could modify your own node to produce a 2MB block and try sending it to the nodes you’re connected to, but they have a rule that says “do not accept blocks over 1MB”. If they receive anything bigger, they will not include it in their copy of the blockchain.
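The block size rule above can be sketched as a simple validation check. The Node class, the method name, and the exact byte values standing in for "1MB" and "2MB" are assumptions for illustration; real nodes enforce many more rules than size alone.

```python
MAX_BLOCK_BYTES = 1_000_000  # stands in for the "1MB" rule; exact value is an assumption


class Node:
    """Toy node that enforces a maximum block size before accepting a block."""

    def __init__(self, max_block_bytes=MAX_BLOCK_BYTES):
        self.max_block_bytes = max_block_bytes
        self.chain = []

    def receive_block(self, block: bytes) -> bool:
        """Append the block only if it respects this node's size rule."""
        if len(block) > self.max_block_bytes:
            return False  # rule violated: the block is ignored
        self.chain.append(block)
        return True


honest_node = Node()                              # runs the unmodified rules
modified_node = Node(max_block_bytes=2_000_000)   # your locally changed rule

small_block = b"\x00" * 500_000
big_block = b"\x00" * 2_000_000

for node in (honest_node, modified_node):
    node.receive_block(small_block)
    node.receive_block(big_block)

print(len(honest_node.chain))    # 1: the oversized block was rejected
print(len(modified_node.chain))  # 2: only the modified node accepted it
```

The two nodes now hold different chains, which is the separation from the network described above.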
The only way to push changes is to have the majority of the ecosystem accept them. With major blockchains, there can be months – or even years – of intensive discussion in forums before changes can be coordinated.