1.1 A Brief Overview of Blockchain
A blockchain is a ledger that is shared with, and visible to, all participants of a given system, and that is designed to be practically immutable. It consists of blocks of transactions arranged in a specific order. A consensus algorithm decides which transactions are added to a block and in what order. Once a block has been added to the chain, the block that follows it stores a hash of it. Changing an intermediate block therefore changes its hash and breaks the link held by its successor, which makes such changes hard to carry out unnoticed.
Initially, blockchain was seen mainly as a technology for cryptocurrency systems such as Bitcoin and Ethereum. Today it is used in various sectors to provide transparency and security for data.
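The hash chaining described above can be illustrated with a minimal sketch. It is not a real blockchain: there is no consensus algorithm, and the block layout, the all-zero placeholder hash for the first block, and the helper names `block_hash`, `build_chain` and `is_valid` are assumptions made only for this example.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Return the SHA-256 hex digest of a block's canonical JSON encoding."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(batches: list) -> list:
    """Turn batches of transactions into blocks linked by previous-block hashes."""
    chain = []
    prev = "0" * 64  # placeholder hash for the first block (an assumption)
    for index, transactions in enumerate(batches):
        block = {"index": index, "transactions": transactions, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_valid(chain: list) -> bool:
    """Check that every block stores the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = build_chain([["A pays B 5"], ["B pays C 2"], ["C pays A 1"]])
print(is_valid(chain))  # True

chain[1]["transactions"][0] = "B pays C 200"  # tamper with an intermediate block
print(is_valid(chain))  # False
```

In this sketch a change to the last block would go undetected, because no later block stores its hash yet; only blocks that already have a successor are protected by the chaining alone.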
1.1.1 Data Structures
A linked list can be thought of as a list of nodes chained together, each node holding the address of the next one. The following diagram shows a singly linked list. The elements are arranged in a linear order but, unlike arrays, linked lists do not guarantee spatial locality: although the elements can be accessed one after another, the memory locations in which they are stored are not necessarily contiguous. For example, Node A in the diagram is at location 0x100, whereas Node B, the next node in the list, is stored at location 0x900.
The number mentioned below each node is the address of that node.
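As a rough illustration, a singly linked list could be written as follows. Python object references stand in for the raw memory addresses shown in the diagram, and the `Node` class and `traverse` function are names chosen for this sketch only.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    value: str
    next: Optional[Node] = None  # reference to the next node, None at the tail


def traverse(head: Optional[Node]) -> Iterator[str]:
    """Visit the nodes in order by following each node's `next` reference."""
    node = head
    while node is not None:
        yield node.value
        node = node.next


# The nodes are created independently, so nothing places them side by side in memory.
node_c = Node("C")
node_b = Node("B", node_c)
node_a = Node("A", node_b)
print(list(traverse(node_a)))  # ['A', 'B', 'C']
```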
This data structure forms the basis of blockchain technology. In a blockchain, however, memory addresses are not used as the chaining mechanism, because an address is meaningful only on the machine that allocated it: addresses specific to machine A are of no use to machine B.
We therefore need a linking mechanism that remains valid globally. In a blockchain, hashing is used to connect one node to the next. The details of hashing are provided in the subsequent sections.
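One way to picture a hash used as a link is the sketch below, in which each node is looked up by the hash of its contents rather than by a memory address. The dictionary `store`, the functions `put` and `walk`, and the way the payload and previous hash are combined are all assumptions made for illustration, not a description of any particular blockchain.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Maps a node's hash to (payload, hash of the previous node or None).
store: Dict[str, Tuple[str, Optional[str]]] = {}


def put(payload: str, prev_hash: Optional[str]) -> str:
    """Store a node and return its hash, which plays the role of its address."""
    digest = hashlib.sha256(f"{prev_hash}|{payload}".encode("utf-8")).hexdigest()
    store[digest] = (payload, prev_hash)
    return digest


def walk(tip: Optional[str]) -> list:
    """Follow hash links from the newest node back to the first one."""
    payloads = []
    while tip is not None:
        payload, tip = store[tip]
        payloads.append(payload)
    return payloads


h_a = put("Node A", None)
h_b = put("Node B", h_a)
h_c = put("Node C", h_b)
print(walk(h_c))  # ['Node C', 'Node B', 'Node A']
```

Because the hash depends only on the node's contents, any machine that holds the same data computes the same link, which is not true of a memory address.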
1.1.2 Cryptography
Cryptography is used to protect information of any kind so that only the intended recipient can access it in a meaningful form. The word itself means "hidden writing". Codes are used to convert the information into something that appears meaningless to everyone else, but with the proper use of a KEY the intended recipient can convert it back into meaningful form (Stallings, 2013).
The following terms will be used in the subsequent sections.
- Plaintext: The original message.
- Ciphertext: The coded, seemingly meaningless message.
- Encryption: The process of converting from plaintext to ciphertext using an encryption key.
- Decryption: The process of converting from ciphertext to plaintext using a decryption key.
Based on the keys used, encryption algorithms can be classified into two main categories:
- Private key Cryptography: Here, the same key is used for both encryption and decryption, so it has to be kept secret from the rest of the world. This is also known as symmetric-key cryptography.
- Public key Cryptography: Here, different keys are used for encryption and decryption. Each user has a pair of keys, one public and one private. The public key can be advertised over the internet, while the private key needs to be kept safe.
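The difference between the two categories can be seen in a toy sketch. Both ciphers below are deliberately simplistic and insecure: the repeating-key XOR cipher and the textbook RSA with tiny primes are stand-ins chosen for this example, and all names and values (`xor_cipher`, `shared_key`, `p`, `q`, `e`) are illustrative assumptions.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with a repeating key (not secure)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Private key cryptography: the same key encrypts and decrypts.
shared_key = b"secret"
ciphertext = xor_cipher(b"hello", shared_key)
recovered = xor_cipher(ciphertext, shared_key)
print(recovered)  # b'hello'

# Public key cryptography: textbook RSA with tiny primes (illustrative values only).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17               # public exponent; (e, n) is the public key
d = pow(e, -1, phi)  # private exponent; (d, n) is the private key (Python 3.8+)

message = 65                      # plaintext encoded as an integer smaller than n
encrypted = pow(message, e, n)    # anyone can encrypt with the public key
decrypted = pow(encrypted, d, n)  # only the private key holder can decrypt
print(decrypted == message)  # True
```

Real systems use vetted algorithms and much larger keys; the sketch only shows which key is used in which direction.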
The security of data rests on three main pillars: confidentiality, integrity and availability. Confidentiality means that unauthorized access to the information must be prevented. Integrity means that the information must be protected from unauthorized changes. Availability, as the word suggests, means that the information must always be accessible to its authorized users. Various techniques are used to implement these pillars, each providing one or more of these services.
In this section we are interested only in cryptographic hashing, because of its use in the implementation of blockchain. Hashing is used to provide integrity. A hash function converts data of any form and size into a fixed-size output called the message digest, or fingerprint, of the data. One advantage is that changing even 1 bit of the input produces a completely different message digest. Another is collision resistance: it is computationally infeasible to find two different messages that have the same message digest. A cryptographic hash function is also one-way, so it is not feasible to recover the data from its message digest.
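The fixed-size output and the sensitivity to a 1-bit change can be observed with a short sketch. SHA-256 from Python's standard `hashlib` module is used here as one example of a cryptographic hash function; the text does not name a specific function, and the helpers `digest` and `differing_bits` are names chosen for this example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 message digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hex-encoded digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Inputs of very different sizes yield digests of the same length.
short_digest = digest(b"a")
long_digest = digest(b"a" * 10_000)
print(len(short_digest), len(long_digest))  # 64 64

# Flip a single bit of the input and compare the two digests.
original = b"blockchain"
flipped = bytes([original[0] ^ 0x01]) + original[1:]
print(digest(original))
print(digest(flipped))
print(differing_bits(digest(original), digest(flipped)))
```

For a well-designed hash function, roughly half of the 256 output bits are expected to differ after a single-bit change to the input.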