Encryption and integrity
Encryption and integrity
You should recall from Section 3.2 that integrity relates to assurance that there has been no unauthorised modification of a message and that the version received is the same as the version sent.
Activity 10
Pause here for a while and consider whether encryption can be used as an effective assurance of the integrity of a message.
There are other reasons why encryption alone does not provide a completely workable solution. As you have already seen, the encryption process carries overheads in terms of resources and for some applications it is preferable to send data in the clear. Also some network management protocols separate the confidentiality and integrity functions, so encryption is not always appropriate.
Other ways of providing assurance of integrity
Some other method of providing assurance of the integrity of a message is therefore needed – some kind of concise identity of the original message that can be checked against the received message to reveal any possible discrepancies between the two. This is the purpose of a message digest. It consists of a small, fixedlength block of data, also known as a hash value, which is a function of the original message. (The term ‘hash value’ can be used in other contexts.) The hash value is dependent on all the original data (in other words, it will change even if only one bit of the data changes) and is calculated by applying a mathematical function, known as a hash function, which converts a variablelength string to a fixedlength string. A simple example is a function that XORs together each byte of an input string to produce a single output byte.
A common use of a hash value is the storage of passwords on a computer system. If the passwords are stored in the clear, anyone gaining unlawful access to the computer files could discover and use them. This can be avoided if a hash of the password is stored instead. When a user enters a password at login the hash value is recalculated and compared with the stored value. The security of this method relies on the hash function being computationally irreversible. In other words, it is easy to compute a hash value for a given input string, but extremely difficult to deduce the input string from the hash value. Hash functions with this characteristic are known as oneway hash functions.
For a hash value to give an effective assurance about the integrity of data, it should also be computationally infeasible to generate another message that hashes to the same value. Hash functions that provide this characteristic are said to be collisionfree. The example of the XOR function given earlier is not collisionfree, since it would be simple to generate messages that would produce an identical hash.
The following very simple method gives an insight into how a oneway hash could be derived. (This example is not a practical method of producing hash values but does serve to demonstrate their function.)

Concatenate the message by removing all the spaces.

Arrange the message in blocks of five characters.

Pad the final block if it contains less than five characters. (For example, if the final block has only two characters it could be padded by adding AAA.)

Assign each block a numerical code from one of 26^{5} possible values according to the arrangement of letters. (See the example in the box below.)

Derive a value that is the modulo26^{5} sum of all the codes.
At the receiving end the hash value is recalculated using the same algorithm and is compared with the appended hash value received with the message. Any alterations in the original message should be revealed by a different hash value.
Box 3: A method of block coding
This is a worked example of a method of block coding the text VALUE

Code each letter according to its position in the alphabet (A=0, B=1, etc.), giving the number sequence 21, 0, 11, 20, 4.

Multiply each coded number by a power of 26 depending on its position in the sequence, giving: 21 × 26^{4}, 0 × 26^{3}, 11 × 26^{2}, 20 × 26^{1}, 4 × 26^{0}

Add together the resulting numbers: 6 596 496 + 0 + 7436 + 520 + 4 = 9 604 456
In practice, of course, message digest algorithms in common use are very much more complex than the method described above. Two are briefly described in Table 5.
Table 5 Examples of common message digest algorithms
Algorithm  Description 
MD5  Takes any arbitrary length input string and produces a fixed 128bit value. This is done by a method of blocking and padding and then performing four rounds of processing based on a combination of logical functions. Considered to be reasonably secure although potential weaknesses have been reported. 
SHA (secure hash algorithm)  Similar to MD5 but produces a 160bit hash value so is more resistant to brute force attacks^{1}. 
^{1}A brute force attack on a hash value can be either an attempt to find another message that hashes to the same value or an attempt to find two messages that hash to the same value.
A message authentication code is similar to a oneway hash function and has the same properties, but the algorithm uses the additional ingredient of a secret key, and therefore possession of the key to perform the check is necessary.