7.4 Message Digest FunctionsMessage digest functions distill the information contained in a file (small or large) into a single large number, typically between 128 and 256 bits in length. (See Figure 7-4.) The best message digest functions combine these mathematical properties:
Figure 7-4. A message digest functionMessage digests are also called one-way hash functions because they produce values that are difficult to invert, resistant to attack, effectively unique, and widely distributed. Many message digest functions have been proposed and are now in use. Here are a few:
Besides these functions, it is also possible to use traditional symmetric block encryption systems such as the DES as message digest functions. To use an encryption function as a message digest function, simply run the encryption function in cipher feedback mode. For a key, use a key that is randomly chosen and specific to the application. Encrypt the entire input file. The last block of encrypted data is the message digest. Symmetric encryption algorithms produce excellent hashes, but they are significantly slower than the message digest functions described previously. 7.4.1 Message Digest Algorithms at WorkMessage digest algorithms themselves are not generally used for encryption and decryption operations. Instead, they are used in the creation of digital signatures, message authentication codes (MACs), and encryption keys from passphrases. The easiest way to understand message digest functions is to look at them at work. The following example shows some inputs to the MD5 function and the resulting MD5 codes: MD5(The meeting last week was swell.)= 050f3905211cddf36107ffc361c23e3d MD5(There is $1500 in the blue box.) = 05f8cfc03f4e58cbee731aa4a14b3f03 MD5(There is $1100 in the blue box.) = d6dee11aae89661a45eb9d21e30d34cb Notice that all of these messages have dramatically different MD5 codes. Even the second and third messages, which differ by only a single character (and, within that character, by only a single binary bit), have completely different message digests. The message digest appears almost random, but it's not. Let's look at a few more message digests: MD5(There is $1500 in the blue bo) = f80b3fde8ecbac1b515960b9058de7a1 MD5(There is $1500 in the blue box) = a4a5471a0e019a4a502134d38fb64729 MD5(There is $1500 in the blue box.) = 05f8cfc03f4e58cbee731aa4a14b3f03 MD5(There is $1500 in the blue box!) = 4b36807076169572b804907735accd42 MD5(There is $1500 in the blue box..)= 3a7b4e07ae316eb60b5af4a1a2345931 Consider the third line of MD5 code in this example: you can see that it is exactly the same as the second line of the first MD5 example. This is because the same text always produces the same MD5 code. Message digest functions are a powerful tool for detecting very small changes in very large files or messages. Calculate the MD5 code for your message and set it aside. If you think that the file has been changed (either accidentally or on purpose), simply recalculate the MD5 code and compare it with the MD5 that you originally calculated. If they match, you can safely assume that the file was not modified.[17]
In theory, two different files can have the same message digest value. This is called a collision. For a message digest function to be secure, it should be computationally infeasible to find or produce these collisions. 7.4.2 Uses of Message Digest FunctionsMessage digest functions are widely used today for a number of reasons:
Because of their properties, message digest functions are also an important part of many cryptographic systems in use today:
Considering the widespread use of message digest functions, it is disconcerting that there is so little published theoretical basis behind most message digest functions. 7.4.3 HMACA Hash Message Authentication Code (HMAC) function is a technique for verifying the integrity of a message transmitted between two parties that agree on a shared secret key. Essentially, HMAC combines the original message and a key to compute a message digest function.[19] The sender of the message computes the HMAC of the message and the key and transmits the HMAC with the original message. The recipient recalculates the HMAC using the message and the secret key, then compares the received HMAC with the calculated HMAC to see if they match. If the two HMACs match, then the recipient knows that the original message has not been modified because the message digest hasn't changed, and that it is authentic because the sender knew the shared key, which is presumed to be secret (see Figure 7-5).
HMACs can be used for many of the same things as digital signatures, and they offer a number of advantages, including:
However, HMACs do have an important disadvantage over digital signature systems: because HMACs are based on a key that is shared between the two parties, if either party's key is compromised, it will be possible for an attacker to create fraudulent messages. Figure 7-5. Using an HMAC to verify the authenticity and integrity of a message7.4.4 Attacks on Message Digest FunctionsThere are two kinds of attacks on message digest functions. The first is finding two messages—any two messages—that have the same message digest. The second attack is significantly harder: given a particular message, the attacker finds a second message that has the same message digest code. There's extra value if the second message is a human-readable message, in the same language, and in the same word processor format as the first. MD5 is probably secure enough to be used over the next 5 to 10 years. Even if it becomes possible to find MD5 collisions at will, it will be very difficult to transform this knowledge into a general-purpose attack on SSL. Nevertheless, to minimize the dependence on any one cryptographic algorithm, most modern cryptographic protocols negotiate the algorithms that they will use from a list of several possibilities. Thus, if a particular encryption algorithm or message digest function is compromised, it will be relatively simple to tell Internet servers to stop using the compromised algorithm and use others instead. |