Hashing vs. Encryption
In the previous pages, we discussed encryption as a two-way function: what is sent is meant to be decrypted and read by the recipient. Hashing, which we cover below, is a one-way function. It scrambles plain text to produce a fixed-size message digest that cannot feasibly be reversed. This is really helpful for things like passwords: the server stores only the hash of a password, and at login the submitted password is hashed again. If that hash matches the hash in the database, the password is correct, without the database ever holding the password itself.
Hashing
At its core, a hash function transforms an input (also known as a 'message') into a fixed-size string of bytes, typically called a 'digest'. The uniqueness of a digest is tied directly to the input; even the smallest change to the input will create a significantly different digest. Hash functions are primarily employed to maintain data integrity.
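The effect of a small input change can be seen with a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_hex` and `differing_bits` and the sample strings are illustrative assumptions, not part of the original material.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def differing_bits(a: str, b: str) -> int:
    """Count how many bits differ between the SHA-256 digests of a and b."""
    digest_a = hashlib.sha256(a.encode("utf-8")).digest()
    digest_b = hashlib.sha256(b.encode("utf-8")).digest()
    return sum(bin(x ^ y).count("1") for x, y in zip(digest_a, digest_b))


# Two inputs that differ by a single character.
print(sha256_hex("hello"))
print(sha256_hex("hellp"))
print(differing_bits("hello", "hellp"), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip, even though only one character of the input changed.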
While the terms "hash" and "digest" are often used synonymously, they're subtly different. Hashing is the process, similar to a blender at work, while the digest is the result, like the resultant smoothie. Like trying to identify the original fruits from a smoothie, it's nearly impossible to deduce the original input from the digest.
An effective hash function should meet five main criteria:
- It should accept inputs of any length.
- It should deliver a fixed-length output. For instance, regardless of the length of the password — whether it's "hi" or "v#BFBc8XUJT9ptb5" — the generated hash should be of the same length, for example, 256 bits.
- It should calculate the hash value swiftly for any input. Nobody should have to wait ages to log in.
- It should be a one-way function. Continuing with our example, you shouldn't be able to figure out what the password is from the digest.
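- It should be collision-resistant, which means it should be computationally infeasible to find two different inputs that result in the same output. Collisions must exist in principle, because an unlimited number of inputs map to a fixed number of outputs, but nobody should be able to find one in practice. The MD5 hash function fails this requirement, leading to its decreased usage in modern systems.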
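The fixed-length criterion above can be checked with a short sketch. It assumes SHA-256 as the hash function, and the helper name `digest_bits` is hypothetical. It reuses the two example passwords from the list.

```python
import hashlib


def digest_bits(password: str) -> int:
    """Return the length in bits of the SHA-256 digest of a password."""
    return len(hashlib.sha256(password.encode("utf-8")).digest()) * 8


# A 2-character and a 16-character password give digests of equal length.
for pw in ("hi", "v#BFBc8XUJT9ptb5"):
    print(repr(pw), "->", digest_bits(pw), "bits")
```

Both lines report 256 bits, because the digest size depends on the algorithm and not on the input.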
Types of Hashes
Algorithm Name | Hash Value Length (bits) | Still in Use | Replaced By | Score |
---|---|---|---|---|
Tiger | 192, 160, 128 | No | SHA-2 | 30 |
RIPEMD-160 | 160 | No | SHA-2 | 30 |
MD2 | 128 | No | MD5, SHA-1 | 20 |
MD4 | 128 | No | MD5, SHA-1 | 25 |
MD5 | 128 | No | SHA-2 | 30 |
SHA-1 | 160 | No | SHA-2 | 40 |
HMAC (SHA-1) | 160 | Yes | - | 70 |
SHA-2 | 224, 256, 384, 512 | Yes | - | 85 |
SHA-3 (Keccak) | 224, 256, 384, 512 | Yes | - | 90 |
HMAC (SHA-256) | 256 | Yes | - | 85 |
Whirlpool | 512 | Yes | - | 80 |
Blake2 | 224, 256, 384, 512 | Yes | - | 90 |
The four main variants of SHA-2 and SHA-3 are very easy to remember, as their hash value lengths are in their respective names. For example:
- SHA3-224: Produces a SHA-3 hash value that is 224 bits long.
- SHA-256: Produces a SHA-2 hash value that is 256 bits long.
Keep in mind that although SHA-3 is the successor to SHA-2, it's not a replacement. Both SHA-2 and SHA-3 are considered secure and are in use.
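As a quick check of the naming convention, the sketch below asks Python's `hashlib` for the digest size of each variant. The helper name `digest_size_bits` is an illustrative assumption. The algorithm names are the identifiers `hashlib` uses, which differ slightly from the official spellings (for example `sha3_224` for SHA3-224).

```python
import hashlib


def digest_size_bits(name: str) -> int:
    """Return the digest size, in bits, of a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8


# SHA-2 variants first, then SHA-3 variants.
for name in ("sha224", "sha256", "sha384", "sha512",
             "sha3_224", "sha3_256", "sha3_384", "sha3_512"):
    print(name, "->", digest_size_bits(name), "bits")
```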
The five fundamental concepts of hashing
A well-designed hash function has five main requirements:
Requirement | Description |
---|---|
Variable Input Length | The hash function should be able to take an input (or 'message') of any length |
Fixed Output Length | Regardless of the input size, the output (the 'digest') should always be of a fixed length |
Efficient Computability | It should be relatively quick and easy to compute the hash value for any given input |
Preimage Resistance | The hash function should be a one-way function. Original input cannot be derived from output. |
Collision Resistance | A collision occurs when two different inputs produce the same output hash. Finding such a pair should be computationally infeasible. |
Example: MD5, an older hash function, is now considered insecure due to its susceptibility to collision attacks (amongst other discovered vulnerabilities). An attacker can craft two different inputs that generate the same hash, for example a harmless file and a malicious one, and then swap the legitimate file for the malicious one without the MD5 hash changing.
The core purpose of a hash function is to ensure data integrity – guaranteeing that data has not been altered during storage or transmission.
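An integrity check of this kind might be sketched as follows. It assumes SHA-256 as the hash function. The helper names `sha256_of` and `is_unaltered` and the sample data are hypothetical.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_hex: str) -> bool:
    """Check received data against a digest published by the sender."""
    # compare_digest compares in constant time, a common precaution.
    return hmac.compare_digest(sha256_of(data), expected_hex)


original = b"quarterly-report-v1"       # hypothetical file contents
published_digest = sha256_of(original)  # shared alongside the file

print(is_unaltered(original, published_digest))                # intact copy
print(is_unaltered(b"quarterly-report-v2", published_digest))  # altered copy
```

The receiver recomputes the digest over what arrived and compares it with the published one. Any change to the data produces a different digest, so the check fails.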
Salting
Rainbow table attacks are advanced methods employed by attackers to crack passwords. They are essentially upgraded versions of dictionary attacks. In these attacks, the assailant precomputes hashes for a multitude of possible passwords and stores them in a table, known as a "rainbow table". This technique accelerates the password cracking process significantly as compared to dictionary attacks, because the hashes have already been calculated.
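The precomputation idea can be illustrated with a deliberately simplified sketch. This is a plain lookup table from hash to password, not a true rainbow table, which uses chains of hash and reduction functions to save storage. The candidate list and the helper names `md5_hex` and `build_lookup` are assumptions made for illustration.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the unsalted MD5 digest of a password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def build_lookup(candidates):
    """Precompute a hash -> password mapping for a list of candidates."""
    return {md5_hex(pw): pw for pw in candidates}


# The attacker hashes the candidate passwords once, ahead of time.
table = build_lookup(["123456", "password", "letmein", "qwerty"])

# Later, a leaked hash is reversed with one dictionary lookup.
leaked_hash = md5_hex("letmein")
print(table.get(leaked_hash))
```

The expensive hashing work happens once, up front. After that, every leaked hash costs only a lookup.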
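MD5 is particularly vulnerable to rainbow table attacks because it is very fast to compute and is often stored without a salt, so large precomputed tables for it are cheap to build and easy to share. This is a separate weakness from MD5's susceptibility to hash collisions (situations where two distinct inputs yield the same hash output); rainbow tables do not rely on collisions.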
For example, in a post I found on a public hacking forum, a user has extracted an MD5 hash and is asking others to run their rainbow tables against it.
This operation doesn't require much computational power, making it feasible for others to oblige, and thus potentially crack the hashed password.
Using unique salts for each password hash greatly boosts defenses against rainbow table attacks. Although it doesn't render the system impervious, it increases the attack complexity, as each salt requires a distinct rainbow table, enhancing overall security.
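A minimal sketch of per-password salting is shown below. It uses PBKDF2 from Python's standard library as the hashing step, which is a choice made for this example rather than something the text above prescribes. The iteration count, salt size, and the helper names `hash_password` and `verify_password` are assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, for illustration only


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # unique salt for each password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash a login attempt with the stored salt and compare digests."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# The same password hashed twice gets two different salts and digests.
salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")
print(digest_a != digest_b)
print(verify_password("hunter2", salt_a, digest_a))
```

Because the salt is stored next to the digest, the server can still verify a login, while a table precomputed without that exact salt is of no use to an attacker.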