Python: how to compare hashlib lib and hmac lib message digests

I searched around and couldn't find an answer to this. I noticed I can generate message digests with, say, the blake2b algorithm using either the hashlib library or the hmac library, which takes its digestmod from hashlib. So:

import hashlib
import hmac

plainMsg = "this is a plaintext message"

# BLAKE2b in its built-in keyed mode
hashlib_hashed = hashlib.blake2b(key=b'super secret key')
hashlib_hashed.update(plainMsg.encode())

# HMAC built on top of unkeyed BLAKE2b
hmac_hashed = hmac.new(b'super secret key', digestmod=hashlib.blake2b)
hmac_hashed.update(plainMsg.encode())

print(hashlib_hashed.hexdigest())
print(hmac_hashed.hexdigest())

In my mind they should generate the same message digest, as I'm using the same algorithm, the same key and the same plaintext. But they generate two different digests:

ec0d0ab13d7e7f3b62d742aa92078a4a14346ee6ee352e27c8814e4bf6361556fdc3d301e100b5a2c90c5596c4b2bb72c887c6b6aa92fb41752f6b52105ce13b
b632045e745550e5b9da6d411c013c978cb8120847260eb8fda9c8885368a5eaba80cd74ad95a51b1a4bde1f47cccb5a2e4591e9935126f673479c7474c2be97

I initially thought it had to do with the salt, as I didn't pass one to hashlib.blake2b(), so I guess it's empty, since salt=b''. But with hmac I couldn't find how to set a salt. So, can anyone explain?

There are 2 answers

ShadowRanger On

The HMAC algorithm is more than just "hash the key followed by the message". The key is padded with zero bytes to the hash's block size (a key longer than one block is hashed first), each byte is XORed with a fixed "ipad" (0x36), and the hash is computed over that value followed by the message. The padded key is then XORed with a fixed "opad" (0x5C), and the hash is computed again over that new value followed by the hash from the previous step.

Point is, it's not as straightforward as you think it is. You can look at the contents of the hmac.HMAC class to see the additional rigmarole Python does to follow the HMAC RFC.
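
The padding and XOR steps described above can be sketched by hand. This is only an illustration, not the code in the hmac module: manual_hmac is a hypothetical helper name, and the sketch assumes the hash constructor accepts the data as its first argument and exposes block_size, as the hashlib constructors do.

```python
import hashlib
import hmac


def manual_hmac(key: bytes, msg: bytes, digest_cons=hashlib.blake2b) -> bytes:
    """Hand-rolled HMAC following RFC 2104, for illustration only."""
    block_size = digest_cons().block_size  # 128 bytes for blake2b

    # A key longer than one block is hashed first.
    if len(key) > block_size:
        key = digest_cons(key).digest()
    # Then the key is zero-padded to exactly one block.
    key = key.ljust(block_size, b"\x00")

    ipad_key = bytes(b ^ 0x36 for b in key)  # inner masked key
    opad_key = bytes(b ^ 0x5C for b in key)  # outer masked key

    # Inner hash: masked key followed by the message.
    inner = digest_cons(ipad_key + msg).digest()
    # Outer hash: second masked key followed by the inner digest.
    return digest_cons(opad_key + inner).digest()


key = b"super secret key"
msg = b"this is a plaintext message"

# The hand-rolled value should equal what the hmac module produces.
matches = hmac.compare_digest(
    manual_hmac(key, msg),
    hmac.new(key, msg, hashlib.blake2b).digest(),
)
print(matches)
```

In real code, use hmac.new() rather than a hand-rolled version; the sketch only makes the two hashing passes visible.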

Martijn Pieters On

You are using two different algorithms to create a Message Authentication Code, or MAC.

When you use the hmac module, you are creating a hash-based MAC, or HMAC. Here, the key is used twice (with a different XOR mask each time), and each time it is prepended to the data being hashed. A hash function (provided by the hashlib library) processes the input a block at a time, first 'compressing' the masked key plus the message into an inner value, then compressing the second masked key plus that inner value into an outer value, in a two-step process. This makes the construction very flexible, as any block-based hash can be adapted for this technique.

BLAKE2, with a key, can also be used to create a MAC, but here it is the hash function itself that uses the key to produce a 'secret' hash output, one that can only be verified with the same key, so it serves as a secure authentication tag too. It does this by making the padded key the first block of the iterative hashing operation.

But BLAKE2 without a key is just another hash function, like SHA-256 and others, and when you use it in an HMAC, the different approach produces a different result. The two are not compatible, because the two algorithms use their key very differently.
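
As a minimal sketch of what this means in practice: a tag can only be checked by recomputing it with the same construction that produced it. The helper names blake2_mac and hmac_blake2_mac are made up for this example; the key and message are the ones from the question.

```python
import hashlib
import hmac

key = b"super secret key"
msg = b"this is a plaintext message"


def blake2_mac(key: bytes, msg: bytes) -> bytes:
    """MAC using BLAKE2b's built-in keyed mode."""
    return hashlib.blake2b(msg, key=key).digest()


def hmac_blake2_mac(key: bytes, msg: bytes) -> bytes:
    """MAC using the HMAC construction over unkeyed BLAKE2b."""
    return hmac.new(key, msg, hashlib.blake2b).digest()


tag_keyed = blake2_mac(key, msg)
tag_hmac = hmac_blake2_mac(key, msg)

# Different constructions, so the tags do not match each other.
schemes_agree = hmac.compare_digest(tag_keyed, tag_hmac)

# Verification works when the receiver recomputes with the same scheme.
keyed_ok = hmac.compare_digest(tag_keyed, blake2_mac(key, msg))
hmac_ok = hmac.compare_digest(tag_hmac, hmac_blake2_mac(key, msg))

print(schemes_agree, keyed_ok, hmac_ok)
```

hmac.compare_digest() is used for the comparison because it runs in constant time, whichever of the two schemes produced the tags.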

Quoting from RFC 7693, BLAKE2 Crypto Hash and MAC:

BLAKE2 does not require a special "HMAC" (Hashed Message Authentication Code) construction for keyed message authentication as it has a built-in keying mechanism.

and from the BLAKE2 section of the hashlib documentation:

BLAKE2 supports keyed mode (a faster and simpler replacement for HMAC), [...]

The salt argument to the hashlib.blake2b() function is a separate feature of the BLAKE2 hashing algorithm; with other hash functions you would have to prepend the salt to the plaintext yourself before hashing. A fresh random salt 'randomises' the output, so that the same input message is highly unlikely to produce the same output twice (which prevents a third party from detecting repeated messages).
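
A short sketch of the salt parameter, assuming random salts generated with os.urandom() (the question does not say where a salt would come from):

```python
import hashlib
import os

msg = b"this is a plaintext message"

# blake2b accepts a salt of up to SALT_SIZE (16) bytes.
salt_a = os.urandom(hashlib.blake2b.SALT_SIZE)
salt_b = os.urandom(hashlib.blake2b.SALT_SIZE)

digest_a = hashlib.blake2b(msg, salt=salt_a).hexdigest()
digest_b = hashlib.blake2b(msg, salt=salt_b).hexdigest()

# Same message, different salts: the digests differ.
print(digest_a != digest_b)

# An empty salt is the default, so passing salt=b'' changes nothing.
unsalted_same = (
    hashlib.blake2b(msg, salt=b"").digest() == hashlib.blake2b(msg).digest()
)
print(unsalted_same)
```

To reproduce a salted digest later, the same salt has to be stored or sent alongside it.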