Background
I recently found that MD5 hashes of large R objects computed with the digest package did not change when I made small changes to the objects. This appears to be due to some 32-bit counter variables overflowing, so that the algorithm misses the changed portion of the data.
Using the current development version of digest on Linux, the hashes pick up these small changes in large files, whereas on Windows the changes are missed.
I made the following changes to the current dev version, which swap a few unsigned long int (uint32) variables for unsigned long long int (uint64) variables:
https://github.com/eddelbuettel/digest/compare/master...kendonB:testmd5
and now on Windows the problem is fixed and the hashes notice the changes.
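One explanation consistent with the platform difference, offered as an assumption rather than something checked against digest's source: 64-bit Windows uses the LLP64 data model, in which unsigned long int is 32 bits wide, while 64-bit Linux uses LP64, in which it is 64 bits wide. The sketch below uses hypothetical helper names (count32, count64, demo_wrap) to show how a byte count kept in 32 bits wraps once the input passes 4 GiB.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Storage width of unsigned long, in bits, on the platform this is compiled for. */
static unsigned ulong_bits(void) {
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Hypothetical byte counter kept in 32 bits: the value wraps modulo 2^32. */
static uint32_t count32(uint64_t nbytes) {
    return (uint32_t)nbytes;
}

/* The same counter kept in 64 bits: no wrap for realistic object sizes. */
static uint64_t count64(uint64_t nbytes) {
    return nbytes;
}

/* Illustration only: call this from main() to check the wrap. */
static void demo_wrap(void) {
    uint64_t n = 4294967296ULL + 5;  /* 4 GiB plus 5 bytes */
    assert(count32(n) == 5);         /* the 32-bit counter has wrapped */
    assert(count64(n) == n);         /* the 64-bit counter has not */
    assert(ulong_bits() >= 32);      /* the C standard guarantees at least 32 */
}
```

If a counter like this decides how much of the input gets processed, everything beyond the wrapped length would be ignored, which would match the symptom described above.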
Question
Is swapping out these 32-bit integer variables for 64-bit integer variables benign? Will anything get ruined on 32-bit systems? On obscure systems? Can anything go wrong?
On a 32-bit system, a 64-bit integer is usually implemented using two 32-bit registers. Each load or store of such an integer therefore takes two instructions. For something like addition, add with carry is used: the low halves are added first, and the carry out of that addition is added into the sum of the high halves. This is something the compiler takes care of.
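As a rough illustration of what the compiler generates, the following sketch does a 64-bit addition by hand on two 32-bit halves. The type u64_pair and the functions add_pair, to_u64 and demo_carry are hypothetical names made up for this example; real code would simply use a 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit value held in two 32-bit registers. */
typedef struct {
    uint32_t lo;  /* low 32 bits */
    uint32_t hi;  /* high 32 bits */
} u64_pair;

/* Add the low halves, then add the high halves plus the carry. */
static u64_pair add_pair(u64_pair a, u64_pair b) {
    u64_pair r;
    r.lo = a.lo + b.lo;                /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);    /* 1 if the low half wrapped, else 0 */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Reassemble the two halves into a native 64-bit value for comparison. */
static uint64_t to_u64(u64_pair p) {
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Illustration only: 0xFFFFFFFF + 1 carries into the high half. */
static void demo_carry(void) {
    u64_pair a = { 0xFFFFFFFFu, 0u };
    u64_pair b = { 1u, 0u };
    u64_pair r = add_pair(a, b);
    assert(r.lo == 0u && r.hi == 1u);
    assert(to_u64(r) == 4294967296ULL);
}
```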
You only need to make sure that the compiler you are using supports such a type.
For example, the signed and unsigned versions of long long int (which must be at least 64 bits wide) were introduced in C99, so you should use a compiler that supports this feature of the C99 standard.
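One way to make that requirement explicit is a compile-time guard, sketched below. C99 requires <limits.h> to define ULLONG_MAX, so its absence suggests the compiler lacks unsigned long long. The sketch also uses uint64_t from the C99 header <stdint.h>, which states the intended width directly; add_bytes and demo_support are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stop the build early if the compiler does not provide unsigned long long. */
#ifndef ULLONG_MAX
#error "unsigned long long is unavailable: a C99 (or later) compiler is needed"
#endif

/* Hypothetical running byte counter using the exact-width C99 type. */
static uint64_t add_bytes(uint64_t total, uint64_t n) {
    return total + n;
}

/* Illustration only: call this from main() to check the guarantees. */
static void demo_support(void) {
    assert(ULLONG_MAX >= 18446744073709551615ULL);         /* at least 64 bits */
    assert(sizeof(uint64_t) * CHAR_BIT == 64);             /* exactly 64 bits */
    assert(add_bytes(4294967295ULL, 1) == 4294967296ULL);  /* no wrap at 2^32 */
}
```

Strictly speaking, uint64_t is optional in C99 and exists only where the platform has an exact 64-bit type, whereas uint_least64_t is always available; on mainstream 32-bit and 64-bit systems both are present.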