I'm curious what the first input is which differentiates these two functions:
from math import *
def ilog2_ceil_alt(i: int) -> int:
# DO NOT USE THIS FUNCTION IT'S WRONG
return ceil(log2(i))
def ilog2_ceil(i: int) -> int:
# Correct
if i <= 0:
raise ValueError("math domain error")
return (i-1).bit_length()
...
Obviously, the first one is going to fail for some inputs due to rounding/truncation errors when cramming (the log of) an unlimited-sized integer through a finite double in the pipeline— however, I tried running this test code for a few minutes, and it didn't find a problem:
...
def test(i):
if ilog2_ceil(i) != ilog2_ceil_alt(i):
return i
def main(start=1):
import multiprocessing, itertools
p = multiprocessing.Pool()
it = p.imap_unordered(test, itertools.count(start), 100)
return next(i for i in it if i is not None)
if __name__ == '__main__':
i = main()
print("Failing case:", i)
I tried testing assorted large values like 2**32 and 2**64 + 9999, but it didn't fail on these.
What is the smallest (positive) integer for which the "alt" function fails?
A first issue with
ceil(log2(i))is that the integeriwill first be converted to the floating-point type toward 0 (this is the wrong direction!), here with a 53-bit precision. For instance, ifiis 253 + 1, it will be converted to 253, and you'll get 53 instead of 54.But another problem may occur even with smaller values: The values to consider are those that are slightly larger than a power of 2, say 2n + k with a small integer k > 0, because the log2 may round to the integer n (even if log2 is correctly rounded) while you would want a FP number slightly larger than n at this point. Thus this will give
ceil(n), i.e. n instead of n+1.Now, let's consider the worst case for k, i.e. k = 1. Let p denote the precision (on your example, p = 53 for the double precision, but let's generalize). One has log2(2n + 1) = n + log2(1 + 2−n) ≈ n + 0.43·2−n. If the representation of n needs exactly q bits, then the ulp will be 2q−p. To get the expected result, 0.43·2−n needs to be larger than 1/2 ulp, i.e. 2−n−2 ⩾ 2q−p−1, i.e. n ⩽ p−q−1.
Since q = ⌈log2(n)⌉, the condition is n + ⌈log2(n)⌉ ⩽ p − 1.
But since ⌈log2(n)⌉ is small compared to n, the maximum value of n will be of the order of p, so that one gets an approximate condition by replacing ⌈log2(n)⌉ by ⌈log2(p)⌉, i.e. n ⩽ p − ⌈log2(p)⌉ − 1.
Here's a C program using GNU MPFR to find the first value of n that fails, for each precision p:
With the argument 64, one gets:
EDIT: So, for double precision (p = 53), if log2 is correctly rounded (or at least, has a good accuracy), the smallest failing integer is 249 + 1.