GCC compiler: strange behavior when doing float operations, float value saturating at 65536 where float is 4 bytes

Question

86 views. Asked by Sam on 28 October 2021 at 07:16

I tried to compute a float multiplication and observed that the value saturated at 65536 and stopped updating.

The issue occurs only with the code below.

[Screenshot: output of the code]

I tried this with an online GCC compiler and the issue was still the same.

Does this have anything to do with float precision? Is the compiler optimizing away my float precision during the operation?

Are there any compiler flags that I can add to overcome this issue?

Can anyone please guide me on how to solve this?

Attaching the code for reference:

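#include <stdio.h>

int main(void)
{
    float dummy1, dummy2;
    unsigned int i = 0;

    printf("Hello World\n");
    /* sizeof yields a size_t, so the matching format is %zu */
    printf("size of float = %zu\n", sizeof(dummy1));

    dummy2 = 0.0;
    dummy1 = 65535.5;

    /* the product is computed in double, then converted to float */
    dummy2 = 60.00 * 0.00005;

    for (i = 0; i < 300; i++)
    {
        /* the sum is rounded to float on every iteration */
        dummy1 = dummy1 + dummy2;
        printf("dummy1 = %f   %f\n", dummy1, dummy2);
    }

    return 0;
}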

There is 1 answer

**Eric Postpischil** · Answer 1 · 2021-10-28T11:44:31+00:00

(This answer presumes the IEEE-754 single and double precision binary formats are used for float and double.)

60.00 * 0.00005 is computed with double arithmetic and produces 0.003000000000000000062450045135165055398829281330108642578125. When this is stored in dummy2, it is converted to 0.0030000000260770320892333984375.

In the loop, dummy1 eventually reaches the value 65535.99609375. Then, when dummy1 and dummy2 are added, the result computed with real-number arithmetic would be 65535.9990937500260770320892333984375. This value is not representable in the float format, so it is rounded to the nearest value representable in the float format, and that is the result that the + operator produces.

The nearest representable values in the float format are 65535.99609375 and 65536. Since 65536 is closer to 65535.9990937500260770320892333984375, it is the result.

In the next iteration, 65536 and 0.0030000000260770320892333984375 are added. The real-arithmetic result would be 65536.0030000000260770320892333984375. This is also not representable in float. The nearest representable values are 65536 and 65536.0078125. Again 65536 is closer, so it is the computed result.

From then on, the loop always produces 65536 as a result.

You can get better results either by using double arithmetic or by computing dummy1 afresh in each iteration instead of accumulating rounding errors from iteration to iteration:

for (i = 0; i < 300; ++i)
{
    dummy1 = 65535.5 + i * 60. * .00005;
    printf("%.99g\n", dummy1);
}

Note that because dummy1 is a float, it does not have the precision required to distinguish some successive values of the sequence. For example, output of the above includes:

65535.9921875
65535.99609375
65535.99609375
65536
65536.0078125
65536.0078125
65536.0078125
65536.015625
65536.015625
65536.015625
