Constant function pointer optimization

I am trying to implement an abstract interface in C using function pointers inside a struct, something like the following:

typedef int (*fn_t)(int);
typedef struct
{
    int x;
    const fn_t fnp;
}struct_t;

__attribute__((optimize("O0"))) int square(int num) 
{
    return num * num;
}

static struct_t test = {.fnp = square};

int main(void)
{
    test.x = 1;

    int fnp_ret = test.fnp(3);

    return (fnp_ret);
}

Building this on godbolt with ARM GCC 13.2.0 (unknown-eabi) at -O3 produces the following output:

square:
        str     fp, [sp, #-4]!
        add     fp, sp, #0
        sub     sp, sp, #12
        str     r0, [fp, #-8]
        ldr     r3, [fp, #-8]
        mov     r2, r3
        mul     r2, r3, r2
        mov     r3, r2
        mov     r0, r3
        add     sp, fp, #0
        ldr     fp, [sp], #4
        bx      lr
main:
        mov     r1, #1
        ldr     r3, .L5
        mov     r0, #3
        ldr     r2, [r3, #4]
        str     r1, [r3]
        bx      r2
.L5:
        .word   .LANCHOR0

Here one can see that main() first loads the function pointer from the struct and then branches through it. I find this strange: the function pointer is const, so I expected the compiler to figure out that it always points to square and to emit a direct call to square instead. Apparently that is not the case here.

While experimenting, I noticed that if the statement test.x = 1; is commented out, the assembly does what I expected and calls square directly:

square:
        str     fp, [sp, #-4]!
        add     fp, sp, #0
        sub     sp, sp, #12
        str     r0, [fp, #-8]
        ldr     r3, [fp, #-8]
        mov     r2, r3
        mul     r2, r3, r2
        mov     r3, r2
        mov     r0, r3
        add     sp, fp, #0
        ldr     fp, [sp], #4
        bx      lr
main:
        mov     r0, #3
        b       square

What am I missing?
Is there any way to implement this reliably without paying the performance hit described above?

There is 1 answer

gulpr (best answer)
  1. optimize("O0") is not the right attribute here. To keep square from being inlined, you want noinline.
  2. It is a well-known gcc optimizer flaw. If you write to any member of the struct, it treats the whole struct as non-const.
__attribute__((noinline)) int square(int num) 
{
    return num * num;
}

"What am I missing? Is there any way to implement this reliably without paying the performance hit described above?"

You can't do anything about it, I am afraid. Most likely it will never be fixed. If it matters to you, you can use clang: https://godbolt.org/z/T4bznYE4h
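
If switching compilers is not an option, one restructuring that might sidestep the problem with gcc is to keep the function pointers in their own static const object, separate from the mutable data, so that the object holding the pointer is never written to. This is only a sketch under that assumption, not a guaranteed fix. The names ops_t, state_t, test_ops, test_state and call_through_interface are made up for illustration, and whether the call really becomes a direct branch should be checked in the compiler output for your gcc version and target.

```c
typedef int (*fn_t)(int);

/* Read-only part of the interface: function pointers only (hypothetical type). */
typedef struct
{
    fn_t fnp;
} ops_t;

/* Mutable part: the data that gets written at run time (hypothetical type). */
typedef struct
{
    int x;
} state_t;

/* noinline keeps the call visible in the assembly without de-optimizing the body. */
__attribute__((noinline)) int square(int num)
{
    return num * num;
}

/* The whole object is const, so no code in this file ever writes to it. */
static const ops_t test_ops = {.fnp = square};
static state_t test_state;

/* Assumed stand-in for the body of main() in the question. */
int call_through_interface(void)
{
    test_state.x = 1;       /* the write now touches a different object */
    return test_ops.fnp(3); /* read from a const object with a constant initializer */
}
```

Calling call_through_interface() from main returns 9, same as the original. The trade-off is that data and operations no longer live in a single struct, so code that wants to pass both around has to carry two objects (or a pointer to each).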