C printf specifier with variadic arguments. At which point is the undefined behavior problematic?


https://godbolt.org/z/qZVO3a

This is a minimal reproduction of the warnings I see. Obviously UB can be bad, but I think that while many of the situations below are okay, there are some really nasty uses, and I need to determine which ones require corrective action.

#include <stdarg.h>
#include <stdio.h>
#include <limits.h>

typedef struct _thing {

    char  first[4];
    char  second[10];
    char  last[111];
}THING;


void custom_printf(char* _format, ...) __attribute__((format(printf, 1,2)));
void custom_printf(char* _format, ...) 
{
    // get buffer from some source
    char buffer[1024];
    va_list ap;
    va_start(ap, _format);
    vsnprintf(buffer, 1024, _format, ap);
    va_end(ap);
    // use buffer for some purpose

}

int main(){

    custom_printf("HI THERE%d");
    custom_printf("HI THERE", 1);
    custom_printf("val: %d", (void*)0);
    custom_printf("val: %p", 0);
    custom_printf("val: %lld", 1);
    custom_printf("val: %s", (THING){"A", "AA", "CCCC"});
    custom_printf("val: %0.30s","HI");
    custom_printf("val: %d",LLONG_MAX);
}

The warnings I see include:


<source>: In function 'main':

<source>:26:5: warning: format '%d' expects a matching 'int' argument [-Wformat]

<source>:27:5: warning: too many arguments for format [-Wformat-extra-args]

<source>:28:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'void *' [-Wformat]

<source>:29:5: warning: format '%p' expects argument of type 'void *', but argument 2 has type 'int' [-Wformat]

<source>:30:5: warning: format '%lld' expects argument of type 'long long int', but argument 2 has type 'int' [-Wformat]

<source>:31:5: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'THING' [-Wformat]

<source>:32:5: warning: '0' flag used with '%s' gnu_printf format [-Wformat]

<source>:33:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long long int' [-Wformat]

<source>:34:1: warning: control reaches end of non-void function [-Wreturn-type]

Compiler returned: 0

It's my understanding that the above has many flavors of UB. After looking around, I've seen that I should just fix all of them. I do want to fix them all eventually, but for now my curiosity is making me wonder which is the worst scenario. I'd assume it is cases like the first, where I'm not passing in enough items.

It's my understanding that in the above I have:

  1. Popping off stack that doesn't exist
  2. Not popping enough off the stack
  3. Padding a string with leading zeros
  4. Casting integer to pointer
  5. Casting a struct that can be cast to

Out of the above, I'm fairly certain that anything that pops values off the stack that don't exist will lead to the worst scenario. But I'm also wondering what the other severe cases are.


There are 1 answers

chux - Reinstate Monica On

At which point is the undefined behavior problematic?

All UB is problematic.

Identifying a particular compiler version's UB effects has some merit in problem solving. Yet one should never rely on that UB effect to persist.

My answer is based on C in general, not on gcc 4.7.


Consider that objects are not necessarily passed using the same mechanism across types. A related real-world example: float/double are passed on an FP stack and other types via the usual stack. printf("%llx\n", 1.234); can fail badly: even though the size passed is 8 and 8 is expected, the two values live in different places. A similar difference could occur between pointer types and integers (although that sounds like a unicorn platform).
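If the goal really is to see the bits of a double in hex, the value has to reach the formatter as an integer type that matches the specifier. The following is a minimal sketch, not taken from the question: format_double_bits is a hypothetical helper name, and it assumes a 64-bit double.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption of this sketch: double and uint64_t have the same size. */
_Static_assert(sizeof(uint64_t) == sizeof(double), "assumes a 64-bit double");

/* Hypothetical helper: writes the bit pattern of a double as hex digits.
   Returns what snprintf returns (the number of characters needed). */
static int format_double_bits(char *out, size_t size, double value)
{
    uint64_t bits;

    assert(out != NULL && size > 0);

    /* Copy the object representation into an integer of the same size
       instead of passing the double itself to an integer specifier. */
    memcpy(&bits, &value, sizeof bits);

    /* PRIx64 expands to the conversion that matches uint64_t. */
    return snprintf(out, size, "%" PRIx64, bits);
}
```

Called as format_double_bits(buf, sizeof buf, 1.0), this yields 3ff0000000000000 on a platform whose double is IEEE 754 binary64. The argument type and the specifier agree, so the call is well defined regardless of how doubles and integers are passed.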


Leaving UB in code is inefficient in development.
Even if one did find some UB that worked great in a select case, the next compilation or compiler version may render different results. By fixing it, you save the time spent trying to explain how "this UB is OK, I know I tested it" during a code review. You also save the time needed to find a way to quiet the warning for this one "good" UB. The programming team that has to maintain your UB code will mutter evil things about the prior coder.


UB. Missing matching argument.

custom_printf("HI THERE%d");
<source>:26:5: warning: format '%d' expects a matching 'int' argument [-Wformat]

Not UB. Extra args are OK, yet this is likely a coding misstep, hence the warning. @melpomene

custom_printf("HI THERE", 1);
<source>:27:5: warning: too many arguments for format [-Wformat-extra-args]

UB. int and void * may differ in size, legal values and function passing mechanisms.

custom_printf("val: %d", (void*)0);
<source>:28:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'void *' [-Wformat]
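When a pointer value genuinely needs to be printed as a number, one well-defined route is to convert it explicitly to uintptr_t and use the matching macro from <inttypes.h>. The sketch below is an illustration only: both helper names are hypothetical, and it assumes the platform provides uintptr_t (the type is optional in the standard, though common).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a pointer as a decimal integer.
   The explicit cast makes the argument type match PRIuPTR. */
static int format_pointer_as_integer(char *out, size_t size, void *p)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "%" PRIuPTR, (uintptr_t)p);
}

/* Hypothetical helper: reads the integer back and converts it to a pointer.
   Returns NULL if the text does not contain a number. */
static void *parse_pointer_from_integer(const char *text)
{
    uintptr_t value = 0;

    if (sscanf(text, "%" SCNuPTR, &value) != 1) {
        return NULL;
    }
    return (void *)value;
}
```

A void * converted to uintptr_t and back compares equal to the original pointer, so the round trip above is reliable. To print a pointer as a pointer, "%p" with a (void *) argument is the direct choice.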

UB. same as line 28

custom_printf("val: %p", 0);
<source>:29:5: warning: format '%p' expects argument of type 'void *', but argument 2 has type 'int' [-Wformat]

UB. int and long long may differ in size and function passing mechanisms.

custom_printf("val: %lld", 1);
<source>:30:5: warning: format '%lld' expects argument of type 'long long int', but argument 2 has type 'int' [-Wformat]

UB. The types may differ in size, legal values and function passing mechanisms.

custom_printf("val: %s", (THING){"A", "AA", "CCCC"});
<source>:31:5: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'THING' [-Wformat]

UB. %0.30s is not a valid standard specifier (the 0 flag is not defined for %s), so anything may happen. It is well behaved only on select systems that define behavior for this non-standard specifier.

custom_printf("val: %0.30s","HI");
<source>:32:5: warning: '0' flag used with '%s' gnu_printf format [-Wformat]

UB like line 30

custom_printf("val: %d",LLONG_MAX);
<source>:33:5: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long long int' [-Wformat]

Not UB with main(). For functions in general, it is a UB problem only if the calling code uses the return value. Yet main() is special: if control reaches the end of main() without a return, the code acts as if a return 0; were at the end.

<source>:34:1: warning: control reaches end of non-void function [-Wreturn-type]
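Putting the cases together, here is one possible sketch of the corrected calls, with each specifier paired with an argument of exactly the expected type. It is not the asker's code: format_into is a hypothetical variant of custom_printf that writes into a caller-supplied buffer so the results can be inspected, and LINE_COUNT, LINE_SIZE and corrected_calls are names made up for the illustration. The format attribute is the same GCC/Clang extension the question uses.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef struct thing {
    char first[4];
    char second[10];
    char last[111];
} THING;

enum { LINE_COUNT = 8, LINE_SIZE = 64 };

/* Hypothetical variant of custom_printf that formats into a caller buffer. */
static int format_into(char *out, size_t size, const char *format, ...)
    __attribute__((format(printf, 3, 4)));

static int format_into(char *out, size_t size, const char *format, ...)
{
    va_list ap;
    int written;

    assert(out != NULL && size > 0);
    va_start(ap, format);
    written = vsnprintf(out, size, format, ap);
    va_end(ap);
    return written;
}

/* One corrected call per line of the question's main(), in the same order. */
static void corrected_calls(char lines[LINE_COUNT][LINE_SIZE])
{
    THING thing = {"A", "AA", "CCCC"};

    memset(lines, 0, (size_t)LINE_COUNT * LINE_SIZE);

    format_into(lines[0], LINE_SIZE, "HI THERE%d", 1);         /* supply the missing int */
    format_into(lines[1], LINE_SIZE, "HI THERE");              /* drop the unused argument */
    format_into(lines[2], LINE_SIZE, "val: %d", 0);            /* int for %d */
    format_into(lines[3], LINE_SIZE, "val: %p", (void *)0);    /* void * for %p */
    format_into(lines[4], LINE_SIZE, "val: %lld", 1LL);        /* long long for %lld */
    format_into(lines[5], LINE_SIZE, "val: %s", thing.first);  /* char array member, not the struct */
    format_into(lines[6], LINE_SIZE, "val: %.30s", "HI");      /* precision only, no 0 flag */
    format_into(lines[7], LINE_SIZE, "val: %lld", LLONG_MAX);  /* %lld for long long */
}
```

With these changes the warnings about the format strings should go away. The text produced for %p is implementation-defined, so it is best not to depend on its exact form.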