I'm fairly new to programming and was just wondering why this code:
for ( ; *p; ++p) *p = tolower(*p);
works to lower a string case in c, when p points to a string?
                In general, this code:
for ( ; *p; ++p) *p = tolower(*p);
does not “work to lower a string case in c, when p points to a string”.
It does work for pure ASCII, but char is usually a signed type, and tolower requires an argument that is either non-negative (representable as unsigned char) or the special value EOF. For any character with a negative char value the code therefore has Undefined Behavior.
To avoid that, cast the argument to unsigned char, like this:
for ( ; *p; ++p) *p = tolower( (unsigned char)*p );
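As a minimal sketch, the corrected loop can be wrapped in a small function. The name lower_in_place is hypothetical and only used for illustration; the function assumes p points to a modifiable, null-terminated string.

```c
#include <ctype.h>

/* Hypothetical helper: lowercases a null-terminated string in place. */
void lower_in_place(char *p)
{
    for ( ; *p; ++p) {
        /* The cast guarantees that tolower never sees a negative value,
           even when plain char is signed. */
        *p = (char)tolower((unsigned char)*p);
    }
}
```

Note that the argument must be a writable array such as char s[] = "Hello"; passing a string literal directly would attempt to modify read-only storage.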
Now it can work for single-byte encodings like Latin-1, provided you have set the correct locale via setlocale, e.g. with the call setlocale( LC_ALL, "" ). However, note that the very common UTF-8 encoding does not use a single byte per character. To deal with UTF-8 text you can convert it to a wide string and lowercase that.
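One possible way to do the wide-string round trip is sketched below. The function name lower_multibyte and its size parameter (the capacity of the buffer s) are assumptions made for this example. It relies on the caller having already selected a suitable locale with setlocale, and the result depends on how well the platform's towlower and wchar_t cover the characters involved.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: lowercases the multibyte string s in place, using
   the current locale. size is the capacity of the buffer holding s.
   Returns 0 on success, -1 on conversion failure or if the result
   would not fit. */
int lower_multibyte(char *s, size_t size)
{
    size_t len = strlen(s);

    /* A string of len bytes never yields more than len wide characters. */
    wchar_t *w = malloc((len + 1) * sizeof *w);
    if (w == NULL) return -1;

    size_t n = mbstowcs(w, s, len + 1);
    if (n == (size_t)-1) { free(w); return -1; }   /* invalid sequence */

    for (size_t i = 0; i < n; ++i)
        w[i] = (wchar_t)towlower((wint_t)w[i]);

    /* The lowercase form may need a different number of bytes, so convert
       into a temporary buffer and copy back only if it fits. */
    size_t cap = n * MB_CUR_MAX + 1;
    char *tmp = malloc(cap);
    int rc = -1;
    if (tmp != NULL) {
        size_t m = wcstombs(tmp, w, cap);
        if (m != (size_t)-1 && m < size) {
            memcpy(s, tmp, m + 1);   /* include the terminator */
            rc = 0;
        }
    }
    free(tmp);
    free(w);
    return rc;
}
```

A typical call sequence would be setlocale( LC_ALL, "" ) once at program start, followed by lower_multibyte(buf, sizeof buf) on a char array buf.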
Details:
*p is an expression that denotes the object that p points to, presumably a char.
As the continuation condition of the for loop, any non-zero char value denoted by *p counts as logical True, while the zero char value at the end of the string counts as logical False and ends the loop.
++p advances the pointer to point to the next char.
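The three pieces described above can be made explicit by rewriting the for loop as an equivalent while loop. This is only an illustrative sketch; the function name lower_step_by_step is hypothetical, and the unsigned char cast is included so the example stays well-defined for all char values.

```c
#include <ctype.h>

/* Hypothetical helper: same effect as the one-line for loop, spelled out. */
void lower_step_by_step(char *p)
{
    while (*p != '\0') {                          /* condition: stop at the terminating zero */
        *p = (char)tolower((unsigned char)*p);    /* body: lowercase the current char */
        ++p;                                      /* step: move to the next char */
    }
}
```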
To unpick, let's assume p is a pointer to a char and just before the for loop, it points to the first character in a string.
In C, strings are typically modelled by a set of contiguous char values with a final 0 added at the end which acts as the null terminator.
*p will evaluate to 0 once the string null-terminator is reached. Then the for loop will exit. (The second expression in the for loop acts as the termination test).
++p advances to the next character in the string.
*p = tolower(*p) sets that character to lower case.
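To see the null terminator acting as the termination test on its own, here is a small sketch that uses the same loop header but only counts the characters it visits. The function name count_until_terminator is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: counts the chars before the terminating zero,
   using the same loop condition and step as the lowercasing loop. */
size_t count_until_terminator(const char *p)
{
    size_t n = 0;
    for ( ; *p; ++p) {
        ++n;
    }
    return n;
}
```

For the string "Hi", stored as the three contiguous char values 'H', 'i' and 0, the loop body runs twice and the function returns 2, which matches what the standard strlen reports.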