Getting char * or const char * data from a string breaks for 16 character strings or longer

860 views Asked by At

I have a function string_to_char() which attempts to give me a form of a string which I can pass into a library I am using, which wants char * (but I think works with const char *, so I've been trying both).

The code I wrote to test my implementation of string_to_char() goes as such:

#include <iostream>

const std::string endl = "\n";

char * string_to_char(std::string str)
{
    return (char*) str.c_str();
}

int main()
{
    std::string test1 = "Some test strin";
    std::string test2 = "Some test string";

    char * result1 = string_to_char(test1);
    char * result2 = string_to_char(test2);

    std::cout << "part1" << endl;
    std::cout << result1 << endl;
    std::cout << string_to_char(test1) << endl;

    std::cout << "part2" << endl;
    std::cout << result2 << endl;
    std::cout << string_to_char(test2) << endl;

    std::cout << "done" << endl;

    return 0;
}

This is the output I get:

part1
Some test strin
Some test strin
part2

Some test string
done

So for some reason, string_to_char() only properly works with strings with 15 characters or shorter, and outputs from the function straight to std::cout, but can't seem to store it to a variable for 16 characters or longer.

I am relatively new to C++ so some of the code below may seem a bit strange to more experienced programmers, but here is the code that I have tried in place of return (char*) str.c_str();

#include <vector>
#include <string.h>

char * string_to_char(std::string str)
{
    return (char*) str.c_str();
    
    return const_cast<char*>(str.c_str());
    
    std::vector<char> vec(str.begin(), str.end());
    char * chr;
    vec.push_back('\0');
    chr = (char*) &vec[0];
    //chr = & (*vec.begin());
    return chr; //all outputs from both are empty with this both versions of chr
    
    return &str[0]; //this makes the output from the 15 character string also be empty when put in a
    //variable, but the function going directly to std::cout is fine

    return strcpy((char *) malloc(str.length() + 1), str.c_str()); //this one works with everything, but 
    //it looks like it leaks memory without further changes

    std::vector<char> copied(str.c_str(), str.c_str() + str.size() + 1);
    return copied.data(); //returns "random" characters/undefined behaviour for both outputs in test1 and is empty for both
    //outputs in test2
}

Using const instead, and changing char * result1 = string_to_char(test1); to const char * result1 = string_to_char(test1); (as with result2), to see if that works with these other solutions:

#include <vector>
#include <string.h>

const char * string_to_char(std::string str)
{
    return (char*) str.c_str();

    return str.c_str();

    return (const char*) str.c_str();

    return str.data();

    return const_cast<char*>(str.c_str()); 

    std::vector<char> vec(str.begin(), str.end());
    char * chr;
    vec.push_back('\0');
    chr = (char*) &vec[0];
    //chr = & (*vec.begin());
    return chr; //completely breaks both

    return &str[0]; //both appear empty when given to a variable, but works fine when taken straight to std::cout
 
    return strcpy((char *) malloc(str.length() + 1), str.c_str()); //memory leak, as when not using const

    std::vector<char> copied(str.c_str(), str.c_str() + str.size() + 1);
    return copied.data(); //same as when not using const
}

I got a lot of the given methods from:

std::string to char*

string.c_str() is const?

How to convert a std::string to const char* or char*?

Converting from std::string to char * in C++

With a bit of reading around the topic for strings and vectors at https://www.cplusplus.com/reference/ and https://en.cppreference.com/w/

3

There are 3 answers

3
Remy Lebeau On

string_to_char() is taking its str parameter by value, so a copy of the caller's input string is made. When the function exits, that copied std::string will be destroyed. Thus, the returned char* pointer will be left dangling, pointing to freed memory, and any use of that pointer to access the data will be undefined behavior.

Pass in the str parameter by reference instead:

char* string_to_char(std::string &str)
{
    return const_cast<char*>(str.c_str());
}

Or, in C++17 and later, you can use this instead:

char* string_to_char(std::string &str)
{
    return str.data();
}

Which then begs the question of why you need string_to_char() at all and don't just use data() directly, unless you are not using a modern version of C++.

4
463035818_is_not_an_ai On

The pointer returned from c_str() is only valid as long as the string is alive. You get expected output when you pass a reference:

auto string_to_char(std::string& str)
{
    return str.c_str();
}

Because now the pointer returned is into the buffer of the string of the caller. In your code the caller gets a pointer to the functions local string (because you pass a copy).

Though, instead of calling the function you can directly call c_str(). That also mitigates the problem of holding on to the pointer after the string is gone to some extend.

1
alter_igel On

You've overthought this. There is no need two write this function yourself. std::string::data already exists and returns a pointer to the string's null-terminated internal buffer. Assuming you're using C++17 or later, this pointer will be const char* if the std::string object is const-qualified (i.e. read-only), and otherwise will be a modifiable char*.

std::string test1 = "string";
const std::string test2 = "const string";

char* result1 = test1.data();
const char* result2 = test2.data();

This pointer is valid for as long as the std::string object that it came from is alive and is not modified (except for modifying individual elements).

Also note that casting pointers and casting away const-ness is a very easy way to cause Undefined Behaviour without knowing it. You should avoid C-style casts in general (e.g. (char*)str.c_str()) because they're very unsafe. See this Q/A on the proper use of C++ casts for more information.

Live Demo

Documentation