c++ string c-str

Curious behaviour of c_str() and strings when passed to class


I came across a curious behaviour (I am sure it is only curious to me and that there is a perfectly valid C++ explanation) when playing around with C strings and std::string. Typically, when I pass a string to a class's constructor, I do something like this:

#include <string>

class Foo {
public:
  Foo(const std::string& bar) : bar_(bar) { }
private:
  const std::string& bar_;
};

int main() {
  Foo("Baz");
  return 0;
}

which thus far has worked quite well, and I have (perhaps naively?) never questioned this approach.

Then recently I wanted to implement a simple data container class which, when stripped to its essential structure, looked like this:

#include <iostream>
#include <string>

class DataContainer {
public:
  DataContainer(const std::string& name, const std::string& description)
  : name_(name), description_(description) {}
  auto getName() const -> std::string { return name_; }
  auto getDescription() const -> std::string { return description_; }
private:
  const std::string& name_;
  const std::string& description_;
};

int main() {
    auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
    auto name = dataContainer.getName();
    auto description = dataContainer.getDescription();

    std::cout << "name: " << name.c_str() << std::endl;
    std::cout << "description: " << description.c_str() << std::endl;
}

The output is:

name: parameterName
description:

I use *.c_str() here because this is how I use it in my actual codebase (i.e. with Google Test and EXPECT_STREQ(s1, s2)).

When I remove *.c_str() in the main function I get the following output:

name: parameterName
description: tion

So the description string is truncated and its beginning is missing. I was able to fix this by changing the member types within the class to:

private:
  const std::string name_;
  const std::string description_;

Now I get the expected output of

name: parameterName
description: parameterDescription

Which is fine, I can use this solution, but I would like to understand what is going on here. Also, if I change my main function slightly to

int main() {
    auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
    auto name = dataContainer.getName().c_str();
    auto description = dataContainer.getDescription().c_str();

    std::cout << "name: " << name << std::endl;
    std::cout << "description: " << description << std::endl;
}

it doesn't matter how I store the string within the DataContainer class, i.e. by const ref or value. In both cases, I get

name: parameterName
description: 

along with a warning on clang:

<source>:19:17: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling-gsl]
    auto name = dataContainer.getName().c_str();

So I guess the issue arises from *.c_str() itself? However, then I don't quite understand why I can't store the two strings name and description by const ref. Could anyone shed some light on the issue?


Solution

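  • As already noted, the problems in the posted code arise from dangling references to temporary objects, either stored as class members or returned by value and accessed through .c_str(). In the first case, each string literal is converted to a temporary std::string that is destroyed at the end of the full-expression that constructs the DataContainer, so the reference members refer to dead objects and reading them is undefined behaviour. In the second case, getName() returns a temporary std::string that is destroyed at the end of the statement, which leaves the pointer obtained from .c_str() dangling.

    The first fix is to store actual std::strings as members, not (dangling) references, and then write accessor functions returning const references to those: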

    #include <iostream>
    #include <string>
    #include <utility>
    
    class DataContainer {
    public:
      DataContainer(std::string name, std::string description)
        : name_(std::move(name)), description_(std::move(description)) {}
      auto getName() const -> std::string const& { return name_; }
      auto getDescription() const -> std::string const& { return description_; }
    private:
      const std::string name_;
      const std::string description_;
    };
    
    int main() {
        auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
        
        std::cout << "name: " << dataContainer.getName().c_str() << std::endl;
        std::cout << "description: " << dataContainer.getDescription().c_str() << std::endl;
        return 0;
    }
    

    With this version the output is as expected (even when using intermediate local variables), because the accessors now refer to members that live as long as dataContainer does.


    Regarding "I use *.c_str() here as this is how I use it in my actual codebase": in that case, consider adding a couple of accessors returning exactly that:

    //...
    auto Name() const { return name_.c_str(); }
    auto Description() const { return description_.c_str(); }
    //...
    std::cout << "name: " << dataContainer.Name() << std::endl;
    std::cout << "description: " << dataContainer.Description() << std::endl;