c++builder binary-data ansistring

How to use AnsiString to store binary data?


I have a simple question.

I want to use AnsiString as a container for binary data. I mostly load such data from TMemoryStream or TFileStream and I save it back from AnsiString after some processing. Works fine, haven't found a problem with that.

But from what I've seen, using it like that sparks debates about using Sysutils::TBytes instead. Why? Sysutils::TBytes has far fewer useful methods for manipulating the data than, for example, AnsiString. It is clearly a half-finished container compared to AnsiString.

Is the only problem I should care about the conversion to a regular string, or is there some other reason why I should really use the less-than-adequate TBytes instead? I do not convert AnsiString to other string types, which is what is quoted as a possible problem elsewhere.

An example of how I load data:

AnsiString data;
boost::scoped_ptr<TFileStream> fs(new TFileStream(FileName, fmOpenRead | fmShareDenyWrite));
data.SetLength(fs->Size);
fs->Read(data.c_str(), fs->Size);

An example of how I save data:

// fs wants void * so I have to use data.data() instead of data.c_str() here
fs->Write(data.data(), data.Length());

So it should be safe to store binary data, correct?


Solution

  • I want to use AnsiString as a container for binary data.

    One word - DON'T! It will bite you someday. Use a more appropriate container, such as TBytes, TMemoryStream, std::vector<byte>, etc.
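As a rough illustration of the std::vector route, here is a minimal sketch in standard C++ (no VCL) that loads a file into a byte vector and writes it back. The names ByteBuffer, LoadFile and SaveFile are made up for this example, and the stream-based I/O is an assumption; the same idea would apply to TFileStream.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical alias used only in this sketch.
typedef std::vector<unsigned char> ByteBuffer;

// Read the whole file into a byte vector, with no text interpretation.
ByteBuffer LoadFile(const std::string& path)
{
    assert(!path.empty());
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    // istreambuf_iterator copies raw characters, including 0x00 and whitespace.
    return ByteBuffer(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>());
}

// Write the bytes back exactly as stored.
void SaveFile(const std::string& path, const ByteBuffer& data)
{
    assert(!path.empty());
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out)
        throw std::runtime_error("cannot create " + path);
    if (!data.empty())
        out.write(reinterpret_cast<const char*>(&data[0]),
                  static_cast<std::streamsize>(data.size()));
    if (!out)
        throw std::runtime_error("write failed for " + path);
}
```

Nothing in this container knows about codepages, so there is no conversion step that could alter the bytes on the way in or out.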

    Works fine, haven't found a problem with that.

    Consider yourself lucky. From C++Builder 2009 onwards, AnsiString is codepage-aware, and it WILL cause data conversions if you are not VERY careful when passing AnsiString around. Sooner or later you are likely to slip up, and a conversion will corrupt your binary data.
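To make the risk concrete, here is a small stand-in for what a codepage conversion does to raw bytes. It is NOT the RTL's conversion code; it is a hand-written ISO-8859-1 to UTF-8 transcoder (the function name Latin1ToUtf8 is hypothetical), used only to show that any byte of 0x80 or above changes when it is treated as text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical alias used only in this sketch.
typedef std::vector<unsigned char> ByteBuffer;

// Treat every input byte as an ISO-8859-1 character and re-encode it as UTF-8.
// This mimics the kind of transformation a text conversion applies.
ByteBuffer Latin1ToUtf8(const ByteBuffer& in)
{
    ByteBuffer out;
    out.reserve(in.size() * 2);
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b = in[i];
        if (b < 0x80) {
            out.push_back(b);  // 7-bit values pass through unchanged
        } else {
            // Values 0x80..0xFF become a two-byte UTF-8 sequence.
            out.push_back(static_cast<unsigned char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<unsigned char>(0x80 | (b & 0x3F)));
        }
    }
    assert(out.size() >= in.size());
    return out;
}
```

Feeding it the four bytes 00 7F 80 FF yields the six bytes 00 7F C2 80 C3 BF: the payload has grown and no longer matches the original, which is the kind of silent corruption a text container makes possible.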

    But from what I've seen, using it like that sparks debates about using Sysutils::TBytes instead. Why?

    Because it is an actual raw binary container meant specifically for raw bytes.

    Sysutils::TBytes has far fewer useful methods for manipulating the data than, for example, AnsiString.

    You should not be manipulating binary data as text to begin with. And since you are using things like Boost and STL, you should consider using their binary containers instead. They have more functions available.
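For instance, the string-like operations people usually miss (search, concatenate, insert, delete) can be sketched on a std::vector with a few standard algorithms. The helper names below are invented for this example and are not part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical alias used only in this sketch.
typedef std::vector<unsigned char> ByteBuffer;

// Offset of the first occurrence of needle, or haystack.size() if absent.
// Roughly what Pos() does for strings, but 0-based and byte-exact.
std::size_t FindBytes(const ByteBuffer& haystack, const ByteBuffer& needle)
{
    ByteBuffer::const_iterator it = std::search(
        haystack.begin(), haystack.end(), needle.begin(), needle.end());
    return static_cast<std::size_t>(it - haystack.begin());
}

// Concatenation: append src to the end of dst (src must not be dst itself).
void AppendBytes(ByteBuffer& dst, const ByteBuffer& src)
{
    assert(&dst != &src);
    dst.insert(dst.end(), src.begin(), src.end());
}

// Insert src into dst at the given byte offset.
void InsertBytes(ByteBuffer& dst, std::size_t offset, const ByteBuffer& src)
{
    assert(&dst != &src);
    assert(offset <= dst.size());
    dst.insert(dst.begin() + static_cast<ByteBuffer::difference_type>(offset),
               src.begin(), src.end());
}

// Remove count bytes starting at offset.
void DeleteBytes(ByteBuffer& buf, std::size_t offset, std::size_t count)
{
    assert(offset <= buf.size() && count <= buf.size() - offset);
    ByteBuffer::iterator first =
        buf.begin() + static_cast<ByteBuffer::difference_type>(offset);
    buf.erase(first, first + static_cast<ByteBuffer::difference_type>(count));
}
```

All of these treat 0x00 and bytes above 0x7F like any other value, which is the main thing a text-oriented API cannot promise.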

    That being said, XE7 does introduce some new functions for manipulating Delphi-style dynamic arrays (like TBytes) including inserts, deletes, and concatenations:

    String-Like Operations Supported on Dynamic Arrays

    It does not look like those new functions made it into C++Builder's DynamicArray class (which TBytes is a typedef of), though.

    It is clearly a half-finished container compared to AnsiString.

    AnsiString is a container of text characters. Period. Always has been, always will be. People ABUSE it by taking advantage of the fact that sizeof(char)==sizeof(byte). That worked up to a point, but it has become dangerous in recent years to continue abusing it.

    Is the only problem I should care about the conversion to a regular string, or is there some other reason why I should really use the less-than-adequate TBytes instead?

    That, and the fact that Embarcadero has been phasing out AnsiString since 2009. 8-bit strings are disabled in the mobile compilers, and it is only a matter of time before the desktop compilers follow suit.

    Why are you wanting to manipulate raw bytes as strings to begin with? Can you provide an example of something you can do with AnsiString that you cannot do with TBytes?

    So it should be safe to store binary data, correct?

    In your specific example, yes (and yes, you can use c_str() instead of data() when calling fs->Write()).