Has anyone dealt with using std::string functions for MBCS? For example in C I could do this:
p = _mbsrchr(path, '\\');
but in C++ I'm doing this:
found = path.find_last_of('\\');
If a trail byte has the same value as a backslash, would find_last_of stop at that trail byte? Same question for std::wstring.
If I need to replace every occurrence of one character with another, say all forward slashes with backslashes, what would be the right way to do that? Would I have to check each character for a lead surrogate and then skip the trail? Right now I'm doing this for each wchar_t:
if( *i == L'/' )
*i = L'\\';
Thanks
Edit: As David correctly points out, there is more to deal with when working with multibyte codepages. Microsoft says to use _mbclen when working with byte indices and MBCS. It does not appear that I can use find_last_of reliably with the ANSI codepages.
You don't need to do anything special about surrogate pairs. A 16-bit code unit that is one half of a surrogate pair cannot also be a non-surrogate code unit: surrogates occupy the range 0xD800 to 0xDFFF, and neither L'/' (0x002F) nor L'\\' (0x005C) falls in it.
So,
if( *i == L'/' )
*i = L'\\';
is perfectly correct.
Equally, you can use find_last_of with wstring.
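As a minimal sketch of both points, assuming the path is held in a std::wstring (UTF-16 on Windows); the helper names toBackslashes and lastSeparator are made up for illustration:

```cpp
#include <algorithm>
#include <string>

// Replace every forward slash with a backslash, in place.
// Safe for UTF-16: '/' is U+002F, and surrogate code units lie in
// 0xD800-0xDFFF, so no half of a surrogate pair can compare equal to it.
void toBackslashes(std::wstring& path)
{
    std::replace(path.begin(), path.end(), L'/', L'\\');
}

// Index of the last backslash, or std::wstring::npos if there is none.
std::wstring::size_type lastSeparator(const std::wstring& path)
{
    return path.find_last_of(L'\\');
}
```

std::replace does the same thing as the hand-written loop in the question, so either form should be fine.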
It's more complicated for multi-byte ANSI codepages. There you do need to deal with lead and trail byte issues. My recommendation is to normalise to a more reasonable encoding, such as UTF-16 in a std::wstring, if you really have to deal with multi-byte ANSI data.
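To illustrate the lead/trail byte problem, here is a rough portable sketch. It assumes the bytes are Shift-JIS (code page 932), where a trail byte can be 0x5C, the same value as a backslash. The helpers isSjisLeadByte and findLastBackslashSjis are hypothetical names, and the lead-byte ranges are hard-coded for that one codepage only. On Windows, _mbsrchr or IsDBCSLeadByteEx would do this job against the real codepage.

```cpp
#include <cstddef>
#include <string>

// Assumption: Shift-JIS lead bytes are 0x81-0x9F and 0xE0-0xFC.
bool isSjisLeadByte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

// Scan forward one character at a time so that a 0x5C trail byte is
// never mistaken for a path separator. Returns npos if none is found.
std::string::size_type findLastBackslashSjis(const std::string& path)
{
    std::string::size_type last = std::string::npos;
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(path[i]);
        if (isSjisLeadByte(c)) {
            ++i;            // skip the trail byte of a two-byte character
        } else if (c == '\\') {
            last = i;       // a genuine single-byte backslash
        }
    }
    return last;
}
```

For the five bytes 'C', ':', '\\', 0x95, 0x5C (the last two encode a single Shift-JIS character), path.find_last_of('\\') returns 4, which is the trail byte, while findLastBackslashSjis returns 2. The forward scan is needed because, in general, you cannot tell whether a byte is a trail byte without knowing where the preceding character started.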