Does xerces-c decode all characters to a default encoding? And if so, can this default encoding be user-specified?
While parsing a UTF-8 encoded XML document, the chars argument of the callback DefaultHandler::characters( const XMLCh *const chars, const XMLSize_t length ) is no longer in UTF-8. For example, the pound symbol, 0xC2 0xA3 in UTF-8, appears as 0x00 0xA3. This leads me to conclude that xerces-c is decoding the string, whereas I'd like it not to. I would like to handle the decoding myself.
Found it. The encoding can be set with InputSource::setEncoding(const XMLCh* const encodingStr).
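For what it's worth, the bytes in the question look like exactly what you'd get from a UTF-8 to UTF-16 conversion: XMLCh holds UTF-16 code units, and the pound sign U+00A3 is the single code unit 0x00A3. Below is a minimal standalone sketch of that conversion, using only the standard library. The helper name utf8ToUtf16 is made up for illustration and is not part of Xerces-C; it skips some validation (overlong forms, encoded surrogates), so treat it as a demonstration rather than a production decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, NOT a Xerces-C function: decodes UTF-8 bytes into
// UTF-16 code units. Rejects bad lead bytes, truncated sequences and bad
// continuation bytes; does not check for overlong encodings.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Calling utf8ToUtf16("\xC2\xA3") gives a one-element string whose only code unit is 0x00A3, which matches the 0x00 0xA3 seen in the characters() callback when the 16-bit unit is viewed as two bytes, high byte first.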