encoding xerces-c

Does xerces-c have a default encoding and can it be modified?


Does xerces-c decode all characters to a default encoding? And if so, can this default encoding be user-specified?

While parsing a UTF-8 encoded XML document, the chars argument of the callback

DefaultHandler::characters( const XMLCh *const chars, const XMLSize_t length )

is no longer in UTF-8. For example, the pound symbol, 0xC2 0xA3 in UTF-8, appears as 0x00 0xA3. This leads me to conclude that xerces-c is decoding the string, whereas I would like it not to. I would like to handle the decoding myself.


Solution

  • Found it. The encoding can be set on the input source with InputSource::setEncoding(const XMLCh* const encodingStr).
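
As far as I can tell, setEncoding controls how the parser interprets the input bytes, while characters() still hands over XMLCh values. The 0x00 0xA3 seen for the pound symbol is what a single 16-bit UTF-16 code unit (U+00A3) looks like in memory. If UTF-8 bytes are needed inside the callback, one option is to convert back. Below is a minimal sketch under the assumption that XMLCh is a 16-bit UTF-16 code unit; char16_t stands in for XMLCh, and utf16ToUtf8 is a hypothetical helper, not part of xerces-c.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a xerces-c function): converts UTF-16 code units
// to a UTF-8 byte string. char16_t stands in for XMLCh, on the assumption
// that XMLCh is a 16-bit UTF-16 code unit in the build being used.
std::string utf16ToUtf8(const char16_t* chars, std::size_t length)
{
    std::string out;
    for (std::size_t i = 0; i < length; ++i) {
        char32_t cp = chars[i];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < length
            && chars[i + 1] >= 0xDC00 && chars[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (chars[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Unpaired surrogate: substitute U+FFFD.
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            // 1 byte: plain ASCII.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Inside characters() this would be called with the chars and length arguments, provided the XMLCh pointer is compatible with (or cast to) const char16_t* in your build. xerces-c also ships its own transcoding facilities (for example the TranscodeToStr class), which are probably the better choice in real code; the hand-written version is only meant to show the relationship between the two byte patterns.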