I think I'm missing something trivial, but I've lost a lot of time on this, so the solution may be useful to others too.
I'm working with libxml2 2.9.8 (pure C, not the C++ bindings) under Linux.
I have an external (non-libxml) tree structure representing an XML file, and I'm trying to write it out as a string using libxml2. Traversing it and writing with the xmlTextWriter API is trivial and works well. Each node is a struct with simple fields, like
typedef struct _simplifiedNode {
    char *tag;
    char *content;
    struct _simplifiedNode *parent;
    struct _simplifiedNodeList *children;
} simplifiedNode;
The exception is that at a certain point I encounter a string node that may contain the string representation of an XML document. I can parse it using the xmlReadMemory API, but then I need to nest it (and not its escaped string representation) into the ongoing writer, including namespaces and attributes.
Given the parsed doc/root element, is there a simple way I'm missing to do this recursively, without inspecting every sub-element?
For example, I'm producing the following document using the xmlTextWriter API:
<Title>
TitleValue
</Title>
<Date>
2018-11-26
</Date>
<Content>
The Content node in the non-libxml tree is a leaf node with tag Content, containing a string like
char *content = "<SomeXmlComplexDocument ss:someattr=\"attrval\">Somecontent</SomeXmlComplexDocument>";
What I want to achieve is this: instead of ending up with the escaped string, something like
<Content>&lt;SomeXmlComplexDocument&gt; ... </Content>
I want to parse and validate the content with xmlReadMemory and then re-inject the document, obtaining
<Content>
<SomeXmlComplexDocument ss:someattr="attrval">Somecontent</SomeXmlComplexDocument>
</Content>
Namespaces and attributes should be preserved.
To serialize the inner XML fragments unescaped, you can simply use xmlTextWriterWriteRaw. This won't check whether the XML is well-formed, though. If you need validation, you'll have to parse the XML fragments at some point. Depending on the content model (for example, if a string can hold several sibling elements or mixed text rather than a single root element), you might have to use xmlParseBalancedChunkMemory instead of xmlReadMemory. It should also be possible to parse the result document in one go after it was written, but you'll lose information like original line numbers.