node.js xml xml2js

Error in XML parsing with xml2js library when handling nested elements


I'm currently working on a Node.js script that involves parsing XML files using the xml2js library. The script reads an XML file, extracts specific elements, and builds a new XML output using xml2js.Builder().

The script works fine for simple XML structures. However, for XML files with nested elements, such as the <customs> element, it throws an "Invalid character in name" error.

const fs = require('fs');
const xml2js = require('xml2js');
const XmlStream = require('xml-stream');

const stream = fs.createReadStream('./test.xml');
const xmlStream = new XmlStream(stream);
const builder = new xml2js.Builder();

const MAX_ITEMS = 1;
let lists = [];
let listCount = 0;

xmlStream.on('endElement: list', function (item) {
  if (listCount < MAX_ITEMS) {
    lists.push(item);
    listCount++;
  } else {
    stream.destroy();
  }
});

stream.on('close', function () {
  const outputXml = builder.buildObject({ lists: { list: lists } });
  console.log(outputXml);
});

XML file:

<?xml version="1.0" encoding="UTF-8"?>
<lists>
  <list list-id="0001">
    <first-name>first-name-1</first-name>
    <last-name>second-name-1</last-name>
    <customs>
      <custom attribute-id="gender">female</custom>
    </customs>
  </list>
  <list list-id="0002">
    <first-name>first-name-2</first-name>
    <last-name>second-name-2</last-name>
    <customs>
      <custom attribute-id="gender">male</custom>
    </customs>
  </list>
</lists>

Why does this error occur when handling nested elements in xml2js, and how can I fix it?


Solution

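  • The default text property in xml-stream is $text, while the xml2js builder uses _ by default. The builder therefore treats $text as a child element name, which is not a valid XML name, and that is why it complains. So the error is triggered by elements that carry attributes together with text (such as <custom attribute-id="gender">), not by nested elements as such.

    Try setting the builder's charkey option to $text: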

    const builder = new xml2js.Builder({charkey: '$text'});
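
If you would rather keep the builder's defaults, an alternative is to rename the text key in the parsed objects before building. The sketch below is only an illustration: renameTextKey is a hypothetical helper (it is not part of xml2js or xml-stream), and the item object is an assumed approximation of what xml-stream emits for one <list> element of the XML in the question.

```javascript
// Hypothetical helper (not part of xml2js or xml-stream): returns a deep copy
// of `node` in which every `fromKey` property is renamed to `toKey`.
function renameTextKey(node, fromKey = '$text', toKey = '_') {
  if (Array.isArray(node)) {
    return node.map((child) => renameTextKey(child, fromKey, toKey));
  }
  if (node === null || typeof node !== 'object') {
    return node; // strings, numbers, etc. are returned unchanged
  }
  const result = {};
  for (const [key, value] of Object.entries(node)) {
    const newKey = key === fromKey ? toKey : key;
    result[newKey] = renameTextKey(value, fromKey, toKey);
  }
  return result;
}

// Assumed shape of one <list> item as emitted by xml-stream:
// attributes live under `$`, text next to attributes lives under `$text`.
const item = {
  $: { 'list-id': '0001' },
  'first-name': 'first-name-1',
  'last-name': 'second-name-1',
  customs: {
    custom: { $: { 'attribute-id': 'gender' }, $text: 'female' },
  },
};

const converted = renameTextKey(item);
console.log(JSON.stringify(converted.customs.custom));
// prints {"$":{"attribute-id":"gender"},"_":"female"}
```

With this approach the original script could keep new xml2js.Builder() unchanged and pass lists.map((item) => renameTextKey(item)) to buildObject. Setting charkey is shorter; the renaming variant is mainly useful when the same objects also have to go through other code that expects the xml2js default key.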