
Strip whitespace and newlines from XML in Java


Using Java, I would like to take a document in the following format:

<tag1>
 <tag2>
    <![CDATA[  Some data ]]>
 </tag2>
</tag1>

and convert it to:

<tag1><tag2><![CDATA[  Some data ]]></tag2></tag1>

I tried the following, but it isn't giving me the result I am expecting:

DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
dbfac.setIgnoringElementContentWhitespace(true);
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.parse(new FileInputStream("/tmp/test.xml"));

Writer out = new StringWriter();
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.INDENT, "no");
tf.transform(new DOMSource(doc), new StreamResult(out));
System.out.println(out.toString());

Solution

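  • Working solution, based on the suggestion @Luiggi Mendoza made in the comments on the question: trim each line and concatenate the results. Because it treats the XML as plain text, it also trims whitespace at line boundaries inside multi-line text or CDATA content.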

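    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.StringReader;

    // Trims every line of the input and joins the lines without separators.
    public static String trim(String input) {
        StringBuilder result = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.append(line.trim());
            }
            return result.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }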
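
  • A possible DOM-based alternative, offered only as a sketch. According to its Javadoc, setIgnoringElementContentWhitespace requires the parser to be in validating mode, so without a DTD it has no effect, which is probably why the attempt in the question leaves the whitespace in place. The sketch below instead removes whitespace-only text nodes by hand and leaves CDATA sections alone. The class name XmlWhitespaceStripper and the helper removeBlankTextNodes are made up for illustration, and the exact serialized form can vary between Transformer implementations.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of any library.
final class XmlWhitespaceStripper {

    // Recursively removes text nodes that contain only whitespace.
    // CDATA sections have node type CDATA_SECTION_NODE, so they are never touched.
    static void removeBlankTextNodes(Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                removeBlankTextNodes(child);
            }
            child = next;
        }
    }

    // Parses the XML, drops whitespace-only text nodes, and serializes it again.
    static String strip(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            removeBlankTextNodes(doc.getDocumentElement());

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.INDENT, "no");
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code may want finer handling.
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tag1>\n <tag2>\n    <![CDATA[  Some data ]]>\n </tag2>\n</tag1>";
        System.out.println(strip(xml));
    }
}
```

    Unlike the line-trimming approach, this keeps whitespace inside text and CDATA content intact, at the cost of parsing the whole document into memory.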