Given the following XML instance:
<entities>
<person><name>Jack</name></person>
<person><name></name></person>
<person></person>
</entities>
I am using the following code to: (a) iterate over the persons and (b) obtain the name of each person:
XPathExpression expr = xpath.compile("/entities/person");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i);
    String innerXPath = "name/text()";
    String name = xpath.compile(innerXPath).evaluate(node);
    System.out.printf("%2d -> name is %s.\n", i, name);
}
The code above is unable to distinguish between the 2nd person case (empty string for name) and the 3rd person case (no name element at all) and simply prints:
0 -> name is Jack.
1 -> name is .
2 -> name is .
Is there a way to distinguish between these two cases using a different innerXPath expression? In this SO question it seems that the XPath way would be to return an empty list, but I've tried that too:
String innerXPath = "if (name) then name/text() else ()";
... and the output is still the same.
So, is there a way to distinguish between these two cases with a different innerXPath expression? I have Saxon HE on my classpath so I can use XPath 2.0 features as well.
So the best I could do based on the accepted answer is the following:
XPathExpression expr = xpath.compile("/entities/person");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i);
    String innerXPath = "name";
    NodeList names = (NodeList) xpath.compile(innerXPath).evaluate(node, XPathConstants.NODESET);
    String nameValue = null;
    if (names.getLength() > 1) throw new RuntimeException("impossible");
    if (names.getLength() == 1)
        nameValue = names.item(0).getFirstChild() == null ? "" : names.item(0).getFirstChild().getNodeValue();
    System.out.printf("%2d -> name is [%s]\n", i, nameValue);
}
The above code prints:
0 -> name is [Jack]
1 -> name is []
2 -> name is [null]
In my view this is not very satisfactory, as the logic is spread across both XPath and Java code, which limits the usefulness of XPath as a notation that is agnostic of host language and API. My particular use case was to just keep a collection of XPaths in a property file and evaluate them at runtime in order to obtain the information I need without any ad-hoc extra handling. Apparently that's not possible.
The JAXP API, being based on XPath 1.0, is pretty limited here. My instinct would be to return the name element (as a NodeList). So the XPath expression required is simply "name" (lowercase, since XPath is case-sensitive and the element in your instance is name). Then cases 1 and 2 will return a nodelist of length 1, while case 3 will return a nodelist of length 0. Cases 1 and 2 can then easily be distinguished within the application by getting the value of the node and testing whether it is zero-length.
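A minimal sketch of that idea, assuming the JDK's built-in JAXP XPath implementation and a DOM document: it asks for the name element as a single node (XPathConstants.NODE, which yields null when nothing matches) instead of a NodeList, and reads the value with getTextContent(). The class name NameLookup and the method names(...) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of JAXP.
class NameLookup {

    // Returns one entry per person: the name text, "" for an empty
    // <name/> element, or null when there is no name element at all.
    static List<String> names(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            NodeList persons = (NodeList) xpath.evaluate(
                    "/entities/person", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < persons.getLength(); i++) {
                // Select the element itself, not its text() children.
                Node name = (Node) xpath.evaluate(
                        "name", persons.item(i), XPathConstants.NODE);
                // A missing element comes back as null; an empty element
                // has empty text content.
                result.add(name == null ? null : name.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<entities>"
                + "<person><name>Jack</name></person>"
                + "<person><name></name></person>"
                + "<person></person>"
                + "</entities>";
        List<String> names = names(xml);
        for (int i = 0; i < names.size(); i++) {
            System.out.printf("%2d -> name is [%s]%n", i, names.get(i));
        }
    }
}
```

The distinction between "missing" and "empty" still has to be made in Java, because an XPath 1.0 string result cannot represent both; only the node (or node list) result carries that information.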
Using /text() is always best avoided anyway, since it causes your query to be sensitive to the presence of comments in the XML.
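To illustrate the point about comments, here is a small sketch, again assuming the JDK's built-in XPath 1.0 implementation over a DOM parsed with default settings (comments are kept). The input document with a comment inside name is invented for the example, as are the class name TextVsElement and the helper eval(...).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of JAXP.
class TextVsElement {

    // Evaluates an XPath expression against the given XML and returns
    // the result converted to a string (the JAXP default return type).
    static String eval(String expr, String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The comment splits the content of <name> into two text nodes.
        String xml = "<person><name>Ja<!-- nickname -->ck</name></person>";

        // text() selects two text nodes; the XPath 1.0 string conversion
        // uses only the first one in document order.
        System.out.println(eval("/person/name/text()", xml)); // prints "Ja"

        // The string value of the element concatenates all descendant
        // text nodes and skips the comment.
        System.out.println(eval("/person/name", xml));        // prints "Jack"
    }
}
```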