java xpath jdom jdom-2

Document declares separate empty namespace rendering NamespaceAware results useless


I'm trying to perform some retrieval queries on a "correct" pom.xml used by maven. For that I use basic XPath queries from JDOM.

Unfortunately the queries do not return any results (and neither do simple descendant filters). I'm reasonably sure that the problem lies within the root declaration of the pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<!-- content -->
</project>

As can be seen, a default namespace (declared with the empty prefix) is defined that matches neither "" nor "http://www.w3.org/2000/xmlns/", where "" is the default no-namespace and the latter is the namespace bound to the xmlns prefix.

So, given a Document, I want to run an XPath query like the following:

XPathBuilder<Element> depQueryBuilder = new XPathBuilder<>("//dependencies/dependency", Filters.element());
XPathExpression<Element> depQuery = depQueryBuilder.compileWith(XPathFactory.instance());

for (Element elem : depQuery.evaluate(document)) {
    // basically unreachable, since the resultset is always empty
}

Given the fact that all XPath expressions and queries are required to be namespace aware (compare the XPathHelper javadoc), I'm pretty sure that I can get this to work by adding the required namespace declarations.

I've tried the following with different kinds of failure:

depQueryBuilder.setNamespace("", document.getRootElement().getAttributeValue("xmlns"));
// NPE: Null URI
depQueryBuilder.setNamespace("", "http://maven.apache.org/POM/4.0.0");
// Cannot set a Namespace URI in XPath for "" prefix
depQueryBuilder.setNamespace(Namespace.NO_NAMESPACE);
// no error-message, but no results either
depQueryBuilder.setNamespace(document.getRootElement().getNamespace("xmlns"));
// NPE: Null Namespace
depQueryBuilder.setNamespace(document.getRootElement().getNamespace(""));
// Cannot set a Namespace URI in XPath for "" prefix
depQueryBuilder.setNamespace("xmlns", "http://maven.apache.org/POM/4.0.0");
// Name "xmlns" is not legal for JDOM/XML Namespace prefix

At this point I'm not even sure I'm attacking this at the right point. How can I get my XPath query to return results?

Note: The following simpler queries don't return results either:

document.getRootElement().getDescendants(Filters.element("dependency"));
// empty iterator
document.getRootElement().getChild("dependencies").getChildren("dependency");
// NullPointerException because there is no child "dependencies"

Solution

  • XPath queries, and XML document navigation in general, require correct use of namespaces.

    XPath expressions in particular don't follow the same rules as regular XML documents, because the "" prefix is handled differently. In an XML document, an unprefixed element name resolves to the "default" namespace (either the no-namespace, or whatever has overridden it by being declared with xmlns="....").

    In XPath, however, the rules are slightly different (see the spec - emphasis mine):

    A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null

    What this means in practice is that even though you don't need a namespace prefix in the XML for elements like dependencies, you still need a prefix in the XPath query. The prefix can be anything; it does not need to match any prefix in the XML document, only the namespace URI has to match. So, choosing an arbitrary prefix like 'ns', you could have a query:

    XPathBuilder<Element> depQueryBuilder = new XPathBuilder<>("//ns:dependencies/ns:dependency", Filters.element());
    depQueryBuilder.setNamespace("ns", "http://maven.apache.org/POM/4.0.0");
    XPathExpression<Element> depQuery = depQueryBuilder.compileWith(XPathFactory.instance());
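
The effect of that rule can be checked with a small, self-contained program. This is only a sketch: it assumes JDOM 2 (with its default Jaxen-backed XPathFactory) is on the classpath, and the class name PrefixedXPathDemo, the helper methods, and the inlined two-dependency POM are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;
import org.jdom2.xpath.XPathBuilder;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;

class PrefixedXPathDemo {
    static final String POM_NS = "http://maven.apache.org/POM/4.0.0";

    // Hypothetical minimal POM, inlined so the example needs no file on disk.
    static final String SAMPLE_POM =
        "<project xmlns=\"" + POM_NS + "\">"
      + "<dependencies>"
      + "<dependency><artifactId>junit</artifactId></dependency>"
      + "<dependency><artifactId>jdom2</artifactId></dependency>"
      + "</dependencies>"
      + "</project>";

    // Parses the XML string; checked exceptions are wrapped to keep callers short.
    static Document parse(String xml) {
        try {
            return new SAXBuilder().build(new StringReader(xml));
        } catch (JDOMException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Compiles the query with the prefix "ns" bound to the POM namespace
    // and returns the number of matching elements.
    static int countMatches(Document doc, String xpath) {
        XPathBuilder<Element> builder = new XPathBuilder<>(xpath, Filters.element());
        builder.setNamespace("ns", POM_NS);
        XPathExpression<Element> expr = builder.compileWith(XPathFactory.instance());
        return expr.evaluate(doc).size();
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_POM);
        // Unprefixed names are looked up in no namespace, so nothing matches.
        System.out.println(countMatches(doc, "//dependencies/dependency"));       // 0
        // Prefixed names are looked up in the POM namespace.
        System.out.println(countMatches(doc, "//ns:dependencies/ns:dependency")); // 2
    }
}
```

Binding the prefix does no harm to the unprefixed query; it simply goes unused, which is why both calls can share the same helper.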
    

    This can be done more simply with:

    Namespace ns = Namespace.getNamespace("ns", "http://maven.apache.org/POM/4.0.0");
    XPathExpression<Element> depQuery = XPathFactory.instance()
      .compile("//ns:dependencies/ns:dependency", Filters.element(), null, ns);
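
The plain navigation calls from the question come back empty for the same reason: getChild("dependencies") and Filters.element("dependency") look for elements in no namespace. A minimal sketch of the namespace-aware variants follows, again assuming JDOM 2 on the classpath; the class name NamespaceAwareNavigation, its helper methods, and the sample POM are hypothetical. Outside XPath there is no restriction on the "" prefix, so the Namespace object of the root element can be passed along directly.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.Namespace;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;

class NamespaceAwareNavigation {
    // Hypothetical minimal POM, inlined so the example needs no file on disk.
    static final String SAMPLE_POM =
        "<project xmlns=\"http://maven.apache.org/POM/4.0.0\">"
      + "<dependencies>"
      + "<dependency><artifactId>junit</artifactId></dependency>"
      + "<dependency><artifactId>jdom2</artifactId></dependency>"
      + "</dependencies>"
      + "</project>";

    // Parses the XML string; checked exceptions are wrapped to keep callers short.
    static Document parse(String xml) {
        try {
            return new SAXBuilder().build(new StringReader(xml));
        } catch (JDOMException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Walks project/dependencies/dependency, passing the namespace on every step.
    static List<String> artifactIds(Document doc) {
        Element root = doc.getRootElement();
        Namespace pom = root.getNamespace(); // default namespace of the root element
        List<String> ids = new ArrayList<>();
        Element deps = root.getChild("dependencies", pom);
        if (deps != null) {
            for (Element dep : deps.getChildren("dependency", pom)) {
                ids.add(dep.getChildText("artifactId", pom));
            }
        }
        return ids;
    }

    // Counts dependency elements anywhere below the root with a namespace-aware filter.
    static int descendantCount(Document doc) {
        Element root = doc.getRootElement();
        int count = 0;
        for (Element ignored : root.getDescendants(Filters.element("dependency", root.getNamespace()))) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_POM);
        System.out.println(artifactIds(doc));     // [junit, jdom2]
        System.out.println(descendantCount(doc)); // 2
    }
}
```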