java xml xpath jdom javarosa

Select a node using xpath and jdom


I have an XForms document:

<?xml version="1.0" encoding="UTF-8"?><h:html xmlns:h="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:jr="http://openrosa.org/javarosa">
<h:head>
    <h:title>Summary</h:title>
    <model>
        <instance>
            <data vaultType="nsp_inspection.4.1">
                <metadata vaultType="metadata.1.1">
                    <form_start_time type="dateTime" />
                    <form_end_time type="dateTime" />
                    <device_id type="string" />
                    <username type="string" />
                </metadata>
                <date type="date" />
                <monitor type="string" />
            </data>
        </instance>
    </model>
</h:head>

I would like to select the data element from the XForm using XPath and JDOM.

XPath xpath = XPath.newInstance("h:html/h:head/h:title/");

seems to work fine and selects the title element but

XPath xpath = XPath.newInstance("h:html/h:head/model");

does not select the model element. I guess it has something to do with the namespace.


Solution

  • A few things. You really should be using JDOM 2.0.x (2.0.5 is the latest release at the time of writing). The XPath API in the 2.0.x versions is far better than the one in JDOM 1.x: see https://github.com/hunterhacker/jdom/wiki/JDOM2-Feature-XPath-Upgrade

    @wds is also right that the XForms elements do not have the correct namespace. That is the only reason an XPath using the 'h' prefix for them works: your document binds the default namespace to the same URI as the 'h' prefix, so 'model' ends up in the same namespace as the XHTML elements. Your code is likely to be broken still.

    Namespaces in XPaths often confuse people, because every namespace in an XPath has to have a prefix. Even if something is in the default namespace in the XML (no prefix, like your 'model' element), it has to have a prefix in the XPath. Names with no prefix in the XPath always reference the 'no namespace' namespace. From the XPath specification (http://www.w3.org/TR/xpath/#node-tests):

    A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
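
    To see that rule in action without any JDOM dependency, here is a minimal sketch using the JDK's own javax.xml.xpath API (a swapped-in library, not the JDOM API the question uses). The class name UnprefixedNameDemo and the cut-down XML string are made up for illustration; the XML keeps the namespace declarations from the question, where the default namespace and the 'h' prefix share the XHTML URI.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of JDOM or of the question's code.
class UnprefixedNameDemo {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Cut-down copy of the question's document: the default namespace and
    // the 'h' prefix are both bound to the XHTML URI.
    static final String XML =
        "<h:html xmlns:h=\"" + XHTML + "\" xmlns=\"" + XHTML + "\">"
      + "<h:head><h:title>Summary</h:title><model><instance/></model></h:head>"
      + "</h:html>";

    // Returns how many nodes the expression selects. Only 'h' is bound in the
    // XPath's namespace context; the document's xmlns default is not consulted.
    static int count(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // JAXP parsers ignore namespaces by default
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "h".equals(prefix) ? XHTML : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) { return null; }
            });

            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed step: asks for a 'model' element in no namespace.
        System.out.println(count("/h:html/h:head/model"));
        // Prefixed step: asks for a 'model' element in the XHTML namespace.
        System.out.println(count("/h:html/h:head/h:model"));
    }
}
```

    With the document as posted, the first query should select nothing, because the unprefixed step asks for a 'model' in no namespace, while the second query selects the element because it really is in the XHTML namespace.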

    Assuming @wds is correct and the namespace for the model element is supposed to be "http://www.w3.org/2002/xforms", the namespace declaration in your document should be xmlns="http://www.w3.org/2002/xforms". But that only makes it the 'default' namespace inside the document; in your XPath query, the URI for a name with no prefix is still "".
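
    As a quick illustration of that mismatch, the sketch below parses a header with the corrected declaration and inspects the 'model' element through the JDK's DOM API (again a stand-in for JDOM, chosen so it runs without extra jars). The class name DefaultNamespaceDemo and the shortened XML are assumptions for the example, and the XForms URI is the one suggested above rather than something confirmed for your form.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of JDOM or of the question's code.
class DefaultNamespaceDemo {

    static final String XFORMS = "http://www.w3.org/2002/xforms";

    // Assumed corrected header: the default namespace is now the XForms URI.
    static final String XML =
        "<h:html xmlns:h=\"http://www.w3.org/1999/xhtml\" xmlns=\"" + XFORMS + "\">"
      + "<h:head><h:title>Summary</h:title><model><instance/></model></h:head>"
      + "</h:html>";

    // Parses the document and looks 'model' up by namespace URI and local name.
    static Element findModel() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // JAXP parsers ignore namespaces by default
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));
            return (Element) doc.getElementsByTagNameNS(XFORMS, "model").item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element model = findModel();
        System.out.println("prefix:    " + model.getPrefix());
        System.out.println("namespace: " + model.getNamespaceURI());
    }
}
```

    The element carries no prefix in the document yet still has a namespace URI, which is why an XPath step has to introduce a prefix of its own to reach it.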

    To access the http://www.w3.org/2002/xforms namespace in your XPath you have to give it a prefix for the context of the XPath, let's say xpns (for XPath namespace). In JDOM 1.x you add that namespace with:

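    // Needs org.jdom.Element, org.jdom.Namespace and org.jdom.xpath.XPath.
    XPath xpath = XPath.newInstance("/h:html/h:head/xpns:model");
    // Bind the xpns prefix, for this query only, to the XForms namespace URI.
    xpath.addNamespace(Namespace.getNamespace("xpns", "http://www.w3.org/2002/xforms"));
    Element model = (Element) xpath.selectSingleNode(mydoc);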
    

    Note how that adds the xpns prefix to the query. Also note that I have 'anchored' the h:html step to the '/' root of the document, which should improve the performance of the query evaluation.

    In JDOM 2.x, the XPath API is significantly better (even though in some cases it may seem like overkill):

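    // Needs org.jdom2.Element, org.jdom2.Namespace, org.jdom2.filter.Filters,
    // org.jdom2.xpath.XPathExpression and org.jdom2.xpath.XPathFactory.
    XPathFactory xpf = XPathFactory.instance();
    // Both prefixes used in the expression are declared explicitly.
    XPathExpression<Element> xpath = xpf.compile("/h:html/h:head/xpns:model",
                  Filters.element(), null,
                  Namespace.getNamespace("h", "http://www.w3.org/1999/xhtml"),
                  Namespace.getNamespace("xpns", "http://www.w3.org/2002/xforms"));
    Element model = xpath.evaluateFirst(mydoc);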
    

    See more about the new XPath API in the JDOM 2.x javadoc for XPathFactory.compile(...).