java xml

Good way to parse XML like this sample?


I am wondering what the best practice is for parsing XML like this:

<root>
    <MailNotification enable="true">
        <To>foo@bar.org</To>
        <From>foo@bar.org</From>
        <Server>smtp.bar.org</Server>
        <Port>465</Port>
        <Username>foo@bar.org</Username>
        <Password>fooo!</Password>
    </MailNotification>
</root>

I am using Java 7. The complete XML is longer, but it's not a big file. I thought about using a StAX pull parser because it seemed easy, but there is one thing I am not sure about:

When I reach a MailNotification element, I could create a new instance of, say, a mail class; I have no problem with that. But what if I reach a To element? How do I know that it is really inside a MailNotification element and not directly below the root? In other words, what I am missing is a best practice for handling states like "now I am in a MailNotification element".

Note: I know I could validate the XML first, but imagine that a To element were allowed both inside a MailNotification element and as a child of another, semantically different element. The problem is the same: I need to keep track of state or context to make sure I interpret the To element correctly.


Solution

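  • A StAX XMLStreamReader is the best* choice. Use the Java call stack to keep your state: write one parse method per element type, so the method that is currently running tells you which element you are in, as in this example. The constants come from XMLStreamConstants.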

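    // Assumes: import static javax.xml.stream.XMLStreamConstants.*;
    XMLStreamReader reader; // must already be positioned on the <root> start tag
    
    void parseRoot() throws XMLStreamException {
        reader.require(START_ELEMENT, null, "root");
    
        // nextTag() skips whitespace and comments and stops at the next start or end tag
        while (reader.nextTag() == START_ELEMENT) {
            switch (reader.getLocalName()) {
            case "MailNotification":
                MailNotification mail = parseMail();
                // do something with mail
                break;
            // more cases; each case must consume its element up to the end tag
            }
        }
    
        reader.require(END_ELEMENT, null, "root");
    }
    
    MailNotification parseMail() throws XMLStreamException {
        reader.require(START_ELEMENT, null, "MailNotification");
        MailNotification mail = new MailNotification();
    
        while (reader.nextTag() == START_ELEMENT) {
            switch (reader.getLocalName()) {
            case "To":
                // we can only get here from inside a MailNotification element
                mail.setTo(parseString());
                break;
            // more cases; each case must consume its element up to the end tag
            }
        }
    
        reader.require(END_ELEMENT, null, "MailNotification");
        return mail;
    }
    
    String parseString() throws XMLStreamException {
        // getElementText() returns the element's text (also when it arrives in
        // several chunks) and leaves the reader on the matching end tag
        return reader.getElementText();
    }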
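
To make the call-stack idea concrete, here is a minimal, self-contained sketch of the same technique. It is an illustration under assumptions, not a definitive implementation: the class name StaxContextDemo, its fields, the skipElement helper, and the extra To element placed directly under root are all made up for the example. Only the element and attribute names come from the sample XML in the question.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

import static javax.xml.stream.XMLStreamConstants.END_ELEMENT;
import static javax.xml.stream.XMLStreamConstants.START_ELEMENT;

// Hypothetical demo class: one parse method per element type, so the
// method that is currently running identifies the enclosing element.
final class StaxContextDemo {
    private final XMLStreamReader reader;

    String mailTo;       // <To> found inside <MailNotification>
    String topLevelTo;   // <To> found directly under <root> (assumed variant)
    boolean mailEnabled; // value of the enable attribute

    private StaxContextDemo(XMLStreamReader reader) {
        this.reader = reader;
    }

    static StaxContextDemo parse(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            StaxContextDemo demo = new StaxContextDemo(reader);
            reader.nextTag(); // advance from START_DOCUMENT to the root start tag
            demo.parseRoot();
            return demo;
        } finally {
            reader.close();
        }
    }

    private void parseRoot() throws XMLStreamException {
        reader.require(START_ELEMENT, null, "root");
        while (reader.nextTag() == START_ELEMENT) {
            switch (reader.getLocalName()) {
            case "MailNotification":
                parseMail();
                break;
            case "To":
                topLevelTo = reader.getElementText(); // context: directly under root
                break;
            default:
                skipElement(); // ignore elements this sketch does not handle
            }
        }
        reader.require(END_ELEMENT, null, "root");
    }

    private void parseMail() throws XMLStreamException {
        reader.require(START_ELEMENT, null, "MailNotification");
        mailEnabled = Boolean.parseBoolean(reader.getAttributeValue(null, "enable"));
        while (reader.nextTag() == START_ELEMENT) {
            switch (reader.getLocalName()) {
            case "To":
                mailTo = reader.getElementText(); // context: inside MailNotification
                break;
            default:
                skipElement();
            }
        }
        reader.require(END_ELEMENT, null, "MailNotification");
    }

    // Consumes the current element and everything nested inside it.
    private void skipElement() throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == START_ELEMENT) {
                depth++;
            } else if (event == END_ELEMENT) {
                depth--;
            }
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><To>admin@bar.org</To>"
                + "<MailNotification enable=\"true\"><To>foo@bar.org</To>"
                + "<Port>465</Port></MailNotification></root>";
        StaxContextDemo demo = parse(xml);
        System.out.println("mail To: " + demo.mailTo);
        System.out.println("top-level To: " + demo.topLevelTo);
    }
}
```

The two To elements end up in different fields without any explicit state variable, because each one is read by a different method. The default branch matters: every child element has to be consumed up to its end tag before nextTag() is called again, otherwise text content inside an unhandled element makes nextTag() throw.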
    

    (*) To clarify "best choice": it depends on what you want to do.
    JAXB is very good if your XML maps directly to the objects you want to create.
    JDOM is useful if you want to navigate your XML in complex ways, e.g. if you implement something like XPath; but for simple parsing it's overkill. This is the approach that consumes the most memory.
    SAX was the lightest and most efficient parser before StAX was around.
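
For contrast, here is a rough sketch of how the same context question looks with SAX. SAX is a push parser: the parser calls your handler, so the call stack cannot carry your state and the context has to be tracked by hand, here with an explicit stack of open element names. The class name SaxContextDemo, its fields, and the Escalation element are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo handler: keeps an explicit stack of open elements
// so that endElement can ask "which element is my parent?".
final class SaxContextDemo extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<String>();
    private final StringBuilder text = new StringBuilder();

    String mailTo;  // <To> whose parent is <MailNotification>
    String otherTo; // <To> found under any other parent

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        open.push(qName);
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();                  // remove the element that just closed
        String parent = open.peek(); // null once the root element has closed
        if ("To".equals(qName)) {
            if ("MailNotification".equals(parent)) {
                mailTo = text.toString().trim();
            } else {
                otherTo = text.toString().trim();
            }
        }
        text.setLength(0);
    }

    static SaxContextDemo parse(String xml) throws Exception {
        SaxContextDemo handler = new SaxContextDemo();
        // The default factory is not namespace-aware, so qName holds the plain element name.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><MailNotification enable=\"true\"><To>foo@bar.org</To>"
                + "</MailNotification><Escalation><To>oncall@bar.org</To>"
                + "</Escalation></root>";
        SaxContextDemo demo = parse(xml);
        System.out.println("mail To: " + demo.mailTo);
        System.out.println("other To: " + demo.otherTo);
    }
}
```

This works, but the bookkeeping (the stack and the text buffer) is exactly what the StAX version gets for free from method calls and getElementText().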