xml groovy xmlslurper gpath

Stop Combining Nested Values with XmlSlurper


I am trying to parse an XML document with XmlSlurper. This document has a lot of nested elements, and each one has a value. It follows this format:

<XML version="1" title="cars" lot="23">
<Cars>
    <Car year="2012" color="black" engine="2.0L" drivetrain="FWD">Hyundai Sonata<Condition id="5">Excellent<Running>Yes</Running></Condition></Car>
    <Car year="2007" color="silver" engine="2.4L" drivetrain="AWD">Audi A4<Condition id="4">Good<Running>Sometimes</Running></Condition></Car>
    <Car year="2009" color="gray" engine="2.0L" drivetrain="FWD">Mitsubishi Lancer<Condition id="3">Fair<Running>Yes</Running></Condition></Car>
    <Car year="1996" color="green" engine="5.0L" drivetrain="4WD">Jeep Grand Cherokee<Condition id="3">Fair<Running>No</Running></Condition></Car>
</Cars>
</XML>

I am trying to print the year and make/model of each car. However, when I run my code, it prints the make/model together with the values of Condition and Running, like this:

id: 2012
value: Hyundai SonataExcellentYes

id: 2007
value: Audi A4GoodSometimes

id: 2009
value: Mitsubishi LancerFairYes

id: 1996
value: Jeep Grand CherokeeFairNo

I'm wondering how I can isolate each of those values. Here is my code:

class ParseCars {
    static void main(String[] args) {

        def carsXml = new XmlSlurper().parse("xml/cars.xml")

        carsXml.Cars.Car.each {
            def car = new Car()
            car.year = it.@year.text() as Integer
            car.makeModel = it

            println "id: ${car.year}"
            println "value: ${car.makeModel}"
            println " "
        }
    }
}

I can't seem to find any documentation on dealing with nested values where the parent tags also contain values like this. Any help would be greatly appreciated.


Solution

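  • NodeChild.text() returns the text of the node together with the text of all of its descendants, which is why the Condition and Running values end up appended to the make/model. NodeChild.localText() returns only the text that sits directly inside the node. Note that it returns a list of strings (one entry per direct text node) rather than a single string.

    So

    car.makeModel = it.localText().join('')
    

    will do what you want.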

    If you need to access the children individually (e.g. the first child), you first need to get the underlying Node with it[0]. You can then call children() to get a list of its child nodes and index into it:

    it[0].children()[0] // Text node, e.g. Hyundai Sonata
    it[0].children()[1] // Condition node, e.g. <Condition id="5">Excellent<Running>Yes</Running></Condition>
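    Putting the pieces together, here is a minimal sketch that compares text(), localText() and children() on a trimmed copy of the XML from the question. The inline XML string, the map keys (year, makeModel, allText, firstChild, secondChild) and the variable names are only for illustration. It assumes Groovy 3 or later, where XmlSlurper lives in groovy.xml; on Groovy 2.x, drop the import and rely on the default groovy.util.XmlSlurper.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; on 2.x remove this import

// Trimmed copy of the sample document from the question (illustrative data only).
def xml = '''
<XML version="1" title="cars" lot="23">
  <Cars>
    <Car year="2012" color="black">Hyundai Sonata<Condition id="5">Excellent<Running>Yes</Running></Condition></Car>
    <Car year="2007" color="silver">Audi A4<Condition id="4">Good<Running>Sometimes</Running></Condition></Car>
  </Cars>
</XML>
'''.trim()

def root = new XmlSlurper().parseText(xml)

def cars = root.Cars.Car.collect { car ->
    def node = car[0]  // underlying Node behind the NodeChild
    [
        year       : car.@year.text() as Integer,
        allText    : car.text(),                 // own text plus all descendants' text
        makeModel  : car.localText().join(''),   // direct text nodes only
        firstChild : node.children()[0],         // the text node, a plain String
        secondChild: node.children()[1].name()   // name of the first child element
    ]
}

cars.each {
    println "id: ${it.year}"
    println "value: ${it.makeModel}"
    println " "
}
```

    With this, allText still contains the concatenated string from the question, while makeModel holds only the car's own text.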