html groovy xmlslurper cyberneko

How to get HTML content using CyberNeko?


import org.cyberneko.html.parsers.SAXParser

def page = new XmlSlurper(new SAXParser()).parse(url)
println page.body[0]

I want the output to be:

 <body>
   <h1>Header</h1>
 </body>

where my HTML is:

   <html>
       <head>
           <title>Title</title>
       </head>
       <body>
             <h1>Header</h1>
       </body>
   </html>

But my output is:

Header

How do I tell XmlSlurper to return the markup, not just the text content?


Solution

  • Printing a node from XmlSlurper calls its toString(), which returns only the text content. To get the markup, you need a serializer such as XmlUtil.serialize or StreamingMarkupBuilder, e.g.:

    println groovy.xml.XmlUtil.serialize( page.body[0] )
    

    or:

    new groovy.xml.StreamingMarkupBuilder().bind { mkp.yield page.body }.toString()
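
The two options above can be tried side by side with a minimal, self-contained sketch. It makes some assumptions that are not in the question: the HTML is supplied as an inline, well-formed string and parsed with the plain XmlSlurper (no CyberNeko parser), and the variable names html, textOnly, bodyMarkup and streamed are only illustrative. The serialization calls are the same ones you would apply to the page parsed through CyberNeko.

```groovy
// Groovy 3+ package; on Groovy 2.x drop this import, since
// groovy.util.XmlSlurper is imported by default there.
import groovy.xml.XmlSlurper
import groovy.xml.XmlUtil
import groovy.xml.StreamingMarkupBuilder

// Assumption: an inline, well-formed stand-in for the page from the question,
// so no HTML-tolerant parser is needed for this sketch.
def html = '<html><head><title>Title</title></head><body><h1>Header</h1></body></html>'
def page = new XmlSlurper().parseText(html)

// toString() on a slurped node returns only its text content
def textOnly = page.body[0].toString()

// Option 1: XmlUtil.serialize turns the node back into markup
def bodyMarkup = XmlUtil.serialize(page.body[0])

// Option 2: StreamingMarkupBuilder yields the node into a new document
def streamed = new StreamingMarkupBuilder().bind { mkp.yield page.body }.toString()

println textOnly
println bodyMarkup
println streamed
```

Note that XmlUtil.serialize pretty-prints and prepends an XML declaration, so if you need the bare element only, the StreamingMarkupBuilder variant is probably the closer match.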