java groovy httpbuilder

HTTP Builder/Groovy - get source text _and_ XmlSlurper output?


I am reading the HTTPBuilder GET documentation here: http://groovy.codehaus.org/modules/http-builder/doc/get.html

I seem to be able to get:

i) XmlSlurper output, as parsed by NekoHTML, using:

import groovyx.net.http.HTTPBuilder

def http = new HTTPBuilder('http://www.google.com')
def html = http.get( path : '/search', query : [q:'Groovy'] )

ii) Raw text using:

// TEXT comes from groovyx.net.http.ContentType
import static groovyx.net.http.ContentType.TEXT

http.get( path : '/search',
          contentType : TEXT,
          query : [q:'Groovy'] ) { resp, reader ->
  println "response status: ${resp.statusLine}"
  println 'Headers: -----------'
  resp.headers.each { h ->
    println " ${h.name} : ${h.value}"
  }
  println 'Response data: -----'
  System.out << reader
  println '\n--------------------'
}

I am having some trouble and would like to get BOTH (i) and (ii), so that I can debug my XmlSlurper code against the actual HTML I am receiving.

Any suggestions how I might go about doing this?

I can easily instantiate an XmlSlurper object and feed it the relevant string using the parseText(string) method or the parse(reader) method, but I cannot seem to get the Neko processing step correct.

Any hints?

Thank you! Misha


Solution

  • OK, here it is.

    I figured it out from: http://groovy.codehaus.org/Testing+Web+Applications

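    // Request the page as plain text so the closure receives a Reader over the raw body
    def html = http.get(uri: 'http://www.google.com', contentType: groovyx.net.http.ContentType.TEXT) { resp, reader ->
      def s = reader.text
      // Save the raw HTML for inspection; note that << appends if temp.html already exists
      new File("temp.html") << s
      // Parse the same text with NekoHTML; the closure's result becomes the value of html
      new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(s)
    }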
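
    To debug the GPath expressions without hitting the network each time, the same Neko-backed XmlSlurper can be run on a string. This is only a sketch: the HTML literal is made up for illustration, and it assumes NekoHTML is already on the classpath (it is pulled in as an HTTPBuilder dependency) and a Groovy version where XmlSlurper is imported by default (on Groovy 4 you would likely need import groovy.xml.XmlSlurper). As far as I can tell, NekoHTML upper-cases element names by default, so the paths use HTML/HEAD/BODY rather than the lower-case names in the source.

    ```groovy
    import org.cyberneko.html.parsers.SAXParser

    // Made-up sample markup standing in for the text saved to temp.html
    def raw = '<html><head><title>Test</title></head><body><p id="greeting">Hello</p></body></html>'

    // Same construction as in the closure above: XmlSlurper driven by the NekoHTML SAX parser
    def html = new XmlSlurper(new SAXParser()).parseText(raw)

    // The parsed root is the HTML element itself, so paths start at its children
    println html.HEAD.TITLE.text()
    println html.BODY.P.text()
    println html.BODY.P.@id.text()
    ```

    Once the expressions work on the saved text, they should behave the same on the value returned from the http.get closure, since both go through the same parser.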
    

    Thank you! Misha