ruby, libxml-ruby

LIBXML-RUBY > XPath context


Context: I'm parsing an XML file using the libxml-ruby gem. I need to query the XML document for a set of nodes using the XPath find method. I then need to process each node individually, querying them once again using the XPath find method.

Issue: When I attempt to query the returned nodes individually, the XPath find method is querying the entire document rather than just the node:

Code Example:

require 'xml'

string = %{<?xml version="1.0" encoding="iso-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>}

xml = XML::Parser.string(string, :encoding => XML::Encoding::ISO_8859_1).parse
books = xml.find("//book")
books.each do |book|
    price = book.find("//price").first.content
    puts price
end

This script prints 29.99 twice. I think this must have something to do with setting the XPath context, but I have not figured out how to accomplish that yet.


Solution

  • The first problem I see is book.find("//price").

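    //price means "start at the top of the document and look downward," no matter which node you call find on. That's most certainly NOT what you want to do. Instead, I think you want to look inside book for the first price, which calls for a path relative to book, such as price or .//price.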

    Using Nokogiri, I'd use CSS selectors because they're easier on the eyes and can usually accomplish the same thing:

    require 'nokogiri'
    
    string = %{<?xml version="1.0" encoding="iso-8859-1"?>
    <bookstore>
      <book>
        <title lang="eng">Harry Potter</title>
        <price>29.99</price>
      </book>
      <book>
        <title lang="eng">Learning XML</title>
        <price>39.95</price>
      </book>
    </bookstore>}
    
    xml = Nokogiri::XML(string)
    books = xml.search("book")
    books.each do |book|
        price = book.at("price").content
        puts price
    end
    

    After running that I get:

    29.99
    39.95