Context: I'm parsing an XML file using the libxml-ruby gem. I need to query the XML document for a set of nodes using the XPath find method. I then need to process each node individually, querying it once again using the XPath find method.
Issue: When I attempt to query the returned nodes individually, the XPath find method queries the entire document rather than just the node:
Code Example:
require 'xml'
string = %{<?xml version="1.0" encoding="iso-8859-1"?>
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="eng">Learning XML</title>
<price>39.95</price>
</book>
</bookstore>}
xml = XML::Parser.string(string, :encoding => XML::Encoding::ISO_8859_1).parse
books = xml.find("//book")
books.each do |book|
  price = book.find("//price").first.content
  puts price
end
This script prints 29.99 twice. I think this must have something to do with setting the XPath context, but I have not figured out how to accomplish that yet.
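The first problem I see is book.find("//price"). //price means "start at the top of the document and look downward". That's almost certainly not what you want to do. Instead, I think you want to look inside book for the first price. A relative expression such as price or .//price is evaluated from the context node rather than from the document root.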
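To illustrate the difference between an absolute and a relative path, here is a minimal sketch. It uses REXML (which ships with standard Ruby installs) instead of libxml-ruby, purely so it runs without extra gems. The constant BOOKSTORE_XML and the helper book_prices are names made up for this example. The same relative expressions should apply to libxml-ruby's find, but treat that as an assumption to check against your version.

```ruby
require 'rexml/document'

# Sample document from the question (XML declaration omitted for brevity).
BOOKSTORE_XML = <<~XML
  <bookstore>
    <book>
      <title lang="eng">Harry Potter</title>
      <price>29.99</price>
    </book>
    <book>
      <title lang="eng">Learning XML</title>
      <price>39.95</price>
    </book>
  </bookstore>
XML

# Hypothetical helper: returns the price text of every book, using an
# XPath expression evaluated relative to each book node.
def book_prices(xml_string, relative_path = "price")
  doc = REXML::Document.new(xml_string)
  REXML::XPath.match(doc, "//book").map do |book|
    # "price" and ".//price" both start at the current book,
    # unlike "//price", which starts at the document root.
    REXML::XPath.first(book, relative_path).text
  end
end

puts book_prices(BOOKSTORE_XML)             # child axis: "price"
puts book_prices(BOOKSTORE_XML, ".//price") # descendants of the current book
```

Both calls return one price per book, because the expression is resolved against each book in turn.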
Using Nokogiri, I'd use CSS selectors because they're easier on the eyes and can usually accomplish the same thing:
require 'nokogiri'
string = %{<?xml version="1.0" encoding="iso-8859-1"?>
<bookstore>
<book>
<title lang="eng">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="eng">Learning XML</title>
<price>39.95</price>
</book>
</bookstore>}
xml = Nokogiri::XML(string)
books = xml.search("book")
books.each do |book|
  price = book.at("price").content
  puts price
end
After running that I get:
29.99
39.95