ruby xpath rexml

Save all elements under a parent that match an XPath description


Say I have a document like this:

<div class='thing'>
    <td class='A'>Hey</td>
    <span class='B'>test</span>
    <td class='C'>asd</td> 
</div>
<div class='thing'>
    <td class='A'>yoyo</td>
    <span class='B'>lol</span>
    <td class='C'>aaaaaaaaaaaa</td>
</div>

And I want to save all the text in classes A and B in the document (Hey, test, yoyo, lol) in, say, a hash like this:

{ "thing1" => ["Hey", "test"], "thing2" => ["yoyo", "lol"] }

What do I do? (I'm using REXML and XPath in Ruby.)

For example, when I do this:

require 'rexml/document'
include REXML

doc = Document.new(xmlfile)
parent = "//div[@class='thing']"
A = "//td[@class='A']"
B = "//span[@class='B']"

XPath.each(doc, parent) do |thing|
  XPath.each(thing, A + "|" + B) do |children|
    puts children.text
  end
end

(This is just a test; I want to replace the puts with adding to the hash.)

It prints every element in the whole document that matches A or B, once for each element with class="thing". So the output is:

Hey
test
yoyo
lol
Hey
test
yoyo
lol

What I want is, for every class='thing' element, to print only its children matching A and B:

Hey
test
yoyo
lol

Solution

  • This is a classic XPath mistake. A / at the beginning of an XPath expression always refers to the document root, regardless of the context element you pass in, so //td[...] searches the whole document on every iteration. To make the query relative to the context element while still starting the expression with //, prepend a . to it:

    ....
    A   = ".//td[@class='A']"
    B   = ".//span[@class='B']"
    ....
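
    Putting the fix together with the hash from the question, a minimal sketch might look like the following. The <root> wrapper, the PARENT and CHILD_PATHS constants, and the "thing1"/"thing2" key naming are assumptions made for illustration (the sample needs a single root element to parse as XML). It also runs the A and B queries separately instead of joining them with |, so the order within each array does not depend on how the union is evaluated.

```ruby
require 'rexml/document'

# Sample input from the question; the <root> wrapper is an assumption
# so that the XML has a single root element.
xml = <<~XML
  <root>
    <div class='thing'>
      <td class='A'>Hey</td>
      <span class='B'>test</span>
      <td class='C'>asd</td>
    </div>
    <div class='thing'>
      <td class='A'>yoyo</td>
      <span class='B'>lol</span>
      <td class='C'>aaaaaaaaaaaa</td>
    </div>
  </root>
XML

doc = REXML::Document.new(xml)

# Absolute path: we do want every "thing" in the whole document.
PARENT = "//div[@class='thing']"
# Relative paths: the leading "." anchors each query at the context element.
CHILD_PATHS = [".//td[@class='A']", ".//span[@class='B']"]

result = {}
REXML::XPath.match(doc, PARENT).each.with_index(1) do |thing, i|
  # Collect the text of the A matches, then the B matches, within this thing only.
  result["thing#{i}"] = CHILD_PATHS.flat_map do |path|
    REXML::XPath.match(thing, path).map(&:text)
  end
end

p result
```

    With the sample above, result should map "thing1" to Hey and test, and "thing2" to yoyo and lol. Dropping the leading . from CHILD_PATHS brings back the original behaviour, where every thing collects matches from the whole document.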