ruby, xml, xpath, rexml

How to get node text from a child context using Ruby, XPath, and REXML


I'm having trouble getting REXML::XPath.first to return the correct node text when I pass it a child node as the context.

See the test script and XML below.

test.rb

require 'rexml/document'
require 'rexml/xpath'

file = File.new('test.xml')
doc = REXML::Document.new(file)

employers = REXML::XPath.match(doc, '//EmployerOrg')
employers.each do |employer|
  # this looks fine, position_history is being set for each employer
  position_history = REXML::XPath.first(employer, 'PositionHistory')

  # always returns the title from the first employer, in spite of the position_history context
  p title = REXML::XPath.first(position_history, '//Title').text
end

Output:

"Director of Web Applications Development"
"Director of Web Applications Development"
"Director of Web Applications Development"

Example XML:

<?xml version="1.0" encoding="UTF-8"?>
<Resume xml:lang="en" xmlns="http://ns.hr-xml.org/2006-02-28" xmlns:sov="http://sovren.com/hr-xml/2006-02-28">
  <StructuredXMLResume>
    <EmploymentHistory>
      <EmployerOrg>
        <EmployerOrgName>Technical Difference</EmployerOrgName>
        <PositionHistory positionType="directHire" currentEmployer="true">
          <Title>Director of Web Applications Development</Title>
          <OrgName>
            <OrganizationName>Technical Difference</OrganizationName>
          </OrgName>
          <StartDate>
            <AnyDate>2004-10-01</AnyDate>
          </StartDate>
          <EndDate>
            <AnyDate>2015-09-15</AnyDate>
          </EndDate>
        </PositionHistory>
      </EmployerOrg>
      <EmployerOrg>
        <EmployerOrgName>Convergence Inc. LLC</EmployerOrgName>
        <PositionHistory positionType="directHire">
          <Title>Senior Web Developer/DBA</Title>
          <OrgName>
            <OrganizationName>Convergence Inc. LLC</OrganizationName>
          </OrgName>
          <StartDate>
            <AnyDate>2003-03-01</AnyDate>
          </StartDate>
          <EndDate>
            <AnyDate>2004-12-01</AnyDate>
          </EndDate>
          <UserArea>
            <sov:PositionHistoryUserArea>
              <sov:Id>POS-2</sov:Id>
              <sov:CompanyNameProbability>23</sov:CompanyNameProbability>
              <sov:PositionTitleProbability>30</sov:PositionTitleProbability>
            </sov:PositionHistoryUserArea>
          </UserArea>
        </PositionHistory>
      </EmployerOrg>
      <EmployerOrg>
        <EmployerOrgName>Avalon Digital Marketing Systems, Inc</EmployerOrgName>
        <PositionHistory positionType="contract">
          <Title>Contractor - Web Development</Title>
          <OrgName>
            <OrganizationName>Avalon Digital Marketing Systems, Inc</OrganizationName>
          </OrgName>
          <StartDate>
            <AnyDate>2002-05-01</AnyDate>
          </StartDate>
          <EndDate>
            <AnyDate>2003-03-01</AnyDate>
          </EndDate>
        </PositionHistory>
        <PositionHistory positionType="directHire">
          <Title>Web Developer/Junior DBA</Title>
          <OrgName>
            <OrganizationName>European Division</OrganizationName>
          </OrgName>
          <StartDate>
            <AnyDate>2000-05-01</AnyDate>
          </StartDate>
          <EndDate>
            <AnyDate>2002-04-30</AnyDate>
          </EndDate>
        </PositionHistory>
      </EmployerOrg>
    </EmploymentHistory>
  </StructuredXMLResume>
</Resume>

Solution

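  • Your XPath '//Title' begins with '//', which makes it an absolute path: it is evaluated from the document root and matches every Title in the document, regardless of the context node position_history. That is why first always returns the first employer's title. Replace it with a relative path such as 'Title' or './Title', which selects only Title children of position_history.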
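
To make the difference concrete, here is a minimal sketch of the loop with the relative path. It assumes a trimmed inline copy of the question's XML (namespace declarations and unrelated elements left out for brevity), and the variable names titles and all_titles are only illustrative.

```ruby
require 'rexml/document'
require 'rexml/xpath'

# Trimmed stand-in for test.xml (assumption: namespaces and other fields omitted).
xml = <<~XML
  <Resume>
    <EmploymentHistory>
      <EmployerOrg>
        <PositionHistory>
          <Title>Director of Web Applications Development</Title>
        </PositionHistory>
      </EmployerOrg>
      <EmployerOrg>
        <PositionHistory>
          <Title>Senior Web Developer/DBA</Title>
        </PositionHistory>
      </EmployerOrg>
      <EmployerOrg>
        <PositionHistory>
          <Title>Contractor - Web Development</Title>
        </PositionHistory>
        <PositionHistory>
          <Title>Web Developer/Junior DBA</Title>
        </PositionHistory>
      </EmployerOrg>
    </EmploymentHistory>
  </Resume>
XML

doc = REXML::Document.new(xml)
employers = REXML::XPath.match(doc, '//EmployerOrg')

# One title per employer: 'Title' is relative, so it is resolved
# against position_history rather than the document root.
titles = employers.map do |employer|
  position_history = REXML::XPath.first(employer, 'PositionHistory')
  REXML::XPath.first(position_history, 'Title').text
end

# Every title per employer: a relative multi-step path from the employer node.
all_titles = employers.flat_map do |employer|
  REXML::XPath.match(employer, 'PositionHistory/Title').map(&:text)
end

titles.each { |title| p title }
```

Note that the third employer has two PositionHistory elements, so REXML::XPath.first only sees the first of them; the all_titles variant with REXML::XPath.match picks up "Web Developer/Junior DBA" as well.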