I'm attempting to parse XML in the following format (from the European Central Bank data feed) using libxml-ruby:
<?xml version="1.0" encoding="UTF-8"?>
<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
                 xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
  <gesmes:subject>Reference rates</gesmes:subject>
  <gesmes:Sender>
    <gesmes:name>European Central Bank</gesmes:name>
  </gesmes:Sender>
  <Cube>
    <Cube time="2009-11-03">
      <Cube currency="USD" rate="1.4658"/>
      <Cube currency="JPY" rate="132.25"/>
      <Cube currency="BGN" rate="1.9558"/>
    </Cube>
  </Cube>
</gesmes:Envelope>
I'm loading the document as follows:
require 'rubygems'
require 'xml/libxml'
doc = XML::Document.file('eurofxref-hist.xml')
But I'm struggling to come up with the correct namespace configuration to allow XPath queries on the data.
I can extract all the Cube nodes using the following code:
doc.find("//*[local-name()='Cube']")
But since the parent node and its child nodes are all called Cube, this doesn't help me iterate over just the parent nodes. Perhaps I could modify this XPath to find only those nodes with a time attribute?
My aim is to be able to extract all the Cube nodes which have a time attribute (i.e. <Cube time="2009-11-03">) so I can then extract the date and iterate over the exchange rates in the child Cube nodes.
Can anyone help?
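Either of these will work once the default namespace has been bound to a prefix (in XPath 1.0 an unprefixed Cube matches only elements in no namespace, and these Cube elements are in the eurofxref namespace). Using xmlns as that prefix:
/gesmes:Envelope/xmlns:Cube/xmlns:Cube - direct path from the root
//xmlns:Cube[@time] - all Cube nodes (at any level) with a time attribute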
OK, this is tested and working:
arrNS = ["xmlns:http://www.ecb.int/vocabulary/2002-08-01/eurofxref", "gesmes:http://www.gesmes.org/xml/2002-08-01"]
doc.find("//xmlns:Cube[@time]", arrNS)