ruby rexml

Why does the original element change when I modify a copy created with the .dup method? (Ruby and REXML)


I just tested the following steps in irb and got something odd:

require 'rubygems'
require 'rexml/document'  
include REXML

e1=Element.new("E1")
e2=Element.new("E2")
e1.add_element(e2)

e1Dup=e1.dup
puts e1
puts e1Dup

e1.delete_element(e1.elements[1])
puts e1
puts e1Dup

I only want to change e1, but the result shows that both elements were changed. How can this happen? The output is below:

<E1><E2/></E1>
<E1><E2/></E1>
<E1/>
<E1/>

Solution

  • Ruby's dup method makes only a shallow copy of an object. The copy gets its own set of instance variables, but those variables still reference the same underlying objects as the original; the data they point to is not duplicated. The child list that the elements method reads is one of those shared objects. So e1 and e1Dup are two distinct objects, but their children are the same, and deleting a child through e1 is visible through e1Dup as well. In C++ terms, two pointers refer to the same memory location. To duplicate the whole tree of elements, you would have to recursively copy each child node and attach the copies to the duplicated e1Dup.
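
    As a sketch of the recursive copy described above: REXML's Parent class, which Element inherits from, provides a deep_clone method that copies an element together with all of its descendants, so the recursion may not need to be written by hand. The variable name e1_copy is only illustrative, and this assumes a REXML version that ships deep_clone.

```ruby
require 'rexml/document'

# Build the same small tree as in the question.
e1 = REXML::Element.new("E1")
e1.add_element(REXML::Element.new("E2"))

# deep_clone copies the element and, recursively, every descendant,
# so the copy does not share child nodes with the original.
e1_copy = e1.deep_clone

# Remove the child from the original only.
e1.delete_element(e1.elements[1])

puts e1       # the original, now without its child
puts e1_copy  # the copy, which still has its own E2 child
```

    With a deep copy, changes to e1's children should no longer show up in the copy.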