ruby string variable-assignment object-identity

Why do some Ruby methods like String#replace mutate copies of variables?


So first off I'm just learning Ruby and coming from a JavaScript background. I have a problem that I can't find an answer to. I have this example:

a = 'red'
b = a
b.replace('blue')
b = 'green'
print a

blue

My question is: why is this a thing? I understand that setting b = a gives them the same object_id, so there are technically two names for the same string. But I don't ever see a reason to use this sort of recursive value change. If I'm setting b = a it's because I want to manipulate the value of a without changing it.

Furthermore, it seems sometimes a method will modify a, but sometimes it will cause "b" to become a new object. This seems ambiguous and makes no sense.

When will I ever use this? What is the point? Does this mean I can't pass the value of a into another variable without any changes propagating back to a?


Solution

  • The issue here is not called recursion, and Ruby variables are not recursive (for any normal meaning of the word - i.e. they don't reference themselves, and you don't need recursive routines in order to work with them). Recursion in computer programming is when code calls itself, directly or indirectly, such as a function that contains a call to itself.

    In Ruby, all variables point to objects. This is without exception: although there are some internal tricks to make things fast, even writing a = 5 creates a variable called a and "points" it to the Integer object 5 (a Fixnum in Ruby versions before 2.4). Careful language design means you almost don't notice this happening. Most importantly, numbers cannot change (you cannot change a 5 into a 6, they are always different objects), so you can think that somehow a "contains" a 5 and get away with it even though technically a points to 5.
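
    A minimal sketch of the point about numbers, using equal? (which compares object identity) to check what each name refers to. The variable names are only for illustration:

    ```ruby
    a = 5
    b = a

    # Both names refer to the very same Integer object.
    puts a.equal?(b)      # true

    # Integers have no mutating methods; += just rebinds b to a different object.
    b += 1
    puts a                # 5
    puts b                # 6
    puts a.equal?(b)      # false

    # Integers are frozen, so there is nothing to change in place.
    puts 5.frozen?        # true
    ```

    Because nothing can alter the 5 itself, sharing it between a and b is never observable, which is why the "a contains 5" mental model works for numbers.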

    With Strings though, the objects can be changed. A step-by-step explanation of your example code might read like this:

    a = 'red'
    

    Creates a new String object with the contents "red", and points variable a at it.

    b = a
    

    Points variable b at the same object as a.

    b.replace('blue')
    

    Calls the replace method on the object pointed to by b (and also pointed to by a). The method alters the contents of the String to "blue".

    b = 'green'
    

    Creates a new String object with the contents "green", and points variable b at it. The variables a and b now point to different objects.

    print a 
    

    The String object pointed to by a has contents "blue". So it is all working correctly, according to the language spec.
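
    The walkthrough above can be checked by running the same statements with a few equal? calls added; equal? is true only when both names refer to the same object. This is just the original example with checks added, not a different technique:

    ```ruby
    a = 'red'
    b = a
    puts a.equal?(b)    # true: one object, two names

    b.replace('blue')   # changes the contents of that one object
    puts a              # blue
    puts a.equal?(b)    # true: still the same object

    b = 'green'         # rebinds b to a brand-new String
    puts a              # blue
    puts b              # green
    puts a.equal?(b)    # false: a and b now name different objects
    ```

    The key distinction is between calling a method on the object (b.replace) and assigning to the variable (b = ...). Only the first can affect what a sees.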

    When will I ever use this?

    All the time. In Ruby you use variables to point, temporarily, to objects, in order to call methods on them. The objects are the things you want to work with, the variables are the names in your code you use to reference them. The fact that they are separate can trip you up from time to time, especially with Strings: in Ruby they can be changed in place, whereas in many other languages (JavaScript included) strings are immutable.

    and does this mean I can't pass the value of "a" into another variable without any changes recursing back to "a"?

    If you want to copy a String, there are a few ways to do it. E.g.

    b = a.clone
    

    or

    b = "#{a}"
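
    A short sketch showing that a copy really is independent of the original. It uses dup, which behaves like clone for this purpose (clone additionally preserves the frozen state and any singleton methods of the original):

    ```ruby
    a = 'red'
    b = a.dup           # shallow copy: a new String with the same contents
    c = "#{a}"          # interpolation also builds a new String

    b.replace('blue')
    c << '!'

    puts a              # red
    puts b              # blue
    puts c              # red!
    puts a.equal?(b)    # false
    ```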
    

    However, in practice you rarely just want to make direct copies of strings. You will want to do something else that is related to the goal of your code. Usually in Ruby, there will be a method that does the manipulation that you need and returns a new String, so you would do something like this:

    b = a.something
    

    In other cases, you actually will want changes to be made to the original object. It all depends on what the purpose of your code is. In-place changes to String objects can be useful, so Ruby supports them.
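
    As one concrete illustration, upcase returns a new String while upcase! changes the receiver in place. Many in-place String methods end in ! by convention, but not all of them do (replace and << are examples without it):

    ```ruby
    a = 'red'

    b = a.upcase        # returns a new String; a is untouched
    puts a              # red
    puts b              # RED

    c = a               # second name for the same object as a
    c.upcase!           # bang method: changes that object in place
    puts a              # RED
    puts a.equal?(c)    # true
    ```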

    Furthermore it seems sometimes a method will recurse into "a" and sometimes it will cause "b" to become a new object_id.

    This is never the case. No method will change an object's identity. However, most methods will return a new object. Some methods will change an object's contents, and those are the methods you need to be more aware of in Ruby, because they may change data that is being used elsewhere. The same is true in other OO languages; JavaScript objects are no exception and behave in exactly the same way.
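
    A rough sketch of how this shows up when a String is passed to a method. The helpers shout and shout! are hypothetical names made up for this example; the first returns a new object, the second changes the contents of the object it was given:

    ```ruby
    # Hypothetical helper: returns a new String and leaves its argument alone.
    def shout(text)
      text.upcase + '!'
    end

    # Hypothetical helper: changes the String it was given, in place.
    def shout!(text)
      text.upcase!
      text << '!'
    end

    greeting = 'hello'
    id_before = greeting.object_id

    loud = shout(greeting)
    puts greeting       # hello
    puts loud           # HELLO!

    shout!(greeting)
    puts greeting       # HELLO!
    puts greeting.object_id == id_before   # true: contents changed, identity did not
    ```

    In both calls the method receives a reference to the caller's object; the only difference is whether the method body calls something that mutates it.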