Trouble transforming column in Julia DataFrame based on condition on rows


I am wondering what I am doing wrong with this conditional transform in Julia's DataFrames. I tried following the examples here but could not translate the instructions into a function that transforms only the rows it needs to.

My code:

function cleaner(name,cc)
    if name == "svietnam" 
        "DRV" 
    elseif name == "vietnam" 
        "RVN"  
    else
        Missing
    end

end
formal_long_vietnamcleaned = transform(formal_long, [:name, :country_code] => cleaner => :cleaned_cc )

Somehow, the string comparison does not work. When I checked whether the strings exist in the table with

formal_long_vietnamcleaned[formal_long_vietnamcleaned.name .== "svietnam", :]

I definitely got matches. I do not understand how to write a useful function for transform. I would appreciate your help!


Solution

  • You want ByRow(cleaner). Without it, name and cc are the entire columns (vectors), so the comparison is something like ["svietnam"] == "svietnam", which is false. ByRow applies the function row-wise, so the function receives the individual elements of each column.

    Also, missing should be lowercase: Missing is the type, while missing is the value.
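
Putting both fixes together, a minimal sketch might look like the following. The three-row formal_long table is a made-up stand-in for the asker's data (only the column names come from the question), so treat it as an illustration rather than the exact setup.

```julia
using DataFrames

# Hypothetical stand-in for the asker's `formal_long` table;
# only the column names are taken from the question.
formal_long = DataFrame(
    name = ["svietnam", "vietnam", "france"],
    country_code = ["VNM", "VNM", "FRA"],
)

# Same mapping as in the question, but the fallback is the value
# `missing` instead of the type `Missing`.
function cleaner(name, cc)
    if name == "svietnam"
        "DRV"
    elseif name == "vietnam"
        "RVN"
    else
        missing
    end
end

# Comparing a whole column (a vector) to a string with `==` is false;
# the broadcast form `.==` compares element by element instead.
whole_column_equal = ["svietnam"] == "svietnam"
elementwise_equal = ["svietnam"] .== "svietnam"

# ByRow wraps `cleaner` so it is called once per row with scalar arguments.
formal_long_vietnamcleaned = transform(
    formal_long,
    [:name, :country_code] => ByRow(cleaner) => :cleaned_cc,
)
```

With this sample data, cleaned_cc holds "DRV", "RVN", and missing, and the original columns are kept alongside it because transform preserves them.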