In DataFrames, I have 4 columns of type String. I want to concatenate all of their values with a space.
Currently, I'm doing this:
transform(df, All() => ((a,b,c,d) -> a .* " " .* b .* " " .* c .* " " .* d) => :combined_col)
Is there a more concise way of doing this without using .* multiple times? Maybe using the join function?
P.S. I'm using this inside a @chain, so I want the same style of syntax, without indexing.
UPDATE: this works, but I have no idea why. Can someone explain?
transform(df, All() => ByRow((all...) -> join(all, " ")) => :combined)
Let me explain transform(df, All() => ByRow((all...) -> join(all, " ")) => :combined):
- ByRow applies the function row-wise to your data frame.
- The join function accepts an iterator as its first argument, so all must be an iterator (in your example, it is a tuple).
- The All() source passes the selected columns as consecutive positional arguments to the function. Therefore you need all... to turn the consecutive positional arguments into a tuple.
Instead of all... you could write:
transform(df, AsTable(All()) => ByRow(x -> join(x, " ")) => :combined)
The difference is that AsTable(All()) passes the selected columns as a single positional argument to the function (in the form of a named tuple). Therefore you already have an iterable to pass to join (since a named tuple is iterable).
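To make the difference concrete, here is a minimal sketch in plain Julia (no DataFrames needed) of what the row function receives in each case. The field names c1 to c4 and the variable names are hypothetical, chosen only for illustration:

```julia
# With All() => ByRow(...), each row arrives as separate positional arguments.
# The splat in the signature collects them into a tuple.
collect_args = (all...) -> all
row_as_tuple = collect_args("a", "b", "c", "d")

# With AsTable(All()) => ByRow(...), each row arrives as one NamedTuple.
# The field names stand in for (hypothetical) column names.
row_as_namedtuple = (c1 = "a", c2 = "b", c3 = "c", c4 = "d")

# join iterates over the values in both cases, so the results match.
joined_tuple = join(row_as_tuple, " ")
joined_namedtuple = join(row_as_namedtuple, " ")
```

Both calls to join produce the same string, which is why the two transform forms give the same column.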
Going back to your original question of how to use .* to get the result, the answer is:
transform(df, All() => ((x...) -> foldl((p, q) -> p .* " " .* q, x)) => :combined)
Note that you do not need ByRow in this case, as .* already does broadcasting. You would need it if you used * instead of .*:
transform(df, All() => ByRow((x...) -> foldl((p, q) -> p * " " * q, x)) => :combined)
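As a quick check, the four variants above can be run side by side on a small data frame. This is only a sketch: the column names and values are made up, and it assumes a DataFrames.jl version in which All() selects all columns, as in the question:

```julia
using DataFrames

# Hypothetical four-column data frame of strings
df = DataFrame(a = ["x1", "x2"], b = ["y1", "y2"],
               c = ["z1", "z2"], d = ["w1", "w2"])

# Row-wise join on a tuple built from positional arguments
r1 = transform(df, All() => ByRow((all...) -> join(all, " ")) => :combined)

# Row-wise join on a named tuple
r2 = transform(df, AsTable(All()) => ByRow(x -> join(x, " ")) => :combined)

# Whole-column fold with broadcasted .* (no ByRow needed)
r3 = transform(df, All() => ((x...) -> foldl((p, q) -> p .* " " .* q, x)) => :combined)

# Row-wise fold with scalar * (ByRow needed)
r4 = transform(df, All() => ByRow((x...) -> foldl((p, q) -> p * " " * q, x)) => :combined)

# All four should yield the same :combined column
same_result = r1.combined == r2.combined == r3.combined == r4.combined
```

Since transform takes the data frame as its first argument, each of these calls can also be used as a step inside @chain without indexing.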