【Julia: DataFrame】 How to combine two lines into one using `|>`


Consider the following:

df = DataFrame(A=[-1,missing,1],B=[10,20,30])

3×2 DataFrame
 Row │ A        B
     │ Int64?   Int64
─────┼────────────────
   1 │      -1     10
   2 │ missing     20
   3 │       1     30

cond = df.A .> 0.                      #1
cond = replace(cond, missing => false) #2
df[cond,:]

Is it possible to combine #1 and #2 into one line using `|>`, for example like this?

cond = df.A .> 0 |> replace(missing => false)

This attempt fails.


Solution

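  • The `|>` operator in Base only passes its left-hand side to a function of one argument, and Base has no one-argument form `replace(missing => false)` that returns such a function. Piping into an anonymous function works, provided `df.A .> 0` is wrapped in parentheses, because `|>` binds more tightly than `.>`: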

    using DataFrames
    df = DataFrame(A=[-1,missing,1],B=[10,20,30])
    
    cond = (df.A .> 0) |> x -> replace(x, missing => false)
    cond
    #3-element Vector{Bool}:
    # 0
    # 0
    # 1
    
    # Or using coalesce.
    cond = (df.A .> 0) |> x -> coalesce.(x, false)
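
    The need for the parentheses can be checked by asking the parser how it groups the unparenthesized expression. This is a small sketch for illustration only (`a` and `f` are placeholder names, not from the question); it should show the comparison as the outermost call, with the pipe applied to `0` alone:

    ```julia
    # Parse the unparenthesized form; `a` and `f` are placeholder names.
    ex = Meta.parse("a .> 0 |> f")

    # The outermost call is the comparison, and the pipe sits inside its
    # right operand, i.e. the expression is read as a .> (0 |> f).
    outer_op = ex.args[1]       # operator of the outermost call
    right_operand = ex.args[3]  # right-hand side of that operator

    println(outer_op)
    println(right_operand)
    ```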
    

    Another possibility is the package Pipe.jl: prefix the expression with @pipe and use _ as the placeholder for the piped value.

    using Pipe: @pipe
    cond = @pipe (df.A .> 0) |> replace(_, missing => false)
    

    Besides Pipe.jl, this also works with Hose.jl:

    using Hose
    cond = @hose (df.A .> 0) |> replace(_, missing => false)
    
    #or
    cond = @hose (df.A .> 0) |> replace(missing => false)
    

    Or with Plumber.jl:

    using Plumber
    cond = @pipe (df.A .> 0) |> replace(_, missing => false)
    
    #or
    @pipe cond = (df.A .> 0) |> replace(_, missing => false)
    

    Or with Chain.jl:

    using Chain
    cond = @chain df.A .> 0 replace(_, missing => false)
    
    #or
    cond = @chain df.A .> 0 replace(missing => false)
    

    Or with Lazy.jl:

    using Lazy
    cond = @> df.A .> 0 replace(missing => false)
    
    #or
    cond = @as x df.A .> 0 replace(x, missing => false)
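
    If no extra package is wanted, the curried style from the question can be approximated in Base by wrapping replace in a small closure. The following is a minimal sketch, not an established API: `replacing` is a hypothetical helper name introduced here for illustration, and a plain vector stands in for `df.A`.

    ```julia
    # Hypothetical helper (not part of Base or DataFrames): returns a
    # one-argument function that applies `replace` with the given pairs,
    # so it can be used on the right-hand side of `|>`.
    replacing(old_new::Pair...) = x -> replace(x, old_new...)

    A = [-1, missing, 1]   # stand-in for df.A from the question

    # Parentheses are needed because `|>` binds more tightly than `.>`.
    cond = (A .> 0) |> replacing(missing => false)

    # The same mask without a pipe, using a broadcasted coalesce.
    cond2 = coalesce.(A .> 0, false)
    ```

    With the DataFrame from the question, either mask can then be used as before, for example df[cond, :].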