Hi, I have a DataFrame in which I want to replace a particular value, such as 99, with a missing value. It's difficult to do this because you cannot change the element type of an existing column in place. One way to do this would be:
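using DataFrames
df = DataFrame(
:x1 => [1, 2, 99],
:x2 => [10, 99, 11],
:x3 => [20, 21, 22]
)
# replace (non-mutating) returns new vectors whose element type allows missing
amended_cols = replace.(eachcol(df), 99 => missing)
# pair each column name with its amended column and rebuild the DataFrame
df_n = map((x, y) -> x => y, names(df), amended_cols) |> DataFrame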
julia> @show df_n
df_n = 3×3 DataFrame
 Row │ x1       x2       x3
     │ Int64?   Int64?   Int64?
─────┼──────────────────────────
   1 │       1       10      20
   2 │       2  missing      21
   3 │ missing       11      22
However, I am wondering if there is an easier or better way to do this, such as using replace! to mutate the existing DataFrame. Doing that directly results in a type conversion error, because the columns have element type Int64 and cannot hold missing:
replace!.(eachcol(df), 99 => missing)
ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type Int64
You can use the allowmissing! function (see the official DataFrames.jl docs), which changes the type of each column in the DataFrame to allow missing values. That is, the element type of each column changes from T to Union{T, Missing}, where T is the original element type of the column, and Union{T, Missing} means that a value can be either of type T or the special missing value.
df = DataFrame(
:x1 => [1,2,99],
:x2 => [10,99,11],
:x3 => [20,21,22]
)
# Converting df to allow missing data
allowmissing!(df)
replace!.(eachcol(df), 99 => missing)
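To see the type change described above, here is a minimal sketch that checks the element type of one column before and after allowmissing!. The variable names before and after are only illustrative, and the smaller two-column DataFrame is an assumption made to keep the example short.

```julia
using DataFrames

df = DataFrame(x1 = [1, 2, 99], x2 = [10, 99, 11])

before = eltype(df.x1)   # Int64: the column cannot store missing yet
allowmissing!(df)        # widens every column in place
after = eltype(df.x1)    # Union{Missing, Int64}

# Now the in-place replacement no longer throws a MethodError
replace!(df.x1, 99 => missing)

@show before after
```

Once the column type is widened, replace! works on each column because missing is now a valid element.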
The exclamation mark ! in allowmissing!(df) means that it directly changes df instead of creating a new DataFrame. If you want to create a new DataFrame, the code could be:
df = DataFrame(
:x1 => [1,2,99],
:x2 => [10,99,11],
:x3 => [20,21,22]
)
df_n = allowmissing(df)
df_n = DataFrame(replace.(eachcol(df_n), 99 => missing), names(df_n))
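As a possible shorter variant of the non-mutating approach, the sketch below uses mapcols from DataFrames.jl, which applies a function to every column and returns a new DataFrame. It relies on the non-mutating replace widening the element type by itself, so no prior allowmissing call should be needed. This is one option under those assumptions, not necessarily the preferred idiom.

```julia
using DataFrames

df = DataFrame(x1 = [1, 2, 99], x2 = [10, 99, 11], x3 = [20, 21, 22])

# replace (without !) returns a new vector whose element type allows missing;
# mapcols collects the transformed columns into a new DataFrame
df_n = mapcols(col -> replace(col, 99 => missing), df)

# The original df is left untouched
@show df_n
```

Note that every column in df_n ends up with element type Union{Missing, Int64}, including x3, which contains no 99.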