I have a character variable containing codes describing project characteristics. It looks like this:
[1] "151" "510|130|130" "311|110" "140" "160|160" "160|160|130"
[7] "160" "160" "160" "151" "151" "160|110"
I need to extract the main characteristic of each project, meaning the code that dominates (occurs most often). If no code dominates, I choose the first one. The result should be:
[1] "151" "130" "311" "140" "160" "160"
[7] "160" "160" "160" "151" "151" "160"
Any suggestion on how to achieve this?
You can use strsplit to split your vector and collapse::fmode to get the value that "dominates" (the so-called statistical mode), or the first value if there is a tie (which is the default behavior of fmode):
x <- c("151", "510|130|130", "311|110")
as.numeric(sapply(strsplit(x, "\\|"), collapse::fmode))
#[1] 151 130 311
Other ways of making a mode function, which is not directly implemented in base R, can be found here.
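If you would rather not depend on collapse, a base R mode function can be sketched as follows. This is only one possible version, not the one from the linked answers: the helper name first_mode is made up for illustration, and it assumes every element contains at least one code (an empty string would make vapply fail). It returns character values, as in the desired output of the question.

```r
# Hypothetical helper: most frequent value, first one wins on ties.
first_mode <- function(v) {
  u <- unique(v)                  # distinct values, in order of first appearance
  counts <- tabulate(match(v, u)) # how often each distinct value occurs
  u[which.max(counts)]            # which.max() returns the first maximum
}

x <- c("151", "510|130|130", "311|110", "160|160|130")

# fixed = TRUE treats "|" literally, so no regex escaping is needed
main_code <- vapply(strsplit(x, "|", fixed = TRUE), first_mode, character(1))
main_code
#[1] "151" "130" "311" "160"
```

Because unique keeps values in order of first appearance and which.max picks the first maximum, ties fall back to the first code, which matches the rule in the question.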