
Keep duplicates when using `make_clean_names` from the R janitor package


I am trying to clean a character column using the make_clean_names function from the janitor package in R. In this case I need to keep the duplicates rather than have a numeric suffix appended to them. Is this possible? My code looks like this:

x <- c(' x y z', 'xyz', 'x123x', 'xy()','xyz','xyz')

janitor::make_clean_names(x)
[1] "x_y_z" "xyz"   "x123x" "xy"    "xyz_2" "xyz_3"

janitor::make_clean_names(x, unique_sep = '.')
[1] "x_y_z" "xyz"   "x123x" "xy"    "xyz.1" "xyz.2"

janitor::make_clean_names(x, unique_sep = NULL)
[1] "x_y_z" "xyz"   "x123x" "xy"    "xyz_2" "xyz_3"

Using unique_sep = NULL doesn't seem to work. Is there any other way to keep the duplicate values?

Desired Output:

[1] "x_y_z" "xyz"   "x123x" "xy"    "xyz" "xyz"

I know how to use regular expressions to do this. Just searching for a shortcut.

PS: I know this function was created to clean the names of a data.frame; I am trying to apply it to a different use case. This functionality would help a lot when cleaning character columns.


Solution

  • Update: As of janitor 2.2.0, this is now possible with allow_dupes = TRUE:

    x <- c(' x y z', 'xyz', 'x123x', 'xy()','xyz','xyz')
    janitor::make_clean_names(x, allow_dupes = TRUE)
    
    [1] "x_y_z" "xyz"   "x123x" "xy"    "xyz"   "xyz"  
    

    This replaces my original answer, which is now obsolete.
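
If you are stuck on a janitor version older than 2.2.0, one possible workaround is to clean each value on its own, so the function never sees a duplicate to de-duplicate. The sketch below is only an illustration under that assumption: `clean_keep_dupes` is a hypothetical helper name, not part of janitor, and it relies only on `make_clean_names()` accepting a single string.

```r
# Hypothetical helper (not part of janitor): clean every value independently
# so that no numeric suffix is ever appended to repeated values.
clean_keep_dupes <- function(x) {
  ux <- unique(x)  # clean each distinct raw value only once
  cleaned <- vapply(ux, janitor::make_clean_names, character(1),
                    USE.NAMES = FALSE)
  cleaned[match(x, ux)]  # map the cleaned values back to the original order
}

x <- c(' x y z', 'xyz', 'x123x', 'xy()', 'xyz', 'xyz')
clean_keep_dupes(x)
```

Because every string is cleaned in isolation, different raw values that happen to clean to the same name (for example 'xy()' and 'xy') also end up identical. That matches what is asked for here, but it may be slower than the built-in argument on long vectors with many distinct values.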