[SOLVED] Move subgroup under repeated main group while keeping main group once in data.frame

I'm aware that the question is awkward. If I could phrase it better, I'd probably find the solution in another thread.

I have this data structure:

df <- data.frame(group = c("X", "F", "F", "F", "F", "C", "C"),
                 subgroup = c(NA, "camel", "horse", "dog", "cat", "orange", "banana"))

and would like to turn it into this:

data.frame(group = c("X", "F", "camel", "horse", "dog", "cat", "C", "orange", "banana"))

This turns out to be surprisingly confusing. Also, I would prefer not to use a loop.

I updated the example to make clear that solutions which depend on sorting unfortunately do not do the trick: the groups (X, F, C) are not in alphabetical order.

Solution

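Here is an (edited) answer that works with the new data. Using data.table helps a lot. The idea is to split the df into one piece per group and then lapply() over the pieces, building what we need for each group. We have to take care of a few things along the way: the original group order must be preserved, and the NA subgroup has to be dropped at the end.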

library(data.table)
# set as data.table
setDT(df)

# to maintain the original ordering, convert group to a factor;
# the levels (in order of first appearance) tell split() how to order the pieces
df[, group := factor(group, levels = unique(group))]

# split df into a list with one data.table per group
df_list <- split(df, df$group, sorted = FALSE)

# for each group, put the group name on top of its subgroups
df_list <- lapply(df_list, function(x) data.frame(group = unique(c(as.character(x$group), x$subgroup))))

# bind everything into one data.table and drop the NA rows
rbindlist(df_list)[!is.na(group)]

    group
1:      X
2:      F
3:  camel
4:  horse
5:    dog
6:    cat
7:      C
8: orange
9: banana
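
For anyone who would rather avoid the extra dependency, the same split-and-combine idea can probably be written in base R as well. This is only a sketch under the assumptions of the example data: the names g, pieces and result are made up for illustration, and NA subgroups are simply dropped inside each group instead of being filtered at the end.

```r
# Example data from the question
df <- data.frame(
  group    = c("X", "F", "F", "F", "F", "C", "C"),
  subgroup = c(NA, "camel", "horse", "dog", "cat", "orange", "banana"),
  stringsAsFactors = FALSE
)

# Factor with levels in order of first appearance, so that split()
# returns the groups in their original order instead of sorted
g <- factor(df$group, levels = unique(df$group))

# For each group: the group name once, followed by its non-missing subgroups
pieces <- lapply(split(df, g), function(x) {
  c(x$group[1], x$subgroup[!is.na(x$subgroup)])
})

# Flatten the list into a single one-column data.frame
result <- data.frame(group = unlist(pieces, use.names = FALSE),
                     stringsAsFactors = FALSE)
print(result)
```

Unlike the data.table version, this one does not call unique(), so repeated subgroup names within a group would be kept as they are.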