r data-wrangling network-analysis

R: how to sum a column's values across two rows that share the same pair of strings in two other columns


I am constructing an edge list for a network. I would like to sum the values in the third column whenever two rows contain the same pair of nodes in the first two columns, regardless of order (so A-B and B-A count as the same edge). My data looks like this:

ego    alter   weight
A      B       12
B      A       10
C      D       5
D      C       2
E      F       7
F      E       6

The dataset I expect is like this:

ego    alter   weight
A      B       22
C      D       7
E      F       13

How can I achieve the expected result?


Solution

  • A base R option using pmin/pmax + aggregate

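    # Order each pair alphabetically so that A-B and B-A become the same key,
    # then sum the weights within each (ego, alter) pair.
    # Assumes ego and alter are character vectors, not factors.
    aggregate(
        weight ~ .,
        transform(
            df,
            ego = pmin(ego, alter),
            alter = pmax(ego, alter)
        ),
        sum
    )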
    

    gives

      ego alter weight
    1   A     B     22
    2   C     D      7
    3   E     F     13
    

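    Alternatively, we can use igraph, treating the edge list as an undirected graph and merging the parallel edges:

    library(igraph)

    df %>%
        graph_from_data_frame(directed = FALSE) %>%
        # simplify() merges multi-edges; by default the weight attribute is summed
        simplify() %>%
        # get.data.frame() is deprecated in recent igraph releases
        igraph::as_data_frame(what = "edges")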
    

    which gives

      from to weight
    1    A  B     22
    2    C  D      7
    3    E  F     13
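
To see why the base R option works, it may help to run the two steps separately. The sketch below is only an illustration: the data frame is a hypothetical reconstruction of the table in the question (the name `df` matches the snippets above), and the intermediate name `canon` is made up for this example. It assumes `ego` and `alter` are character vectors; `pmin()`/`pmax()` are not meaningful on unordered factors, so factor columns would need `as.character()` first.

```r
# Hypothetical reconstruction of the question's data.
# stringsAsFactors = FALSE keeps ego/alter as character on older R versions.
df <- data.frame(
    ego    = c("A", "B", "C", "D", "E", "F"),
    alter  = c("B", "A", "D", "C", "F", "E"),
    weight = c(12, 10, 5, 2, 7, 6),
    stringsAsFactors = FALSE
)

# Step 1: put each pair in a canonical order (alphabetically smaller label first).
# transform() evaluates both expressions against the original columns,
# so `alter` is computed from the untouched `ego`.
canon <- transform(
    df,
    ego   = pmin(ego, alter),
    alter = pmax(ego, alter)
)
print(canon)

# Step 2: sum the weights within each (ego, alter) pair.
result <- aggregate(weight ~ ego + alter, data = canon, FUN = sum)
print(result)
```

After step 1, the rows `A B` and `B A` both read `A B`, so step 2 can group them with an ordinary aggregation. The same idea carries over to other grouping tools, since only the canonical ordering is specific to this problem.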