This is a small section of a dataset I'm working on.
dat2 <- read.table(text = "
nodepair V1 V2 V3 V4 V5 V6 V7 V8 V9 ES
1 A1_A1 0 21 0 0 0 0 0 0 78 45
2 A2_A1 0 0 0 0 0 0 0 0 99 45
3 A2_A2 0 1 0 0 0 0 0 0 98 45
4 A3_A1 0 0 0 0 0 6 1 3 89 45
5 A3_A2 0 0 0 0 0 0 0 0 99 45
6 A1_A1 0 20 0 0 0 0 0 0 65 46
7 A2_A1 0 0 0 0 0 0 0 0 85 46
8 A2_A2 0 1 0 0 0 0 0 0 84 46
9 A3_A1 0 0 0 0 2 6 3 3 71 46
10 A3_A2 0 0 0 0 0 0 0 0 85 46
11 A1_A1 0 25 0 0 0 0 0 0 45 47
12 A2_A1 0 0 0 0 0 0 0 0 70 47
13 A2_A2 0 1 0 0 0 0 0 0 69 47
14 A3_A1 0 0 0 0 0 8 0 1 61 47
15 A3_A2 0 0 0 0 0 0 0 0 70 47
16 A1_A1 0 37 0 0 0 0 0 0 77 48
17 A2_A1 0 0 0 0 0 0 0 0 114 48
18 A2_A2 0 0 0 0 0 0 0 0 114 48
19 A3_A1 0 0 0 0 2 9 0 3 100 48
20 A3_A2 0 0 0 0 0 0 0 0 114 48
", header = TRUE)
I'm trying to write a program that will do all pairwise comparisons (grouped by the nodepair) across the 'ES' groups.
I'd like to write a series of functions to specifically compare each pair of rows. For example, when a column among V1:V9 is > 0 for both ESs, this should result in 1 for that column, indicating presence of data.
I'm imagining the output to look something like this:
dat3 <- read.table(text = "
nodepair1 nodepair2 V1 V2 V3 V4 V5 V6 V7 V8 V9
A1_A1(45) A1_A1(46) 0 0 1 0 0 0 0 0 1
", header = TRUE)
etc.
Unfortunately, I haven't gotten very far:
dat2 <- dat2 %>%
group_by(nodepair) %>%
col2 = t(combn(nodepair,2)))
I'm pretty sure I need 'combn' here, but I'm very new to this function and can't figure it out.
Now that the OP has clarified their question, I'd propose the following solution:
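library(tidyverse)
# All unique pairs of ES values, as a list of length-2 vectors
ES_combs <- combn(unique(dat2$ES), 2, simplify = FALSE)
dat2 |>
  group_split(nodepair) |>
  # Outer map: one data frame per nodepair
  map(.x = _,
      # Inner map: one ES pair at a time; df is taken from the enclosing function
      .f = \(df) map(.x = seq_along(ES_combs),
                     .f = ~ df |>
                       filter(ES %in% ES_combs[[.x]]) |>
                       summarize(nodepair = first(nodepair),
                                 ES_1 = ES[1],
                                 ES_2 = ES[2],
                                 # 1 if the column is > 0 for both ES values, else 0
                                 across(V1:V9, ~ as.numeric(all(. > 0)))))) |>
  bind_rows()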
which gives:
# A tibble: 30 × 12
nodepair ES_1 ES_2 V1 V2 V3 V4 V5 V6 V7 V8 V9
<chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A1_A1 45 46 0 1 0 0 0 0 0 0 1
2 A1_A1 45 47 0 1 0 0 0 0 0 0 1
3 A1_A1 45 48 0 1 0 0 0 0 0 0 1
4 A1_A1 46 47 0 1 0 0 0 0 0 0 1
5 A1_A1 46 48 0 1 0 0 0 0 0 0 1
6 A1_A1 47 48 0 1 0 0 0 0 0 0 1
7 A2_A1 45 46 0 0 0 0 0 0 0 0 1
8 A2_A1 45 47 0 0 0 0 0 0 0 0 1
9 A2_A1 45 48 0 0 0 0 0 0 0 0 1
10 A2_A1 46 47 0 0 0 0 0 0 0 0 1
11 A2_A1 46 48 0 0 0 0 0 0 0 0 1
12 A2_A1 47 48 0 0 0 0 0 0 0 0 1
13 A2_A2 45 46 0 1 0 0 0 0 0 0 1
14 A2_A2 45 47 0 1 0 0 0 0 0 0 1
15 A2_A2 45 48 0 0 0 0 0 0 0 0 1
16 A2_A2 46 47 0 1 0 0 0 0 0 0 1
17 A2_A2 46 48 0 0 0 0 0 0 0 0 1
18 A2_A2 47 48 0 0 0 0 0 0 0 0 1
19 A3_A1 45 46 0 0 0 0 0 1 1 1 1
20 A3_A1 45 47 0 0 0 0 0 1 0 1 1
21 A3_A1 45 48 0 0 0 0 0 1 0 1 1
22 A3_A1 46 47 0 0 0 0 0 1 0 1 1
23 A3_A1 46 48 0 0 0 0 1 1 0 1 1
24 A3_A1 47 48 0 0 0 0 0 1 0 1 1
25 A3_A2 45 46 0 0 0 0 0 0 0 0 1
26 A3_A2 45 47 0 0 0 0 0 0 0 0 1
27 A3_A2 45 48 0 0 0 0 0 0 0 0 1
28 A3_A2 46 47 0 0 0 0 0 0 0 0 1
29 A3_A2 46 48 0 0 0 0 0 0 0 0 1
30 A3_A2 47 48 0 0 0 0 0 0 0 0 1
This probably needs a bit of explanation:
ES_combs holds every pairwise combination of the unique ES values, as a list of length-2 vectors.
The outer map is where we go through each group's data frame (one per nodepair, as produced by group_split). It is important to define an anonymous function here, because we have an inner map, so we can't use the .x parameter twice.
The inner map takes each combination pair from ES_combs and filters the current group's data to these two rows. We then apply the summarize part.
Finally, bind_rows merges everything into a nice tibble instead of leaving us with an annoyingly long list.
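To make the two building blocks more concrete, here is a minimal base-R sketch (no tidyverse needed) that shows what combn returns and checks one comparison by hand. The names es_values, es_pairs, a3_a1 and presence are made up for illustration; the numbers are copied from rows 4 and 9 of dat2 (nodepair A3_A1, ES 45 and 46), with only columns V5 to V9 kept for brevity.

```r
# All unique pairs of ES values, returned as a list of length-2 vectors
es_values <- c(45, 46, 47, 48)
es_pairs <- combn(es_values, 2, simplify = FALSE)
length(es_pairs)  # 6, i.e. choose(4, 2)
es_pairs[[1]]     # 45 46

# Hand-check of one comparison: nodepair A3_A1, ES 45 (first row) vs. ES 46 (second row)
a3_a1 <- data.frame(
  V5 = c(0, 2),
  V6 = c(6, 6),
  V7 = c(1, 3),
  V8 = c(3, 3),
  V9 = c(89, 71)
)

# For each column: 1 if the value is > 0 in both rows, 0 otherwise
presence <- sapply(a3_a1, function(col) as.numeric(all(col > 0)))
presence
```

The result should agree with row 19 of the tibble above (V5 = 0, V6 to V9 = 1), which is the same all(. > 0) rule that across() applies inside summarize.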