I have two df to confrontate. my first df is "sum"
> head(sum)
File_pdb Res1 Chain1 Res2 Chain2
1: 7LD1_CM GLN 81 M ASN 501 C
2: 7LD1_CM TYR 128 M PHE 377 C
3: 7LD1_CM ILE 78 M SER 375 C
4: 7LD1_CM ASN 76 M ALA 372 C
5: 7LD1_CM THR 20 M TYR 369 C
6: 7LD1_CM ARG 408 C LEU 131 M
The second one is "mut"
> head(mut)
RefAA MutAA LineagesCount
1 VAL 3 GLY 3 1
2 LEU 5 PHE 5 2
3 LEU 8 VAL 8 1
4 SER 13 ILE 13 2
5 LEU 18 PHE 18 5
6 THR 20 ILE 20 1
I have to check if in sum$res1 and sum$res2 there are values equal to mut$refAA. If it's so, I need to add the whole row of mut$refAA near to sum$res1 or sum$res2.
here an example:
File_pdb Res1 Chain1 Res2 Chain2 RefAA MutAA LineagesCount
1: 7LD1_CM GLN 81 M ASN 501 C
2: 7LD1_CM TYR 128 M PHE 377 C
3: 7LD1_CM ILE 78 M SER 375 C
4: 7LD1_CM ASN 76 M ALA 372 C
5: 7LD1_CM THR 20 M TYR 369 C THR 20 ILE 20 1
6: 7LD1_CM ARG 408 C LEU 131 M
How I can do this? I was trying something using merge and join functions but I'm not so experienced so I need to practice more. Can someone help me? Thank you!
I had to fix the data a bit, to easily import the data. Then you can try a tidyverse
library(tidyverse)
SUM %>%
mutate(index = 1:n()) %>%
pivot_longer(c(Res1, Res2)) %>%
left_join(mutate(MUT, value=RefAA), by = "value") %>%
group_by(index) %>%
fill(MutAA, RefAA, LineagesCount, .direction = "downup") %>%
ungroup() %>%
pivot_wider(names_from = name, values_from = value, values_fn = toString) %>%
mutate(which_Res = ifelse(RefAA == Res1, "Res1", "Res2"))
# A tibble: 6 x 10
File_pdb Chain1 Chain2 index RefAA MutAA LineagesCount Res1 Res2 which_Res
<chr> <chr> <chr> <int> <chr> <chr> <int> <chr> <chr> <chr>
1 7LD1_CM M C 1 NA NA NA GLN81 ASN501 NA
2 7LD1_CM M C 2 NA NA NA TYR128 PHE377 NA
3 7LD1_CM M C 3 NA NA NA ILE78 SER375 NA
4 7LD1_CM M C 4 NA NA NA ASN76 ALA372 NA
5 7LD1_CM M C 5 THR20 ILE20 1 THR20 TYR369 Res1
6 7LD1_CM C M 6 NA NA NA ARG408 LEU131 NA
The data
SUM <- read.table(text = " File_pdb Res1 Chain1 Res2 Chain2
1: 7LD1_CM GLN81 M ASN501 C
2: 7LD1_CM TYR128 M PHE377 C
3: 7LD1_CM ILE78 M SER375 C
4: 7LD1_CM ASN76 M ALA372 C
5: 7LD1_CM THR20 M TYR369 C
6: 7LD1_CM ARG408 C LEU131 M")
SUM
MUT <- read.table(text = " RefAA MutAA LineagesCount
1 VAL3 GLY3 1
2 LEU5 PHE5 2
3 LEU8 VAL8 1
4 SER13 ILE13 2
5 LEU18 PHE18 5
6 THR20 ILE20 1")