I need to visualize the connections between developers in repo1 and repo2.
In particular, I need to show:
I've used networkD3 to create a simple network, but I'm struggling with the next steps.
library(networkD3)
df <- data.frame(
  devs  = c("jeff", "jeff", "james", "james", "mary", "alfred", "maggie"),
  repos = c("repo1", "repo2", "repo1", "repo2", "repo1", "repo2", "repo1"),
  comm  = c("3", "3", "6", "6", "3", "3", "3")
)
simpleNetwork(df)
Desired output:
I am new to network graphs, so if this isn't doable with networkD3 I'm open to suggestions!
You can restructure your data and use networkD3::forceNetwork()
...
library(dplyr)
library(tidyr)
# Links: one row per developer who contributed to both repos
links <-
  df %>%
  mutate(node = paste0(devs, "_", repos)) %>%
  pivot_wider(id_cols = c(devs, comm), names_from = repos, values_from = node) %>%
  filter(!is.na(repo1) & !is.na(repo2)) %>%
  mutate(value = 1L) %>%
  select(source = repo1, target = repo2, value)

# Nodes: one row per developer/repo pair, sized by commit count
nodes <-
  df %>%
  mutate(id = paste0(devs, "_", repos)) %>%
  mutate(node_size = as.numeric(comm) * 30) %>%
  select(id, name = devs, group = repos, node_size)

# forceNetwork() expects 0-based row indices into nodes
links$source_id <- match(links$source, nodes$id) - 1L
links$target_id <- match(links$target, nodes$id) - 1L

forceNetwork(Links = links, Nodes = nodes, Source = "source_id",
             Target = "target_id", Value = "value", NodeID = "id",
             Nodesize = "node_size", Group = "group", opacity = 1L,
             opacityNoHover = 1L, fontSize = 14L)
links
#> # A tibble: 2 × 5
#>   source      target      value source_id target_id
#>   <chr>       <chr>       <int>     <int>     <int>
#> 1 jeff_repo1  jeff_repo2      1         0         1
#> 2 james_repo1 james_repo2     1         2         3
source_id and target_id are the 0-indexed row positions of the link's two nodes in the nodes data frame
value should be 1 unless you want to define different values for the weight of the link
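The two points above can be sketched in base R without any plotting. This is only an illustration: it rebuilds the question's data, and the choice to weight each link by the smaller of the developer's two commit counts is my own assumption, not something networkD3 requires.

```r
# Data copied from the question (comm is stored as character there)
df <- data.frame(
  devs  = c("jeff", "jeff", "james", "james", "mary", "alfred", "maggie"),
  repos = c("repo1", "repo2", "repo1", "repo2", "repo1", "repo2", "repo1"),
  comm  = c("3", "3", "6", "6", "3", "3", "3"),
  stringsAsFactors = FALSE
)
df$id <- paste0(df$devs, "_", df$repos)

# Keep only developers who appear in both repos (merge() is an inner join here)
r1 <- df[df$repos == "repo1", ]
r2 <- df[df$repos == "repo2", ]
both <- merge(r1, r2, by = "devs", suffixes = c("_1", "_2"))

links <- data.frame(
  source = both$id_1,
  target = both$id_2,
  # Assumption for illustration: link weight = smaller of the two commit counts
  value  = pmin(as.numeric(both$comm_1), as.numeric(both$comm_2)),
  stringsAsFactors = FALSE
)

# match() gives 1-based row positions in df; subtract 1 to get 0-based indices
links$source_id <- match(links$source, df$id) - 1L
links$target_id <- match(links$target, df$id) - 1L

# Sanity check: every endpoint was found and refers to an existing row
stopifnot(!anyNA(links$source_id), !anyNA(links$target_id))
stopifnot(all(c(links$source_id, links$target_id) < nrow(df)))
```

Here df doubles as the nodes table, so the indices refer to its rows. If you reorder or filter the nodes afterwards, the indices have to be recomputed.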
nodes
#> # A tibble: 7 × 4
#>   id           name   group node_size
#>   <chr>        <chr>  <chr>     <dbl>
#> 1 jeff_repo1   jeff   repo1        90
#> 2 jeff_repo2   jeff   repo2        90
#> 3 james_repo1  james  repo1       180
#> 4 james_repo2  james  repo2       180
#> 5 mary_repo1   mary   repo1        90
#> 6 alfred_repo2 alfred repo2        90
#> 7 maggie_repo1 maggie repo1        90
id or name is the name of the node that will be displayed in the plot (there can be more than one node with the same name if you want)
group is the group the node is in (these are arbitrary group names and they can all be the same or not)
node_size gives the size of the node in the plot
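To see why the commit counts are multiplied by 30, it may help to look at how the size value turns into a radius. As far as I recall, forceNetwork() has a radiusCalculation argument whose default is roughly Math.sqrt(d.nodesize) + 6, but please check ?forceNetwork for your installed version. The sketch below assumes that formula and reproduces it in plain R; the default_radius helper is hypothetical and not part of networkD3.

```r
# Commit counts from the question, stored as character there
comm <- c(jeff = "3", james = "6")

# Hypothetical helper mirroring the assumed default: sqrt(nodesize) + 6
default_radius <- function(node_size) sqrt(node_size) + 6

raw_size    <- as.numeric(comm)        # 3 and 6
scaled_size <- as.numeric(comm) * 30   # 90 and 180, as in the nodes table

radius_raw    <- default_radius(raw_size)
radius_scaled <- default_radius(scaled_size)

# Difference in radius between the 6-commit and the 3-commit developer
gap_raw    <- unname(radius_raw["james"] - radius_raw["jeff"])
gap_scaled <- unname(radius_scaled["james"] - radius_scaled["jeff"])
```

Under that assumption the unscaled counts give radii that differ by less than one pixel, while the scaled ones differ by several, which is why a multiplier makes the size difference visible.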
The column names in your links and nodes data frames need to be explicitly specified in the forceNetwork() function call, e.g.
Source = "source_id"
Target = "target_id"
Value = "value"
NodeID = "id"
Nodesize = "node_size"
Group = "group"