I have a very large distance matrix (3678 x 3678) currently stored as a data frame. The columns are named "1", "2", "3" and so on, and the rows are named the same way. I need to find the values that are below 26 and different from 0, and put the results in a second data frame with two columns: the first with the index and the second with the value. For example:
value
318-516 22.70601
...
where 318 is the row index and 516 is the column index.
OK, I'm trying to recreate your situation (note: if you can, it's always helpful to include a few lines of your data with a dput command).
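As a minimal sketch of what that could look like, assuming a small made-up data frame called df (a hypothetical stand-in, not the real data), dput prints R code that anyone can paste to recreate the same object:

```r
# Hypothetical stand-in for the real data frame (assumption, not the asker's data)
df <- as.data.frame(matrix(c(0, 10, 30,
                             10, 0, 22.5,
                             30, 22.5, 0),
                           nrow = 3,
                           dimnames = list(as.character(1:3), as.character(1:3))))

# Take the top-left 2 x 2 corner and print R code that recreates it
small <- df[1:2, 1:2]
dput(small)
```

Pasting the printed structure(...) call into the question lets others rebuild exactly that corner of the data.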
You should be able to use filter and some simple tidyverse commands (if you don't know how they work, run them line by line, always selecting the commands up to the %>% to check what they are doing):
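library(tidyverse)
library(tidylog) # optional: gives you additional output on what each command does
# Creating some data that looks similar
data <- matrix(rnorm(25, mean = 26), ncol = 5)
# name the columns "1", "2", ... before converting, which avoids tibble's name-repair warning
colnames(data) <- as.character(1:5)
data <- as_tibble(data)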
data %>%
  # keep track of the row each value came from
  mutate(row = row_number()) %>%
  # one line per cell; names_prefix only matters if your columns are called V1, V2, ...
  pivot_longer(-row, names_to = "column", values_to = "values", names_prefix = "V") %>%
  # depending on what your column names look like, you might need to use a separate() command first
  # keep values that are different from 0 and below 26
  filter(values != 0 & values < 26) %>%
  # if you want, you can create an index column as well
  mutate(index = paste0(row, "-", column)) %>%
  # then you can get rid of the row and column
  select(-row, -column) %>%
  # move index to the front
  relocate(index)
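For a matrix of this size, a base R alternative (a different technique from the pipeline above) may also be worth a look. The sketch below assumes the data frame is called df (a hypothetical name, filled here with made-up values) and that the matrix is symmetric, so upper.tri() is used to report each pair only once; drop that condition if you want both 318-516 and 516-318.

```r
# Hypothetical 3 x 3 stand-in for the real 3678 x 3678 data frame
df <- as.data.frame(matrix(c(0, 10, 30,
                             10, 0, 22.5,
                             30, 22.5, 0),
                           nrow = 3,
                           dimnames = list(as.character(1:3), as.character(1:3))))

m <- as.matrix(df)

# Row/column positions of values that are non-zero and below 26;
# upper.tri() keeps only cells above the diagonal (assumes symmetry)
hits <- which(m != 0 & m < 26 & upper.tri(m), arr.ind = TRUE)

# Build the two-column result: "row-column" index and the value itself
result <- data.frame(
  index = paste(rownames(m)[hits[, "row"]], colnames(m)[hits[, "col"]], sep = "-"),
  value = m[hits]
)
result
```

Because which() with arr.ind = TRUE works directly on the matrix, this avoids reshaping all 3678 x 3678 cells into long format before filtering.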