I have a data.table
df in R with four columns: id, start, end, score. The start and end range from 1 to 10 and id might have duplicates. I would like to transform this data.table into a new matrix B where each row corresponds to an id, and has 10 columns corresponding to each position from 1 to 10. If a position is within the start and end in df for a particular id, the corresponding cell in B should be the score, otherwise it should be 0.
eg:
library(data.table)
df <- data.table(id = c(1, 1, 2), start = c(1, 5, 1), end = c(3, 7, 2), score = c(10, 20, 30))
df
id start end score
1 1 1 3 10
2 1 5 7 20
3 2 1 2 30
Expected output:
id 1 2 3 4 5 6 7 8 9 10
1 10 10 10 0 20 20 20 0 0 0
2 30 30 0 0 0 0 0 0 0 0
First extending the data's existing score by row, then adding the missing rows and finally transpose
.
library(data.table)
Range <- 1:10
df[,.(id, Seq = seq(start, end, 1), score), by = seq_along(id)][
, transpose(.(sapply(Range, \(x)
max(fifelse(x == Seq, score, 0))))), by = id]
id V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
<num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1: 1 10 10 10 0 20 20 20 0 0 0
2: 2 30 30 0 0 0 0 0 0 0 0