I asked participants to write three words and to assign a number to each word. Some words have been written by several participants, other words by only one participant. Now I'm trying to create a set of variables, one for each word mentioned by participants, containing the value that each participant has assigned to that word (if s/he wrote that word). I wrote a function to do this, but it does not give the expected output. I guess that this is due to the wrong interpretation, in the function, of my character vector allwords.
I created the following sample data to illustrate the issue:
data <- data.frame(
  words1 = c("apple", "pear", "banana", "pear", "banana"),
  words2 = c("pear", "banana", "pear", "banana", "cherry"),
  words3 = c("banana", "ananas", "apple", "melon", "pear"),
  value1 = c(2, 1, 2, 0, 1),
  value2 = c(2, 0, 0, 2, 0),
  value3 = c(0, 2, 2, 1, 1)
)
allwords <- c("apple", "pear", "banana", "ananas", "melon", "cherry")
attach(data)
head(data)
  words1 words2 words3 value1 value2 value3
1  apple   pear banana      2      2      0
2   pear banana ananas      1      0      2
3 banana   pear  apple      2      0      2
4   pear banana  melon      0      2      1
5 banana cherry   pear      1      0      1
I want to create a set of vectors, each one dedicated to one of the words in allwords, reporting the value that each participant assigned to that word (NA if no value assigned). This is the output I am trying to get:
apple pear banana ananas melon cherry
    2    2      0     NA    NA     NA
   NA    1      0      2    NA     NA
    2    0      2     NA    NA     NA
   NA    0      2     NA     1     NA
   NA    1      1     NA    NA      0
I wrote this function to achieve this:
value.f <- function(y){
  values.w[[y]] <- NA
  value.var <- values.w[[y]]
  value.var[which(data$words1 == y)] <- data$value1
  value.var[which(data$words2 == y)] <- data$value2
  value.var[which(data$words3 == y)] <- data$value3
}
values.w <- list()
values.w <- lapply(allwords, value.f)
names(values.w) <- c(allwords)
But what I get is, for each word, the content of data$value3. Basically, all "which" conditions are found true, but I do not understand why.
as.data.frame(values.w)
apple pear banana ananas melon cherry
    0    0      0      0     0      0
    2    2      2      2     2      2
    2    2      2      2     2      2
    1    1      1      1     1      1
    1    1      1      1     1      1
I do not understand what I am doing wrong, but I struggle a lot with character vectors in lapply() functions, so I guess that this is the kind of issue I have here. I tried eval(parse(text=y)) and paste0(y), but neither of these works.
One possibility uses {dplyr}, whose function bind_rows can row-bind dataframes of varying variable composition without complaining that the variables don't match (as rbind would).
##   words1 words2 words3 value1 value2 value3
## 1  apple   pear banana      2      2      0
## 2   pear banana ananas      1      0      2
## 3 banana   pear  apple      2      0      2
## 4   pear banana  melon      0      2      1
## 5 banana cherry   pear      1      0      1
A helper function that takes the first itemcount items of a row as (column) names and the second itemcount items as values:
row_to_named_list <- \(row, itemcount = 3){
  setNames(row[1:itemcount + itemcount],
           row[1:itemcount]
  )
}
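To see what the helper produces, it can be applied to a single row. The sketch below rebuilds the example data so that it runs on its own. It writes the helper with the function keyword instead of \(...) only so that it also runs on R versions before 4.1, and it sets stringsAsFactors = FALSE so the word columns are character vectors on any R version. first_row is just an illustrative name.

```r
# Example data from the question, rebuilt so this snippet is self-contained.
data <- data.frame(
  words1 = c("apple", "pear", "banana", "pear", "banana"),
  words2 = c("pear", "banana", "pear", "banana", "cherry"),
  words3 = c("banana", "ananas", "apple", "melon", "pear"),
  value1 = c(2, 1, 2, 0, 1),
  value2 = c(2, 0, 0, 2, 0),
  value3 = c(0, 2, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Same helper as above: the value columns (4:6) are renamed
# after the words found in columns 1:3 of the same row.
row_to_named_list <- function(row, itemcount = 3) {
  setNames(row[1:itemcount + itemcount], row[1:itemcount])
}

first_row <- row_to_named_list(data[1, ])
print(first_row)
```

For the first participant this should give a one-row dataframe whose columns are named apple, pear and banana. Because every row ends up with its own set of column names, the rows then need a row-binding function that tolerates mismatched columns.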
Use do.call to call a function (bind_rows) on a list (of single-row dataframes returned by lapplying the above helper function on row indices 1:5):
library(dplyr)
do.call(dplyr::bind_rows,
        lapply(1:5, \(r) row_to_named_list(data[r, ]))
)
output:
##   apple pear banana ananas melon cherry
## 1     2    2      0     NA    NA     NA
## 2    NA    1      0      2    NA     NA
## 3     2    0      2     NA    NA     NA
## 4    NA    0      2     NA     1     NA
## 5    NA    1      1     NA    NA      0
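If adding a dependency is not desired, roughly the same table can be sketched in base R with matrix indexing. This is only one possible alternative, not part of the approach above. It assumes the columns are named words1..words3 and value1..value3 as in the example, and that every word occurs in allwords (a word missing from allwords would produce an NA index and an error). The names words, values, idx and result are illustrative.

```r
data <- data.frame(
  words1 = c("apple", "pear", "banana", "pear", "banana"),
  words2 = c("pear", "banana", "pear", "banana", "cherry"),
  words3 = c("banana", "ananas", "apple", "melon", "pear"),
  value1 = c(2, 1, 2, 0, 1),
  value2 = c(2, 0, 0, 2, 0),
  value3 = c(0, 2, 2, 1, 1),
  stringsAsFactors = FALSE
)
allwords <- c("apple", "pear", "banana", "ananas", "melon", "cherry")

# Split the data into a character matrix of words and a numeric matrix of values.
words  <- as.matrix(data[paste0("words", 1:3)])
values <- as.matrix(data[paste0("value", 1:3)])

# Start from an all-NA matrix: one row per participant, one column per word.
result <- matrix(NA_real_, nrow = nrow(data), ncol = length(allwords),
                 dimnames = list(NULL, allwords))

# Two-column index matrix: participant row, and column of the word written.
idx <- cbind(as.vector(row(words)), match(as.vector(words), allwords))
result[idx] <- as.vector(values)

result <- as.data.frame(result)
print(result)
```

Unlike the bind_rows version, the column order here is fixed by allwords and does not depend on the order in which words first appear.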
edit
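Below is an adapted version of your function which works as expected. The major glitches were that the function never returned value.var, and that value.var was overwritten in the wrong positions, with value3 overriding previous replacements. Because the last expression in your function was an assignment, the function returned the right-hand side of that assignment, data$value3, which is why every column showed the content of value3.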
value.f <- function(y){
  ## `<-` can't change objects outside the function from within
  ## the function anyway:
  ## values.w[[y]] <- NA
  ## this would create a single-item value.var containing only NA
  ## value.var <- values.w[[y]]
  value.var <- rep(NA, 5) ## instantiate value.var as an NA-vector of desired length
  ## replacement values must have same length as replacement positions:
  value.var[which(data$words1 == y)] <- data$value1[which(data$words1 == y)]
  value.var[which(data$words2 == y)] <- data$value2[which(data$words2 == y)]
  value.var[which(data$words3 == y)] <- data$value3[which(data$words3 == y)]
  ## don't forget to return value.var!
  value.var
}
## values.w <- list() # lapply returns a list anyway
values.w <- lapply(allwords, value.f)
names(values.w) <- c(allwords)
list2DF(values.w) ## make this a dataframe
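The two glitches can also be reproduced in isolation, away from the example data. The following is a minimal sketch. The function and variable names (without_return, with_return, v, pos, x) are made up for illustration.

```r
# Glitch 1: a function returns the value of its last expression, and the
# value of an assignment is its right-hand side, not the modified object.
without_return <- function() {
  x <- rep(NA_real_, 3)
  x[2] <- 10          # last expression: evaluates to 10
}
with_return <- function() {
  x <- rep(NA_real_, 3)
  x[2] <- 10
  x                   # last expression: the modified vector
}
res_without <- without_return()
res_with    <- with_return()
print(res_without)    # 10
print(res_with)       # NA 10 NA

# Glitch 2: when replacing selected positions, subset the replacement
# values with the same positions so that lengths and rows line up.
v   <- c(2, 1, 2, 0, 1)
pos <- c(2, 4)
x   <- rep(NA_real_, 5)
x[pos] <- v[pos]      # takes v[2] and v[4], not the first two items of v
print(x)              # NA 1 NA 0 NA
```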