r stringr

str_extract_all returns a list but I want a vector


Still relatively new to R here. I have a column of tweets, and I'm trying to create a column that contains the retweet handle "RT @blahblah", like this:

Tweets                            Retweetfrom
RT @john I had a good day         RT @john
RT @josh I had a bad day          RT @josh

This is my code:

r$Retweetfrom <- str_extract_all(r$Tweets, "^RT[:space:]+@[:graph:]+")

It's giving me the result alright, but instead of a vector, the new column is a list. When I try to unlist it, it throws me an error:

Error in `$<-.data.frame`(`*tmp*`, "Retweetfrom", value = c("@AlpineITW", "@AllScienceGlobe",  : replacement has 1168 rows, data has 2306

Anyone know how to deal with this? Thanks a lot.


Solution

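  • Assuming there's at most one RT @user in each row of the Tweets column (not a very strong assumption, and with the ^ anchor the pattern can only match once per string anyway), you probably want str_extract, which is vectorised over the strings and returns exactly one value per row, rather than str_extract_all, which returns a list because it allows for multiple matches per row. That is: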

    r$Retweetfrom <- str_extract(r$Tweets, "^RT[:space:]+@[:graph:]+")
    

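    in which case you will get the first mention of RT @user, which is probably the one you want anyway. Rows with no match come back as NA, so the result always has as many elements as the data frame has rows. That is also why unlist failed: the list has an empty character(0) element for every tweet that is not a retweet, and unlist drops those, leaving 1168 values for 2306 rows.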
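
A minimal sketch of the difference between the two functions, assuming stringr is installed. The three-row data frame is made up for illustration (the third tweet is deliberately not a retweet); the pattern is the one from the question.

```r
library(stringr)

# Hypothetical sample data; the last row is not a retweet.
r <- data.frame(
  Tweets = c("RT @john I had a good day",
             "RT @josh I had a bad day",
             "Just a regular tweet"),
  stringsAsFactors = FALSE
)

pattern <- "^RT[:space:]+@[:graph:]+"

# str_extract_all returns a list with one element per row;
# rows without a match get an empty character vector.
all_matches <- str_extract_all(r$Tweets, pattern)
length(all_matches)          # 3, one list element per row
length(unlist(all_matches))  # 2, the empty element is dropped

# str_extract returns a character vector with NA for rows without a match,
# so it can be assigned directly as a column.
r$Retweetfrom <- str_extract(r$Tweets, pattern)
is.na(r$Retweetfrom)         # FALSE FALSE TRUE
```

If you would rather have an empty string than NA for tweets that are not retweets, you could replace the missing values afterwards, for example with r$Retweetfrom[is.na(r$Retweetfrom)] <- "".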