r rtweet

Converting author_id to author_username with R


There is a "sourcetweet_author_id" column in my dataset (around 30,000 tweets) that holds the Twitter ids of quoted and retweeted users. I want to convert each Twitter id to the corresponding Twitter user name.

I managed to gather the user names for "sourcetweet_author_id" with the rtweet package's lookup_users function:

    data.with.usernames <- lookup_users(as_userid(mydata$sourcetweet_author_id))

Sample output:

"user_id" "status_id" "created_at" "screen_name"
"99564663" "1521494990890876929" 2022-05-03 14:20:48 "LeventUzumcu"
"4274638635" "1521110034515701760" 2022-05-02 12:51:07 "SalihaSnmezate1"
"1266093027254325250" "1300887103874707457" 2020-09-01 20:03:49 "arjin3426"
"1494034783" "1521523729599107073" 2022-05-03 16:15:00 "DikenComTr"

However, this function returned only one row per unique user. That is expected, because my dataset includes many retweets of the same tweet.

Now I need a way to match each sourcetweet_author_id with its user name, and to use it to convert all the ids in the "sourcetweet_author_id" column to user names in a new column.

Sample rows from my original dataset:

"sourcetweet_author_id" "created_at" "retweet_count" "like_count"
"99564663" "2020-07-23T14:00:39.000Z" 8031 0
"99564663" "2020-07-23T14:00:35.000Z" 7153 0
"1266093027254325250" "2020-07-23T14:00:29.000Z" 7153 0
"1266093027254325250" "2020-07-23T14:00:29.000Z" 6596 0

Solution

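  • A left join on the id should add the screen_name column to original_dataset while keeping every row, including rows that repeat the same id:

    library(dplyr)

    # Rename user_id so the join key has the same name in both tables,
    # and name the key explicitly instead of relying on dplyr's guess.
    original_dataset <- original_dataset %>%
      left_join(
        select(data.with.usernames, sourcetweet_author_id = user_id, screen_name),
        by = "sourcetweet_author_id"
      )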
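
  • If you would rather not depend on dplyr, the same lookup can be sketched in base R with match(). This is a minimal sketch: the two data frames below are toy stand-ins built from the sample rows in the question, not the real data, and the ids are kept as character strings because values such as 1266093027254325250 are too large to be stored exactly as doubles.

```r
# Toy stand-ins for the two tables in the question (values copied from the samples).
data.with.usernames <- data.frame(
  user_id     = c("99564663", "4274638635", "1266093027254325250", "1494034783"),
  screen_name = c("LeventUzumcu", "SalihaSnmezate1", "arjin3426", "DikenComTr"),
  stringsAsFactors = FALSE
)
original_dataset <- data.frame(
  sourcetweet_author_id = c("99564663", "99564663",
                            "1266093027254325250", "1266093027254325250"),
  retweet_count = c(8031, 7153, 7153, 6596),
  stringsAsFactors = FALSE
)

# For each id in the original data, match() gives the row position of that id
# in the lookup table (NA when the id is not found there).
idx <- match(original_dataset$sourcetweet_author_id, data.with.usernames$user_id)

# Index the user names by those positions to fill the new column row by row.
original_dataset$screen_name <- data.with.usernames$screen_name[idx]

print(original_dataset)
```

    Ids that are missing from the lookup table end up with NA in screen_name, which is the same behaviour as the left join.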