r twitter tweets tidytext

Extract the different hashtags ("#") from text stored in a data frame in R


I have a data frame with some tweets, and I want to extract the hashtags from them using the unnest_tokens() function of the tidytext package, creating a tokenized data frame with one row per hashtag.

My data has only 3 columns:

  1. Fecha: the dates of the tweets, stored as POSIXct.
  2. Usuario: the user ID of each tweet, stored as numeric.
  3. Texto: the text of the tweets, stored as character.


otros_numerales_numeral_petro <- Numeral_Petro_sin_emojis %>%
  unnest_tokens(output = "hashtag", input = "Texto", token = "tweets") %>%
  filter(str_starts(hashtag, "#"))

But when I run the code, I get this error:

Error: ! Support for token = "tweets" was deprecated in tidytext 0.4.0 and is now defunct.

Can someone help me fix this, please?


Solution

  • Yep, the token = "tweets" option was deprecated at the end of last year because of changes in upstream dependencies. It sounds like you don't really want to tokenize the text, but rather extract all the hashtags. I would do this:

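    library(tidyverse)
    library(rtweet)
    # Example data: recent English-language tweets that mention #rabbits
    bunny_tweets <- 
      search_tweets("#rabbits", n = 20, include_rts = FALSE) %>%
      filter(!possibly_sensitive, lang == "en")
    
    bunny_tweets %>%
      # Collect every "#" followed by non-whitespace into a list-column
      mutate(hashtags = str_extract_all(full_text, "#\\S+")) %>%
      # Expand the list-column to one row per hashtag
      unnest(hashtags) %>%
      select(id, hashtags, full_text)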
    #> # A tibble: 142 × 3
    #>         id hashtags          full_text                                          
    #>      <dbl> <chr>             <chr>                                              
    #>  1 1.64e18 #Animate          "This awesome comic deserves more attention!\n \n#…
    #>  2 1.64e18 #Doujinshi        "This awesome comic deserves more attention!\n \n#…
    #>  3 1.64e18 #rabbits          "This awesome comic deserves more attention!\n \n#…
    #>  4 1.64e18 #april            "New baby bunny spotted! #april #rabbits\nBlack ba…
    #>  5 1.64e18 #rabbits          "New baby bunny spotted! #april #rabbits\nBlack ba…
    #>  6 1.64e18 #LFDIE            "Trust me! You'll get addicted to this story!\n \n…
    #>  7 1.64e18 #rabbits          "Trust me! You'll get addicted to this story!\n \n…
    #>  8 1.64e18 #huacheng         "Trust me! You'll get addicted to this story!\n \n…
    #>  9 1.64e18 #digitalanimation "I've been completely addicted to ONEPIECE and Mar…
    #> 10 1.64e18 #rabbits          "I've been completely addicted to ONEPIECE and Mar…
    #> # … with 132 more rows
    

    Created on 2023-04-01 with reprex v2.0.2
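
    The same idea applied to a data frame shaped like the one in the question might look like the sketch below. The `tweets` tibble and its rows are made up for illustration (a hypothetical stand-in for Numeral_Petro_sin_emojis with the same three columns), and the pattern "#\\w+" is my assumption rather than something from the answer above: it stops at punctuation, whereas "#\\S+" would keep a trailing comma or period.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Hypothetical stand-in for Numeral_Petro_sin_emojis (made-up rows, same columns)
tweets <- tibble(
  Fecha   = as.POSIXct(c("2023-03-01 10:00:00",
                         "2023-03-01 11:30:00",
                         "2023-03-02 09:15:00"), tz = "UTC"),
  Usuario = c(101, 102, 103),
  Texto   = c("Hoy es un gran dia #Petro #Colombia",
              "Sin numerales aqui",
              "Otra vez #Petro, claro.")
)

hashtags <- tweets %>%
  # One character vector of hashtags per tweet (empty if there are none)
  mutate(hashtag = str_extract_all(Texto, "#\\w+")) %>%
  # One row per hashtag; tweets without hashtags are dropped by default
  unnest(hashtag) %>%
  select(Fecha, Usuario, hashtag)

# Frequency of each hashtag, most common first
hashtag_counts <- hashtags %>%
  count(hashtag, sort = TRUE)
```

    If you need to keep tweets that have no hashtag at all, unnest(hashtag, keep_empty = TRUE) should retain them with NA in the hashtag column.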