
Sentence to Word Table with R


I have some sentences and want to split each one into its words, so that every sentence becomes one row of words. With my current code, the words of the shorter sentences are repeated until their rows are as long as the row for the longest sentence, which I do not want. No matter how long the longest sentence is, each row should contain that sentence's words only once.

sentence <- c("case sweden", "meeting minutes ht board meeting st march now also attachment added agenda today s board meeting", "draft meeting minutes board meeting final meeting minutes ht board meeting rd april")
sentence <- cbind(sentence)
word_table <- do.call(rbind, strsplit(as.character(sentence), " "))
test <- cbind(sentence, word_table)

This is what I get now: (image not included)

And this is what I want: (image not included)

That is, each row should list the sentence's words once, with no repetition.
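
For context, the repetition appears to come from `rbind`, which recycles shorter vectors to the length of the longest one. A minimal sketch with made-up toy input (not the sentences above):

```r
# Hypothetical toy input, chosen only to make the recycling visible
parts <- strsplit(c("a b", "c d e f"), " ")

# rbind() recycles the 2-word vector to fill all 4 columns
m <- do.call(rbind, parts)
print(m)
```

The first row comes out as "a" "b" "a" "b" rather than "a" "b" followed by empty cells, which is the behaviour described in the question.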


Solution

  • The solution from rawr:

    sentence <- c("case sweden", "meeting minutes ht board meeting st march now also attachment added agenda today s board meeting", "draft meeting minutes board meeting final meeting minutes ht board meeting rd april")
    dd <- read.table(text = paste(sentence, collapse = '\n'), fill = TRUE)
    test <- cbind(sentence, dd)
    

    Or, removing any embedded newlines from the sentences first:

    cc <- read.table(text = paste(gsub('\n', '', sentence), collapse = '\n'), fill = TRUE)
    test1 <- cbind(sentence, cc)
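
An alternative that avoids `read.table` is to pad each word vector with `NA` before binding. This may be useful because `read.table` infers the number of columns from the first five lines of input, so a longest sentence appearing later in a larger data set might not be handled as expected. The following is only a sketch: the shortened sentences and the names `words`, `n_max`, and `padded` are assumptions made for illustration.

```r
# Toy sentences, shortened from the question's data for readability
sentence <- c("case sweden",
              "meeting minutes ht board meeting",
              "draft meeting minutes")

# Split each sentence on single spaces
words <- strsplit(sentence, " ", fixed = TRUE)

# Pad every word vector with NA up to the length of the longest sentence
n_max <- max(lengths(words))
padded <- lapply(words, function(w) c(w, rep(NA_character_, n_max - length(w))))

# One row per sentence; the V1, V2, ... column names are an arbitrary choice
word_table <- do.call(rbind, padded)
colnames(word_table) <- paste0("V", seq_len(n_max))

test <- data.frame(sentence, word_table, stringsAsFactors = FALSE)
print(test)
```

Because every vector has the same length before `rbind` is called, nothing is recycled, and the unused cells show up as `NA` instead of repeated words.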
    

    Thanks.