python scikit-learn nltk tf-idf tfidfvectorizer

What is the difference between TfidfVectorizer and TfidfTransformer?


I know that the formula used by TfidfVectorizer is

(count of word / total word count) * log(number of documents / number of documents where the word is present)

I saw that scikit-learn also has a TfidfTransformer, and I wanted to know the difference between the two. I couldn't find anything helpful.


Solution

  • TfidfVectorizer is used on raw documents (it tokenizes, counts, and applies the TF-IDF weighting in one step), while TfidfTransformer is used on an existing count matrix, such as one returned by CountVectorizer.
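
To make the relationship concrete, here is a minimal sketch, assuming default parameters for all three classes and a made-up three-document corpus (the `docs` list is purely illustrative). It builds the TF-IDF matrix both ways and checks that the results agree:

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical toy corpus, used only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# One step: raw documents -> TF-IDF matrix.
direct = TfidfVectorizer().fit_transform(docs)

# Two steps: raw documents -> count matrix -> TF-IDF matrix.
counts = CountVectorizer().fit_transform(docs)
two_step = TfidfTransformer().fit_transform(counts)

# With default settings both routes should give the same weights.
same = np.allclose(direct.toarray(), two_step.toarray())
print(same)
```

The two-step route is handy when you already have a count matrix or want to reuse the counts for something else. Also note that scikit-learn's default weighting is not quite the textbook formula: by default it uses raw term counts, a smoothed IDF of ln((1 + n) / (1 + df)) + 1, and L2 normalization of each row, so hand-computed values may not match unless you account for those settings.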