To apply an ML algorithm to text, the text has to be represented numerically. Some ways to do this using sklearn are:
CountVectorizer
CountVectorizer + TfidfTransformer
TfidfVectorizer
What is the difference between CountVectorizer+TfidfTransformer and TfidfVectorizer?
None. Given the same parameters, the two produce the same result; see the top of the documentation page:
sklearn.feature_extraction.text.TfidfVectorizer
...
Equivalent to CountVectorizer followed by TfidfTransformer.
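
If you want to check this yourself, here is a minimal sketch that runs both routes with default parameters and compares the resulting matrices. The toy corpus `docs` and the variable names are made up for illustration; they are not from the documentation.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Two-step route: raw term counts first, then tf-idf weighting.
counts = CountVectorizer().fit_transform(docs)
two_step = TfidfTransformer().fit_transform(counts)

# One-step route: TfidfVectorizer tokenizes, counts and weights in one go.
one_step = TfidfVectorizer().fit_transform(docs)

# Compare the two sparse matrices as dense arrays (fine for a tiny corpus).
same = np.allclose(two_step.toarray(), one_step.toarray())
print(same)
```

The two-step route is mainly useful when you also need the raw counts, since `counts` stays available; otherwise TfidfVectorizer is the shorter way to write the same thing.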