I am trying to pull keywords out of a processed log file and build a vectorized dataframe of those keywords. But when I write that dataframe to CSV, the words end up as the column headers, with their respective values in the second row.
What I want instead is one word per row, with its value in the second column.
trial.py:
import re
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS
def removeNumbers(list):
    # doing something

def processFiles(filename):
    # doing something

def readFile(fileName):
    # doing something

# Build our text
text = readFile("processedFile.txt")
vectorizer = CountVectorizer()
matrix = vectorizer.fit_transform([text])
counts = pd.DataFrame(matrix.toarray(),
keywords_count.csv looks like this:
Transpose your dataframe:
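A minimal sketch of what that could look like, assuming `counts` is a one-row frame whose columns are the keywords. The frame below is a hand-made stand-in with invented values, not real output from your file, and the `keyword` / `count` labels are hypothetical names chosen for illustration:

```python
import pandas as pd

# Stand-in for the one-row frame built from the CountVectorizer output:
# each column is a keyword, the single row holds its count (made-up values).
counts = pd.DataFrame([[2, 3, 1]], columns=["disk", "error", "timeout"])

# .T swaps rows and columns, giving one row per keyword.
transposed = counts.T
transposed.columns = ["count"]      # hypothetical column label
transposed.index.name = "keyword"   # hypothetical index label

# With no path argument, to_csv returns the CSV text; pass
# "keywords_count.csv" instead to write the file.
csv_text = transposed.to_csv()
print(csv_text)
```

In your script this would probably come down to calling `counts.T.to_csv("keywords_count.csv")` in place of `counts.to_csv(...)`. Renaming the column and index is optional and only affects the header line.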