I'm trying to read a .docx file into Google Colab because my main computer, which has Anaconda installed, is away for maintenance. I'm trying to use the python-docx module, but to my knowledge I can't just pip install python-docx in Google Colab.
'''
import docx

def getText(filename):
    # Read every paragraph of the document and join them with newlines
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)

docxString = getText("week_8_document1.docx")
'''
Any ideas?
Try the following; hope it works:
# Install python-docx
!pip install python-docx  # <-- Yes, you can install it directly in Colab
# Import the tools
import docx
from google.colab import files
uploaded = files.upload()  # <-- Select the file you want to upload
file_name = next(iter(uploaded))  # <-- Name of the first uploaded file (upload() returns a dict keyed by filename)
doc = docx.Document(file_name)
Once the document is loaded, you can access its text through doc.paragraphs, doc.tables, and so on. Good luck, boss.
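For what it's worth, here is a rough sketch of pulling the text out of both the paragraphs and the tables. The helper names doc_to_text and read_docx are made up for this example and are not part of python-docx; the sketch only relies on the paragraphs, tables, rows, cells and text attributes. Note that it lists all paragraph text first and table text afterwards, so it does not preserve the order in which they appear in the document.

```python
def doc_to_text(doc):
    """Collect paragraph text, then table cell text, from a loaded document.

    `doc` is expected to be a python-docx Document (hypothetical helper,
    not part of the library itself).
    """
    # Body paragraphs, one per line
    lines = [para.text for para in doc.paragraphs]

    # Tables are stored separately from doc.paragraphs, so walk them too
    for table in doc.tables:
        for row in table.rows:
            # Join the cells of one row with tabs so the row stays on one line
            lines.append("\t".join(cell.text for cell in row.cells))

    return "\n".join(lines)


def read_docx(path):
    """Load a .docx file from `path` and return its text (hypothetical helper)."""
    # Imported here so doc_to_text can be defined before the package is installed
    import docx  # provided by `pip install python-docx`
    return doc_to_text(docx.Document(path))
```

In Colab, after the upload step, something like docxString = read_docx("week_8_document1.docx") should give you the same kind of string as your getText function, plus any table contents.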