I get this error when I try to decompress a Wikipedia dump to use its .xml file. How can I solve it?
import bz2

filepath = '/Data/nlp/ESA/Wiki-ESA-master'
file_name = 'enwiki-latest-pages-articles.xml.bz2'
zipfile = bz2.BZ2File(file_name)  # open the compressed file
DEFAULT_FILENAME = zipfile.read()  # read all of the decompressed data into memory
error:
EOFError: compressed file ended before the logical end-of-stream was detected
As the error says, the download most likely ended prematurely and you have a truncated file. Try downloading it again.
Another possible cause is corrupted data on your disk. Downloading again may help with this too.
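To check whether the file you have is complete before (or after) downloading again, you can try decompressing it to the end and see if that works. Below is a minimal sketch, not a definitive implementation. The helper name is_complete_bz2 and the 1 MiB chunk size are my own choices for illustration. It reads the stream in chunks with the standard bz2.open, so the dump never has to fit in memory.

```python
import bz2


def is_complete_bz2(path, chunk_size=1024 * 1024):
    """Return True if the bz2 file at `path` decompresses cleanly to its end.

    Hypothetical helper for illustration; the name and chunk size are assumptions.
    """
    try:
        with bz2.open(path, "rb") as f:
            # Read chunk by chunk and discard the data; we only care
            # whether the end-of-stream marker is eventually reached.
            while f.read(chunk_size):
                pass
    except EOFError:
        # The compressed file ended early, i.e. it is truncated.
        return False
    except OSError:
        # Invalid or corrupted compressed data (also raised for a missing file).
        return False
    return True
```

If is_complete_bz2('enwiki-latest-pages-articles.xml.bz2') returns False, the file is truncated or damaged and needs to be downloaded again. The same chunked loop is also a safer way to process the dump than a single read() call, since the decompressed XML is very large.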