[SOLVED] Python. How to convert MOBI file to a text (or EPUB file)

I'm having trouble converting a MOBI file to text in Python.

I found the mobi library (https://github.com/iscc/mobi), which should convert MOBI to EPUB, and the ebooklib library, which works very well for converting EPUB files to text.

The thing is that only ebooklib seems to work properly. If I give it a native EPUB file, everything works correctly. But if I pass it the filepath returned by the mobi library, I get a bunch of errors that don't make much sense.

I don't know what exactly is causing this. Maybe my MOBI files are encrypted somehow? (They are original books that I bought from Humble Bundle several months ago.) But the mobi library doesn't raise any error about that.

Or maybe I can't pass the filepath generated by the mobi library as it is? Maybe I need to save this file or move it to another folder before ebooklib can read it?

My code looks like this:

import mobi

import ebooklib
from ebooklib import epub

tempdir, filepath = mobi.extract("book.mobi")

# This throws error:
book = epub.read_epub(filepath)

# Native, normal epub file is working ok:
book = epub.read_epub("book.epub")

The error doesn't tell me much:

Traceback (most recent call last):
  File "/ebooklib/utils.py", line 35, in parse_string
    tree = etree.parse(io.BytesIO(s.encode('utf-8')))
AttributeError: 'bytes' object has no attribute 'encode'

Solution

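You can save it as an HTML file.

pip install mobi

then, in a Jupyter/IPython notebook:

filepath = "./example.mobi"
folder = "./"

# The "!" prefix runs a shell command; {name} inserts the Python variable's value
!mobiunpack -r {filepath} {folder}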

List of all options available here
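
The `!` prefix only works in a notebook. Here is a minimal sketch of the same call from a plain Python script. It assumes that `pip install mobi` put the `mobiunpack` command on your PATH; the helper names `build_mobiunpack_command` and `unpack` are made up for illustration and are not part of the mobi library.

```python
import subprocess


def build_mobiunpack_command(mobi_path, out_dir):
    """Return the argument list equivalent to `mobiunpack -r <file> <folder>`."""
    return ["mobiunpack", "-r", str(mobi_path), str(out_dir)]


def unpack(mobi_path, out_dir):
    """Run mobiunpack (assumed to be on PATH) on one file."""
    # check=True raises CalledProcessError if the tool exits with an error code
    subprocess.run(build_mobiunpack_command(mobi_path, out_dir), check=True)


# Example call (needs a real file and the mobiunpack command):
# unpack("./example.mobi", "./")
```

Passing the arguments as a list avoids quoting problems when the file name contains spaces.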

Alternatively, here is another method:

pip install mobi
pip install html2text

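import mobi
import html2text

filename = "test.mobi"
# extract() unpacks the book into a temporary folder and returns
# that folder plus the path to the extracted file
tempdir, filepath = mobi.extract(filename)
# Assumes the extracted file is UTF-8 HTML; undecodable bytes are replaced
with open(filepath, "r", encoding="utf-8", errors="replace") as file:
    content = file.read()
print(html2text.html2text(content))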
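
As far as I can tell from the mobi README, the path returned by `mobi.extract` can point to an EPUB, HTML, or PDF file depending on the book, and you have to delete the temporary folder yourself. So it may help to check the extension before picking a reader, and to copy the file somewhere permanent before cleaning up. A rough sketch follows; `output_kind` and `keep_and_cleanup` are hypothetical helpers written for this answer, not functions from the library.

```python
import shutil
from pathlib import Path


def output_kind(filepath):
    """Classify the extracted file by its extension."""
    suffix = Path(filepath).suffix.lower()
    if suffix == ".epub":
        return "epub"
    if suffix in (".html", ".htm"):
        return "html"
    if suffix == ".pdf":
        return "pdf"
    return "unknown"


def keep_and_cleanup(tempdir, filepath, dest_dir):
    """Copy the extracted file to dest_dir, then delete the temp folder."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = dest_dir / Path(filepath).name
    shutil.copy2(filepath, saved)
    # Remove the whole temporary folder created by the extraction
    shutil.rmtree(tempdir, ignore_errors=True)
    return saved


# Example use together with mobi (needs a real MOBI file):
# import mobi
# tempdir, filepath = mobi.extract("test.mobi")
# print(output_kind(filepath))  # decide between ebooklib and html2text
# saved = keep_and_cleanup(tempdir, filepath, "./converted")
```

If `output_kind` reports "epub", pass the file to ebooklib; if it reports "html", the html2text approach above applies.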