I am using the following shell command to convert from EPUB to Markdown:
pandoc "input.epub" -t gfm-raw_html -o output.md
It works well, except that the EPUB's internal links remain, like:
[About the Author](#part0000_split_000.html_x9780698161863_EPUB)
Is there a pandoc option to remove those?
The documentation provides example code to do precisely this:
What if we want to remove every link from a document, retaining the link’s text?
#!/usr/bin/env runhaskell
-- delink.hs
import Text.Pandoc.JSON

main = toJSONFilter delink

delink :: Inline -> [Inline]
delink (Link _ txt _) = txt
delink x = [x]
which can be used like:
pandoc "input.epub" -F delink.hs -t gfm-raw_html -o output.md
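Note that the filter above strips every link, including ordinary web links. The links in the question all point at internal anchors (their targets start with `#`), so a narrower variant may be preferable. The following is only a sketch, not something taken from the documentation: the file name `delink-internal.hs` and the function name `delinkInternal` are made up for illustration, and it assumes pandoc-types 1.20 or later (pandoc 2.8+), where link targets are `Text` rather than `String`.

```haskell
#!/usr/bin/env runhaskell
-- delink-internal.hs (hypothetical file name)
-- Assumes pandoc-types >= 1.20, where a link target is a (Text, Text) pair.
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter delinkInternal

-- Replace a link by its text only when the target is an internal
-- anchor (starts with '#'); leave every other inline untouched.
delinkInternal :: Inline -> [Inline]
delinkInternal (Link _ txt (url, _))
  | "#" `T.isPrefixOf` url = txt
delinkInternal x = [x]
```

It would be used the same way as the original filter, by passing `-F delink-internal.hs` in place of `-F delink.hs`. External links such as `[site](https://example.com)` should then survive the conversion.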