I am currently playing around with Hakyll and Pandoc.
I want to create a static HTML website from Markdown sources including inline maths in LaTeX. Using pandoc-katex I was able to do the conversion with the following command:
$ pandoc -f markdown -t html --filter pandoc-katex --css "https://cdn.jsdelivr.net/npm/katex@$(pandoc-katex --katex-version)/dist/katex.min.css" --css "https://pandoc.org/demo/pandoc.css" --standalone -o output.html input.md
However, I want to use the pandoc-katex filter in Hakyll and obtain the exact same result as with the command above (for now), i.e. I want to use Pandoc's standard HTML template, make it load the two CSS files and process any available metadata in input.md in exactly the same way as the command above does.
I exported the standard HTML template as follows:
$ pandoc -D html > default-template.html
Using pandocCompilerWithTransformM, I was able to use the pandoc-katex filter:
katexCompiler = pandocCompilerWithTransformM defaultHakyllReaderOptions defaultHakyllWriterOptions katexFilter
  where katexFilter = recompilingUnsafeCompiler
                    . runIOorExplode
                    . applyFilters noEngine def [JSONFilter "pandoc-katex"] []
Using this compiler in Hakyll, I only get the body part of the HTML file though. I searched online for solutions to this, but all the information that I find seems to refer to deprecated versions of Pandoc. Apparently there was a writerStandalone option in earlier versions of Pandoc, but it does not exist anymore (even though the command line tool still has optStandalone and the --standalone parameter used above evidently works).
What I currently do is apply the default template with loadAndApplyTemplate "templates/default-template.html" myCtx and then try to manually replicate the default context in myCtx. This is obviously not how it should be done.
Here is a somewhat minimal example of my attempt (sorry that it's still a bit lengthy - exactly that is the problem):
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc
import Text.Pandoc.Filter
import Text.Pandoc.Scripting
import Hakyll

css1Item = Item (fromFilePath "css/katex.min.css") "https://cdn.jsdelivr.net/npm/katex@0.16.4/dist/katex.min.css"
css2Item = Item (fromFilePath "css/pandoc.css") "https://pandoc.org/demo/pandoc.css"
authorItem = Item (fromFilePath "general") "Jon Doe"
stylesString = "/* 15 lines of CSS */"

myCtx :: Context String
myCtx = dateField "date" "%B %e, %Y"
     <> constField "pagetitle" "My Title"
     <> constField "styles.html" stylesString
     <> listCtx "author" [authorItem]
     <> listCtx "author-meta" [authorItem]
     <> listCtx "css" [css1Item, css2Item]
     <> listCtx "header-includes" []
     <> listCtx "include-before" []
     <> listCtx "include-after" []
     <> defaultContext

listCtx :: String -> [Item String] -> Context String
listCtx name lst = listField name ctx (return $ lst)
  where ctx = field name (return . itemBody)

katexCompiler = pandocCompilerWithTransformM defaultHakyllReaderOptions defaultHakyllWriterOptions katexFilter
  where katexFilter = recompilingUnsafeCompiler
                    . runIOorExplode
                    . applyFilters noEngine def [JSONFilter "pandoc-katex"] []

main :: IO ()
main = hakyll $ do
    match "templates/default-template.html" $ compile templateBodyCompiler
    match "input.md" $ do
        route $ setExtension ".html"
        compile $ katexCompiler
            >>= loadAndApplyTemplate "templates/default-template.html" myCtx
I have two concrete questions:
1. The Item data type associates keys of type Identifier with values. The constructors for Identifier suggest that the Identifiers should be file names, but for some of the Items in my context (e.g. for the author field; see variable authorItem), having a file name as a key does not make sense. I think I misinterpreted the purpose of this type. How should I think of these Items?
2. Is there a way to obtain the Context that the command line tool uses, when making the conversion? The default Context seems to be a lot more involved than my quick draft, e.g. it reads the abstract from the metadata of the Markdown file and puts every paragraph in between separate <p> ... </p> HTML tags. I know there is a metadataField :: Context a, but it does not seem to be what I want.
Apart from these concrete questions, the general question is:
The nicest way to do that is probably using writerTemplate in Pandoc's WriterOptions to pass the default template, as given by compileDefaultTemplate:
main :: IO ()
main = do
    pandocTmpl <- runIOorExplode $ compileDefaultTemplate "html"
    let katexOpts = defaultHakyllWriterOptions
            { writerTemplate = Just pandocTmpl
            , writerHTMLMathMethod = KaTeX ""
            -- And whatever else you need.
            }
        -- Defining it this way because pandocCompilerWith strips
        -- the metadata block before handing the body to Pandoc.
        --
        -- I'm relying on Pandoc's built-in KaTeX support. If
        -- you'd rather stick with the pandoc-katex filter, you
        -- can use renderPandocWithTransformM to reshape the
        -- compiler you defined in the question in this fashion.
        katexCompiler = do
            fullItem <- getResourceString
            renderPandocWith defaultHakyllReaderOptions katexOpts fullItem
    hakyll $ do
        -- etc.
        match "input.md" $ do
            route $ setExtension ".html"
            compile katexCompiler
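For reference, here is a rough sketch of what the renderPandocWithTransformM variant mentioned in the comment might look like. It reuses the filter pipeline from the question unchanged; the name katexFilterCompiler is made up for this illustration, and I have not checked it against every Pandoc/Hakyll version combination, so treat it as a starting point rather than a tested recipe.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import Text.Pandoc
import Text.Pandoc.Filter
import Text.Pandoc.Scripting

-- Hypothetical helper: reads the whole source file (metadata block
-- included), runs the pandoc-katex JSON filter over the parsed
-- document, and renders it with the given writer options.
katexFilterCompiler :: WriterOptions -> Compiler (Item String)
katexFilterCompiler opts =
    getResourceString
        >>= renderPandocWithTransformM defaultHakyllReaderOptions opts katexFilter
  where
    -- Same filter pipeline as in the question.
    katexFilter = recompilingUnsafeCompiler
                . runIOorExplode
                . applyFilters noEngine def [JSONFilter "pandoc-katex"] []

main :: IO ()
main = do
    pandocTmpl <- runIOorExplode $ compileDefaultTemplate "html"
    let opts = defaultHakyllWriterOptions { writerTemplate = Just pandocTmpl }
    hakyll $
        match "input.md" $ do
            route $ setExtension ".html"
            compile $ katexFilterCompiler opts
```

Taking the WriterOptions as an argument keeps Pandoc's Template type out of the signature, which sidesteps the name clash with Hakyll's own Template when both modules are imported unqualified.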
See also pandoc issue #10209, which points to a similar approach.
Side questions:
How should I think of these Items?
Item indeed is primarily meant for things bound to a file path in your site tree. Occasionally, it makes sense to use a fake path for the identifier, for instance when synthesising some content with a create rule. However, that's not typically something one would want to do for the sake of setting a context field, as there likely are more straightforward ways to do that. (In particular, if, unlike in this answer, you are using Hakyll's templates, you don't have to explicitly define the fields that you include in the metadata headers of your source files, as Hakyll's defaultContext covers that already by including metadataField.)
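As one example of a more straightforward way, if you do stay with Hakyll's templates, list-valued fields can be built with makeItem, which attaches the identifier of the page currently being compiled instead of an invented path. A minimal sketch, where stringList is a hypothetical helper name and the values are the ones from the question:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- Hypothetical helper: a list field whose entries expose their own
-- value under the same name, so that $for(css)$ ... $css$ ... $endfor$
-- works in a Hakyll template.
stringList :: String -> [String] -> Context a
stringList name values =
    listField name (field name (return . itemBody)) (mapM makeItem values)

myCtx :: Context String
myCtx = stringList "author" ["Jon Doe"]
     <> stringList "css"
            [ "https://cdn.jsdelivr.net/npm/katex@0.16.4/dist/katex.min.css"
            , "https://pandoc.org/demo/pandoc.css"
            ]
     <> defaultContext
```

This does the same job as listCtx in the question, minus the made-up file paths.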
Is there a way to obtain the Context that the command line tool uses, when making the conversion?
While Pandoc offers ways to manipulate its own metadata (which I have never used myself; Text.Pandoc.Writers.Shared might be a good place to start browsing), the template systems of Pandoc and Hakyll are similar-looking but distinct, and in particular Hakyll's Context type is not the same as its Pandoc counterpart.
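If the goal is only to reproduce the two --css options while rendering through Pandoc's template, one possibility is to set the css metadata field on the parsed document before it is written. The sketch below is untested and rests on the assumption that Pandoc's default HTML template picks up css from document metadata the same way it picks up the command-line variable; addCss and cssCompiler are names invented for this example, and it needs the containers and text packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map as Map
import Data.Text (Text)
import Hakyll
import Text.Pandoc

-- Stylesheet URLs taken from the question.
cssUrls :: [Text]
cssUrls =
    [ "https://cdn.jsdelivr.net/npm/katex@0.16.4/dist/katex.min.css"
    , "https://pandoc.org/demo/pandoc.css"
    ]

-- Hypothetical transform: overwrite the "css" metadata field with the
-- list above (any css entry in the source file's metadata is replaced).
addCss :: Pandoc -> Pandoc
addCss (Pandoc (Meta m) blocks) =
    Pandoc (Meta (Map.insert "css" (MetaList (map MetaString cssUrls)) m)) blocks

-- Assumes the writer options already carry Pandoc's template in
-- writerTemplate.
cssCompiler :: WriterOptions -> Compiler (Item String)
cssCompiler opts =
    getResourceString
        >>= renderPandocWithTransform defaultHakyllReaderOptions opts addCss
```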
On a final note, it is worth mentioning that if you were completely stuck trying to reproduce Pandoc's output within Hakyll, a last resort would be using unixFilter to set up a compiler that shells out to command-line Pandoc.
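A sketch of that last-resort route, assuming pandoc and pandoc-katex are on the PATH when the site is built. The name pandocCliCompiler is made up, and the KaTeX version is hard-coded to the one from the question because the $(...) shell substitution is not available here:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- Hypothetical compiler: pipes the raw source file through the pandoc
-- executable (stdin to stdout) with the flags from the question.
pandocCliCompiler :: Compiler (Item String)
pandocCliCompiler =
    getResourceString >>= withItemBody (unixFilter "pandoc" args)
  where
    args =
        [ "-f", "markdown"
        , "-t", "html"
        , "--filter", "pandoc-katex"
        , "--css", "https://cdn.jsdelivr.net/npm/katex@0.16.4/dist/katex.min.css"
        , "--css", "https://pandoc.org/demo/pandoc.css"
        , "--standalone"
        ]

main :: IO ()
main = hakyll $
    match "input.md" $ do
        route $ setExtension ".html"
        compile pandocCliCompiler
```

The obvious downside is that the build then depends on whichever Pandoc version happens to be installed, rather than the one Hakyll was compiled against.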