page break for pdf and word in rmarkdown


I am trying to develop an R Markdown report for my data analysis that can be knitted to both word_document and pdf_document. Bookdown works really well for captions and automatic numbering (https://bookdown.org/yihui/bookdown/). The only main issue left is how to insert page breaks that work in both formats.

For PDF, I use xelatex from tinytex, and \newpage works great. For Word, I use a level-5 heading (##### page break) as the page break and customize its style (incl. page break and white font).

I could use Edit > Find... and Replace All to switch between the two, but I am still developing the report and need to test frequently that the output looks great in both formats.

Is there any way I could either:

Thanks!

Here is a reproducible example of the R Markdown file:

---
title: "Untitled"
author: "Me"
date: "November 15, 2018"
output:
  pdf_document: default
  word_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
Some text.  

I want a page break after this.

\newpage
##### page break

This should be the first sentence of the new page.

Some more text.

Solution

  • Many thanks to tarleb for the answer. As suggested I used your answer to this post: https://stackoverflow.com/a/52131435/2425163.

    step 1: create a txt file with the following code:

    --- Return a block element causing a page break in the given format.
    local function newpage(format)
      if format == 'docx' then
        local pagebreak = '<w:p><w:r><w:br w:type="page"/></w:r></w:p>'
        return pandoc.RawBlock('openxml', pagebreak)
      elseif format:match 'html.*' then
        return pandoc.RawBlock('html', '<div style="page-break-after: always;"></div>')
      elseif format:match 'latex' then
        return pandoc.RawBlock('tex', '\\newpage{}')
      elseif format:match 'epub' then
        local pagebreak = '<p style="page-break-after: always;"> </p>'
        return pandoc.RawBlock('html', pagebreak)
      else
        -- fall back to insert a form feed character
        return pandoc.Para{pandoc.Str '\f'}
      end
    end
    
    -- Filter function called on each RawBlock element.
    function RawBlock (el)
      -- Check that the block contains \newpage. A stricter check would also
      -- require the block format to be TeX or LaTeX and the text to be only
      -- \newpage or \newpage{}.
      if el.text:match '\\newpage' then
        -- use format-specific pagebreak marker. FORMAT is set by pandoc to
        -- the targeted output format.
        return newpage(FORMAT)
      end
      -- otherwise, leave the block unchanged
      return nil
    end
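
    The comment in the RawBlock function mentions a stricter check than the one actually used. A minimal sketch of that stricter check follows, assuming it replaces the RawBlock function in the same file so that newpage(format) is still in scope. The helper name is_newpage_command is hypothetical and is not part of the original answer. Lua patterns cannot make a group such as (la) optional, so the sketch matches the substring 'tex', which covers both the 'tex' and 'latex' raw formats.

    ```lua
    -- Hypothetical helper: true only when the raw block text is a bare
    -- \newpage or \newpage{} command, ignoring surrounding whitespace.
    local function is_newpage_command(text)
      -- strip leading and trailing whitespace
      local trimmed = text:match '^%s*(.-)%s*$'
      return trimmed == '\\newpage' or trimmed == '\\newpage{}'
    end

    -- Stricter variant of the filter function called on each RawBlock element.
    -- Assumes newpage(format) from the filter above; FORMAT is set by pandoc.
    function RawBlock (el)
      -- the substring 'tex' matches both the 'tex' and 'latex' raw formats
      if el.format:match 'tex' and is_newpage_command(el.text) then
        return newpage(FORMAT)
      end
      -- otherwise, leave the block unchanged
      return nil
    end
    ```

    With this variant, a raw block that merely contains \newpage alongside other LaTeX commands is left untouched instead of being replaced wholesale by a page break.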
    

    step 2: save the file as page-break.lua in the same directory as my R Markdown file.

    step 3: add the filter as a pandoc argument in the YAML header.

    This is the corrected reproducible example (R Markdown file):

    ---
    title: "Untitled"
    author: "Me"
    date: "November 15, 2018"
    output:
      pdf_document: default
      word_document:
        pandoc_args:
         '--lua-filter=page-break.lua'
    ---
    
    ```{r setup, include=FALSE}
    knitr::opts_chunk$set(echo = TRUE)
    ```
    
    Some text.  
    
    I want a page break after this.
    
    \newpage
    
    This should be the first sentence of the new page.
    
    Some more text.
    

    Please note that this may not work for the table of contents. That does not affect me: I don't use the Lua filter for PDF, and with word_document it is very easy to add the table of contents afterwards directly in Word. There is also a link to a solution for that problem in the answer linked above.