python latex r-markdown bookdown

Python code and output in bookdown PDF are not wrapped onto multiple lines


I am writing Python code in R Markdown with bookdown. The code itself runs fine. The problem is that when the book is rendered to PDF, some long lines of Python code, and sometimes their output, run past the edge of the page and are cut off. Please see the image below.

In the image, the code for print ('The total number of rows and columns in the dataset is {} and {} respectively.'.format(iris_df.shape[0],iris_df.shape[1])) is cut off, although its output is fully visible. In another case, for new_col = iris_df.columns.str.replace('\(.*\)','').str.strip().str.upper().str.replace(' ','_'), neither the full line of code nor its output is visible. The sns.scatterplot() line has the same problem.

Is there any way in a bookdown PDF to keep both the code and its output within the page?

Note: I tried splitting the Python code across multiple lines in R Markdown, but it did not work; in most cases the code was not executed when written that way.

[image: pdfoutput1]

Here is the code that I used to generate the output in the image

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import datasets
iris = datasets.load_iris()
iris.keys()
iris_df = pd.DataFrame (data = iris.data, columns = iris.feature_names)
iris_df['target'] = iris.target

iris_df.sample(frac = 0.05)
iris_df.shape
print ('The total number of rows and columns in the dataset is {} and {} respectively.'.format(iris_df.shape[0],iris_df.shape[1]))
iris_df.info()

new_col = iris_df.columns.str.replace('\(.*\)','').str.strip().str.upper().str.replace(' ','_')
          
new_col
iris_df.columns = new_col
iris_df.info()

sns.scatterplot(data = iris_df, x = 'SEPAL_LENGTH', y = 'SEPAL_WIDTH', hue = 'TARGET', palette = 'Set2')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('Scatterplot of Sepal Length and Width for the Target Variable')
plt.show()

Solution

  • I am not sure why splitting the Python code across multiple lines did not work for you, or whether you split it correctly, since you did not give much detail about that.

    From PEP 8 – Style Guide for Python Code:

    The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.

    So if you break your long lines this way, the code should run fine in R Markdown (and in bookdown) too.
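    The continuation rule quoted from PEP 8 can be sketched without any third-party packages. The snippet below is only an illustration: the numbers are placeholders standing in for iris_df.shape, and raw_name is a made-up column name, not data from the question.

```python
# Placeholder numbers stand in for iris_df.shape[0] and iris_df.shape[1].
n_rows, n_cols = 150, 5

# 1. A call's own parentheses already allow line breaks between arguments.
message = 'The dataset has {} rows and {} columns.'.format(
    n_rows,
    n_cols,
)

# 2. A long method chain can span lines once it is wrapped in parentheses.
raw_name = ' sepal length (cm) '  # made-up example value
clean_name = (
    raw_name
    .strip()
    .upper()
    .replace(' ', '_')
)

# 3. Adjacent string literals inside parentheses are joined into one string,
#    which lets a long literal itself be split across lines.
title = (
    'Scatterplot of Sepal Length and Width '
    'for the Target Variable'
)

print(message)
print(clean_name)
print(title)
```

    Each statement is still a single logical line for Python, so a chunk written this way should behave the same as its one-line version.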

    In addition, since your target format is PDF, you can reduce the font size of the source code and its output with LaTeX packages and commands. The LaTeX package fvextra provides options for setting the font size and even for automatically wrapping long code lines.

    With all of that in mind, try the following.

    (Note how I have wrapped all of the long lines inside parentheses.)

    intro.Rmd

    # Hello bookdown 
    
    ```{r setup, include=FALSE}
    library(reticulate)
    # reticulate::py_install(c("scikit-learn","pandas", "matplotlib", "seaborn"))
    use_virtualenv("r-reticulate/")
    ```
    
    
    ```{python}
    import pandas as pd
    import seaborn as sns
    from sklearn import datasets
    import matplotlib.pyplot as plt
    ```
    
    ```{python}
    iris = datasets.load_iris()
    iris_df = pd.DataFrame (data = iris.data, columns = iris.feature_names)
    iris_df['target'] = iris.target
    
    print(
      'The total number of rows and columns in the dataset is {} and {} respectively.'
      .format(iris_df.shape[0], iris_df.shape[1]))
    ```
    
    
    ```{python}
    new_col = (iris_df.columns
                .str
                .replace(r'\(.*\)','')
                .str.strip()
                .str.upper()
                .str.replace(' ','_'))
    new_col
    ```
    
    \newpage
    
    ```{python}
    iris_df.columns = new_col
    sns.scatterplot(
      data = iris_df, 
      x = 'SEPAL_LENGTH_(CM)', 
      y = 'SEPAL_WIDTH_(CM)', 
      hue = 'TARGET', 
      palette = 'Set2')
    plt.xlabel('Sepal Length')
    plt.ylabel('Sepal Width')
    plt.title('Scatterplot of Sepal Length and Width for the Target Variable')
    plt.show()
    
    ```
    

    Then add these lines to your preamble.tex file:

    \usepackage{fvextra}
    \DefineVerbatimEnvironment{Highlighting}{Verbatim}{commandchars=\\\{\},fontsize=\footnotesize}
    
    \makeatletter
    \def\verbatim{\footnotesize\@verbatim \frenchspacing\@vobeyspaces \@xverbatim}
    \makeatother
    

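    If you need a bigger or smaller font than this, try \small or \scriptsize instead of \footnotesize. To have long code lines wrapped automatically, fvextra's breaklines option can be added to the same option list.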

    Then reference preamble.tex under includes: in_header in your _output.yml file:

    bookdown::pdf_book:
      includes:
        in_header: preamble.tex
      latex_engine: xelatex
      citation_package: natbib
      keep_tex: yes
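    If a particular line of printed output is still too wide after these settings, one option on the Python side is to fold the text before printing it. This is not part of the setup above; it is a minimal sketch using the standard-library textwrap module, with placeholder numbers standing in for iris_df.shape and an assumed width of 70 characters that you would tune to your page layout.

```python
import textwrap

# Placeholder values standing in for iris_df.shape[0] and iris_df.shape[1].
n_rows, n_cols = 150, 5

message = (
    'The total number of rows and columns in the dataset is {} and {} respectively.'
    .format(n_rows, n_cols)
)

# Fold the message at word boundaries so no printed line exceeds 70 characters.
# The width of 70 is an assumption, not a value taken from the book's layout.
wrapped = textwrap.fill(message, width=70)
print(wrapped)
```

    This only affects strings you print yourself; output produced by libraries, such as iris_df.info(), is not changed by it.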
    
    

    Rendered PDF output for intro.Rmd: [image: page one] [image: page two]