node.js centos pdf2image

Using pdf2image with Node.js and CentOS


I'm using pdf2image (imported as pdf2pic in the code below) to build a Node.js application that converts PDF files to PNG. As the readme of the official repo says, it requires two external dependencies: Ghostscript and GraphicsMagick. Both are installed on my local Windows machine.

Now I'm able to convert a PDF buffer to a PNG image buffer with this code:

const { fromBuffer } = require('pdf2pic');

// Convert the first page of a PDF buffer into a PNG buffer.
const convertPdfToImg = async (pdfBuffer) => {
    const pdf2picOptions = {
        format: 'png',
        width: 4000,
        height: 5176,
        density: 330,
        savePath: './output',
    };
    const convert = fromBuffer(pdfBuffer, pdf2picOptions);
    // Page 1; passing true asks for the result as base64 (read from pageOutput.base64 below).
    const pageOutput = await convert(1, true);
    const pngBuffer = Buffer.from(pageOutput.base64, 'base64');
    return pngBuffer;
};

Everything works fine and everyone is happy! (For now)

The problem is that I now need to deploy the application to production in a Linux environment (CentOS Stream). I installed the dependencies on the server, making sure to use the same versions as on my local Windows machine (Ghostscript 9.52 and GraphicsMagick 1.3.35 2020-02-23 Q16). However, the code snippet above no longer works there and returns an empty buffer.

After some debugging, I noticed that pageOutput.base64 is empty, which probably means Ghostscript and GraphicsMagick are not installed correctly (when I ran the code on Windows without the dependencies, pageOutput.base64 was empty as well).
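
Since the failure shows up only as an empty pageOutput.base64, it may help to make that step fail loudly instead of returning an empty buffer. The following is a minimal sketch under that assumption; base64ToPngBuffer is a hypothetical helper name, not part of pdf2pic, and it only checks for the standard eight-byte PNG signature.

```javascript
// Every PNG file starts with these eight bytes.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decode a base64 string and throw if the result is
// empty or does not start with the PNG signature.
function base64ToPngBuffer(base64) {
  if (typeof base64 !== 'string' || base64.length === 0) {
    throw new Error('Conversion produced no data; check that gs and gm work for the user running Node.js');
  }
  const buffer = Buffer.from(base64, 'base64');
  const header = buffer.subarray(0, PNG_SIGNATURE.length);
  if (!header.equals(PNG_SIGNATURE)) {
    throw new Error('Conversion output is not a PNG image');
  }
  return buffer;
}
```

In convertPdfToImg, the last step could then be `return base64ToPngBuffer(pageOutput.base64);`, so a broken dependency surfaces as an exception rather than an empty buffer.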

I checked again whether Ghostscript is installed on CentOS by running gs --version, which printed 9.52.
I also checked GraphicsMagick by running gm version, which gave the following output:

GraphicsMagick 1.3.35 2020-02-23 Q16 http://www.GraphicsMagick.org/
Copyright (C) 2002-2020 GraphicsMagick Group.
Additional copyrights and licenses apply to this software.
See http://www.GraphicsMagick.org/www/Copyright.html for details.

Feature Support:
  Native Thread Safe         yes
  Large Files (> 32 bit)     yes
  Large Memory (> 32 bit)    yes
  BZIP                       no
  DPS                        no
  FlashPix                   no
  FreeType                   no
  Ghostscript (Library)      no
  JBIG                       no
  JPEG-2000                  no
  JPEG                       no
  Little CMS                 no
  Loadable Modules           no
  Solaris mtmalloc           no
  Google perftools tcmalloc  no
  OpenMP                     yes (201511 "4.5")
  PNG                        no
  TIFF                       no
  TRIO                       no
  Solaris umem               no
  WebP                       no
  WMF                        no
  X11                        no
  XML                        yes
  ZLIB                       yes

Host type: x86_64-pc-linux-gnu

Configured using the command:
  ./configure  '--with-quantum-depth=16'

Final Build Parameters:
  CC       = gcc
  CFLAGS   = -fopenmp -g -O2 -Wall -pthread
  CPPFLAGS = -I/usr/include/libxml2
  CXX      = g++
  CXXFLAGS = -pthread
  LDFLAGS  =
  LIBS     = -llzma -lxml2 -lz -lm -lpthread

Note that the two dependencies are installed directly from source.
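
One detail in the output above may be worth checking: the Feature Support table reports "no" for PNG, JPEG, TIFF and FreeType, and the configure line passes only the quantum depth, which suggests this gm build was compiled without the image libraries it needs to write PNG files. The sketch below parses the text printed by gm version and lists required features that are not enabled. The function names are hypothetical, and the list of required features is an assumption for this use case.

```javascript
const { execFileSync } = require('child_process');

// Parse the "Feature Support:" table from `gm version` output into
// an object such as { PNG: false, ZLIB: true }.
function parseGmFeatures(versionOutput) {
  const features = {};
  let inTable = false;
  for (const line of versionOutput.split(/\r?\n/)) {
    if (/^Feature Support:/.test(line)) {
      inTable = true;
      continue;
    }
    if (!inTable) continue;
    if (line.trim() === '') break; // a blank line ends the table
    // Feature name, at least two spaces, then "yes" or "no".
    const match = line.match(/^\s+(.+?)\s{2,}(yes|no)\b/);
    if (match) features[match[1]] = match[2] === 'yes';
  }
  return features;
}

// Return the names in `required` that are absent or reported as "no".
function missingFeatures(features, required) {
  return required.filter((name) => features[name] !== true);
}

// Run `gm version` on this machine and report missing features.
// Assumes `gm` is on the PATH of the user running Node.js.
function checkLocalGm(required = ['PNG', 'ZLIB']) {
  const output = execFileSync('gm', ['version'], { encoding: 'utf8' });
  return missingFeatures(parseGmFeatures(output), required);
}
```

Calling checkLocalGm() on the server would return the names of any required features that are missing; for the output shown above, PNG would be among them.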

So my question is: how can I make pdf2image work with Node.js deployed on a Linux system (specifically CentOS Stream)? Is there something wrong with how I installed the dependencies on CentOS?

Thanks.


Solution

  • I ended up using pdf-to-png-converter. It requires no external dependencies and works perfectly on both Windows and CentOS.

    Here is how I did it:

    const { pdfToPng } = require('pdf-to-png-converter');

    // Render page 1 of the PDF buffer and return it as a PNG buffer.
    const convertPdfToImg = async (buffer) => {
        const pngPage = await pdfToPng(buffer, {
            disableFontFace: false,
            useSystemFonts: false,
            pagesToProcess: [1],
            viewportScale: 2.0
        });
        return pngPage[0].content;
    };
    

    I wish I had known about this awesome utility earlier.