node.js pdf ghostscript inkscape pdftk

Convert PDF to get vectorized text ("convert all text to outlines")


I'm using Node.js to process PDFs. One thing I'd like to do is convert all the text in a PDF to outlines, so that it can no longer be selected with the mouse cursor.

I tried pdftk's flatten command (through a Node wrapper), but it did not give me what I wanted.

Inkscape's command line might be a lead, but I'm not sure how to go about it. I'm really looking for the easiest way to do this from Node.js.

Ghostscript might be another lead: https://stackoverflow.com/a/28798374/11348232. One thing to note is that I don't work with files on disk but with Buffer objects, so it would be painful to save the PDF locally and then run the gs command.

Thanks a lot.


Solution

  • I ended up following @KenS's approach:

    import util from 'util';
    import childProcess from 'child_process';
    import fs from 'fs';
    import os from 'os';
    import path from 'path';
    import { v4 as uuidv4 } from 'uuid';
    
    const exec = util.promisify(childProcess.exec);
    
    const unlinkCallback = (err) => {
      if (err) {
        console.error(err);
      }
    };
    
    // Parameter is named filePath so it does not shadow the imported `path` module
    const deleteFile = (filePath: fs.PathLike) => {
      if (fs.existsSync(filePath)) {
        fs.unlink(filePath, unlinkCallback);
      }
    };
    
    const createTempPathPDF = () => path.join(os.tmpdir(), `${uuidv4()}.pdf`);
    
    const convertFontsToOutlines = async (buffer: Buffer): Promise<Buffer> => {
      const inputPath = createTempPathPDF();
      const outputPath = createTempPathPDF();
      let bufferWithOutlines: Buffer;
    
      // Write the whole input synchronously: an un-awaited stream write may not be flushed before gs starts
      fs.writeFileSync(inputPath, buffer);
    
      try {
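        // Ghostscript MUST be installed on the system, with `gs` available on the PATH.
        // -dNoOutputFonts makes pdfwrite emit no fonts, so text is converted to outlines.
        // Paths are quoted in case the temp directory contains spaces.
        await exec(`gs -o "${outputPath}" -dNoOutputFonts -sDEVICE=pdfwrite "${inputPath}"`);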
    
        bufferWithOutlines = fs.readFileSync(outputPath);
      } catch (e) {
        console.error(e);
    
        bufferWithOutlines = buffer;
      }
    
      deleteFile(inputPath);
      deleteFile(outputPath);
    
      return bufferWithOutlines;
    };
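
For what it's worth, here is a rough variant of the same idea, not tested against real PDFs. It assumes `gs` is on the PATH and a Node version that has `fs.promises.rm`. It calls `execFile` instead of `exec`, so no shell is involved and the paths need no quoting, and it keeps both files in a private temp directory that is removed in a `finally` block. The names `outlineFonts` and `buildGhostscriptArgs` and the `outline-` prefix are made up for this sketch.

```typescript
import { execFile } from 'node:child_process';
import { promises as fs } from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Hypothetical helper: same Ghostscript flags as the snippet above,
// passed as an argument array so no shell parsing takes place.
const buildGhostscriptArgs = (inputPath: string, outputPath: string): string[] => [
  '-o', outputPath,
  '-dNoOutputFonts',
  '-sDEVICE=pdfwrite',
  inputPath,
];

// Hypothetical function name; assumes the `gs` binary is installed and on the PATH.
const outlineFonts = async (buffer: Buffer): Promise<Buffer> => {
  // A private temp directory avoids name clashes without needing uuid
  const workDir = await fs.mkdtemp(path.join(os.tmpdir(), 'outline-'));
  const inputPath = path.join(workDir, 'input.pdf');
  const outputPath = path.join(workDir, 'output.pdf');

  try {
    // Make sure the input is fully written before Ghostscript reads it
    await fs.writeFile(inputPath, buffer);
    await execFileAsync('gs', buildGhostscriptArgs(inputPath, outputPath));
    return await fs.readFile(outputPath);
  } finally {
    // Runs on success and on failure, so temp files never pile up
    await fs.rm(workDir, { recursive: true, force: true });
  }
};
```

Unlike the snippet above, this version lets errors propagate instead of silently returning the original buffer, so the caller decides whether falling back to the unconverted PDF is acceptable.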