pdf xpdf

How to identify and extract vector graphics from a PDF using the xpdf library?


Does anyone have sample code demonstrating how to extract vector graphics objects (such as those representing charts and flow diagrams) from a PDF using the xpdf library? There doesn't seem to be any documentation for the xpdf library available on the Web, nor could I find any sample code that uses the library to extract information from a PDF. I am going through xpdf's code base, but any pointers to its documentation or to sample code would be very helpful.


Solution

  • The OutputDev class declares virtual member functions such as stroke, fill, clip, and so on. Implement those in your own subclass and extract the path and colour information from the GfxState passed to each call. You'll find examples of path iteration in the OutputDev-based classes in the xpdf code base, such as TextOutputDev or ImageOutputDev.

    edit: This outputdev may give you the example you need
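To make the path iteration concrete, here is a minimal sketch of the loop you would put inside your own stroke/fill overrides. It does not link against xpdf: PointStub, SubpathStub, PathStub and pathToSvg are hypothetical stand-ins written for this example. The accessor names (getNumSubpaths, getSubpath, getNumPoints, getX, getY, getCurve, isClosed) are meant to mirror what I believe xpdf's GfxPath and GfxSubpath expose, so check them against GfxState.h in your xpdf version. The sketch also assumes that a curve segment is stored as two control points flagged by getCurve followed by an unflagged end point.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one path point; 'curve' marks a Bezier control point.
struct PointStub {
  double x;
  double y;
  bool curve;
};

// Hypothetical stand-in mirroring the accessors assumed on xpdf's GfxSubpath.
struct SubpathStub {
  std::vector<PointStub> pts;
  bool closed;
  int getNumPoints() const { return static_cast<int>(pts.size()); }
  double getX(int i) const { return pts[i].x; }
  double getY(int i) const { return pts[i].y; }
  bool getCurve(int i) const { return pts[i].curve; }
  bool isClosed() const { return closed; }
};

// Hypothetical stand-in mirroring the accessors assumed on xpdf's GfxPath.
struct PathStub {
  std::vector<SubpathStub> subpaths;
  int getNumSubpaths() const { return static_cast<int>(subpaths.size()); }
  const SubpathStub *getSubpath(int i) const { return &subpaths[i]; }
};

// Walk every subpath and emit SVG-style path data (M, L, C, Z commands).
// In a real OutputDev subclass the path would come from state->getPath().
std::string pathToSvg(const PathStub &path) {
  std::ostringstream out;
  bool first = true;
  for (int i = 0; i < path.getNumSubpaths(); ++i) {
    const SubpathStub *sp = path.getSubpath(i);
    const int n = sp->getNumPoints();
    if (n == 0) {
      continue;  // nothing to draw in an empty subpath
    }
    if (!first) {
      out << ' ';
    }
    first = false;
    out << "M " << sp->getX(0) << ' ' << sp->getY(0);
    int j = 1;
    while (j < n) {
      if (sp->getCurve(j) && j + 2 < n) {
        // Two control points followed by the end point of a cubic Bezier.
        out << " C " << sp->getX(j) << ' ' << sp->getY(j) << ' '
            << sp->getX(j + 1) << ' ' << sp->getY(j + 1) << ' '
            << sp->getX(j + 2) << ' ' << sp->getY(j + 2);
        j += 3;
      } else {
        // Plain line segment to point j.
        out << " L " << sp->getX(j) << ' ' << sp->getY(j);
        ++j;
      }
    }
    if (sp->isClosed()) {
      out << " Z";
    }
  }
  return out.str();
}
```

In an actual subclass you would call the same loop from stroke, fill and eoFill, and read the colour from the GfxState at that point. The coordinates you get this way are in user space, so you will probably also want to map them through the current transformation matrix before writing them out.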