I just downloaded GhostPCL.
Here's how I am calling GhostPCL:
> gpcl6win64.exe -sDEVICE=pdfwrite -o C:\temp\output.pdf C:\temp\input.spl
Input/Output: Get it from my DropBox
The generated PDF seems to be broken. I cannot select text as expected, and when I copy the selected content to Notepad it looks like this:
Am I missing something or is there a bug in GhostPCL?
That's because PCL has very limited information about what a given character code means in terms of any other encoding, for example Unicode.
It's entirely possible for a PCL page to download a custom subset font, and then use character codes which only work 'correctly' with that font.
For example, say that we embed the font in such a way that we set character code 1 for the first character we use, character code 2 for the second and so on. Then we send the text "Hello World".
That would then be represented in the PCL as
0x01 0x02 0x03 0x03 0x04 0x05 0x06 0x04 0x07 0x03 0x08
Obviously, that's not any kind of encoding which makes sense, and PCL doesn't have any means of carrying a Unicode mapping around.
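To make the example concrete, here is a minimal Python sketch of that numbering scheme. The function name `subset_encode` is made up for illustration and this is not how any particular printer driver is implemented; it only assigns codes in order of first use, as described above, and then shows what a viewer gets if it treats those codes as ordinary text.

```python
def subset_encode(text):
    """Assign character codes 1, 2, 3, ... in order of first use.

    Hypothetical helper: it mimics the subset-font numbering described
    above, not any real driver's behaviour.
    """
    code_for = {}   # character -> assigned code
    codes = []      # the code sequence that would be sent to the printer
    for ch in text:
        if ch not in code_for:
            code_for[ch] = len(code_for) + 1
        codes.append(code_for[ch])
    return codes, code_for


codes, code_for = subset_encode("Hello World")

# The byte sequence as it would appear in the PCL stream.
print(" ".join(f"0x{c:02X}" for c in codes))

# What copy and paste sees if the codes are read as Latin-1 text:
# nothing but control characters.
naive = bytes(codes).decode("latin-1")
print(repr(naive))
```

The glyphs still draw correctly because the font knows which shape belongs to each code, but the codes themselves say nothing about which letters those shapes represent.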
Now, your PCL file contains several TrueType fonts, and it's 'possible' that there is enough information in the CMAP subtables of the fonts to resurrect some kind of meaning from the 'text', but GhostPCL doesn't have that kind of sophistication.
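As a rough illustration of what such a recovery could look like, the Python sketch below inverts a Unicode CMAP to get from character code to glyph to code point. Everything in it is an assumption: the glyph IDs and both tables are invented, and it presumes the embedded font kept a usable Unicode CMAP, which a real subset font may not have. It is not something GhostPCL does.

```python
# Hypothetical table: custom PCL character code -> glyph ID in the font.
code_to_glyph = {1: 43, 2: 72, 3: 79, 4: 82, 5: 3, 6: 58, 7: 85, 8: 71}

# Hypothetical Unicode CMAP from the same font: code point -> glyph ID.
unicode_cmap = {
    0x48: 43,  # H
    0x65: 72,  # e
    0x6C: 79,  # l
    0x6F: 82,  # o
    0x20: 3,   # space
    0x57: 58,  # W
    0x72: 85,  # r
    0x64: 71,  # d
}

# Invert the CMAP so we can look up a code point from a glyph ID.
glyph_to_unicode = {gid: cp for cp, gid in unicode_cmap.items()}


def recover_text(codes):
    """Map each custom code to a glyph, then the glyph back to Unicode.

    Codes or glyphs with no entry become U+FFFD (replacement character).
    """
    out = []
    for code in codes:
        gid = code_to_glyph.get(code)
        cp = glyph_to_unicode.get(gid, 0xFFFD)
        out.append(chr(cp))
    return "".join(out)


pcl_codes = [1, 2, 3, 3, 4, 5, 6, 4, 7, 3, 8]
recovered = recover_text(pcl_codes)
print(recovered)
```

If the font's CMAP is missing, stripped, or maps several code points to one glyph, the inversion is incomplete or ambiguous, which is part of why this kind of recovery can't be relied on.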
So no, you aren't missing anything, and no, there isn't a bug. Please note that the goal for pdfwrite is that the resulting PDF file should be visibly the same as the printed output, nothing more. Despite people's wishful thinking, PDF was never designed as an editable format, and the vast majority of PDF files cannot be edited, nor can they reliably have 'text' extracted from them. Some will work, many won't.