
Edit or remove PDF header information in C


I need to edit the header information of several PDF files. Specifically, I want to remove all data that precedes the %PDF-X.Y.Z marker.

My idea is to open the PDF in binary mode and read it character by character until %PDF-X.Y.Z is found, then copy the rest of the stream to a new file. That way I should end up with an exact binary copy of the PDF, just with different header information.

What's the easiest/best way to do this in C? Are there any libraries available that could help me do this? I'm also interested in hearing different approaches to solve this problem.

Thanks.


Solution

  • Assuming that stripping the beginning of the file really does solve your problem, all you need are fopen, fread, fwrite and fclose.

    Open the input file for reading in binary mode and read until you find the magic %PDF string. Then open the output file for binary writing and write to it, starting with your new %PDF string and continuing with the rest of the input. When you are done writing, close both files.
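
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested tool: the function name strip_pdf_prefix, the 4096-byte buffer, and the choice to search for the literal "%PDF-" (leaving the version digits untouched) are all assumptions made for the example. It also writes the same marker back out rather than a modified one.

```c
#include <stdio.h>

/* Hypothetical helper: copies in_path to out_path, dropping every byte that
 * precedes the first "%PDF-" marker. Returns 0 on success, -1 on an I/O
 * error or if the marker is never found. */
int strip_pdf_prefix(const char *in_path, const char *out_path)
{
    static const char marker[] = "%PDF-";
    const size_t marker_len = sizeof marker - 1;  /* exclude the '\0' */
    size_t matched = 0;
    int c;

    FILE *in = fopen(in_path, "rb");
    if (in == NULL)
        return -1;

    /* Scan byte by byte until the whole marker has been seen. On a
     * mismatch, a '%' may still be the start of a new match. */
    while (matched < marker_len && (c = fgetc(in)) != EOF) {
        if (c == (unsigned char)marker[matched])
            matched++;
        else
            matched = (c == marker[0]) ? 1 : 0;
    }
    if (matched < marker_len) {  /* hit EOF or a read error first */
        fclose(in);
        return -1;
    }

    /* Only create the output once we know there is something to write. */
    FILE *out = fopen(out_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Write the marker, then copy the remainder of the input verbatim. */
    int ok = fwrite(marker, 1, marker_len, out) == marker_len;
    char buf[4096];
    size_t n;
    while (ok && (n = fread(buf, 1, sizeof buf, in)) > 0)
        ok = fwrite(buf, 1, n, out) == n;
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0)  /* flushing buffered data can still fail */
        ok = 0;
    return ok ? 0 : -1;
}
```

You would call it as something like strip_pdf_prefix("in.pdf", "out.pdf") (file names made up here) and check for a return value of 0. The output file is opened only after the marker has been found, so an input without a %PDF string does not leave an empty file behind.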