I have created a script that combines two PDFs into one side by side, by looking at some of Kurt Pfeifle's answers.
But my problem is that the code isn't flexible. By that I mean that if one PDF is larger or has a different resolution than the other PDF, the output PDF (the side-by-side PDF) will be bad.
Illustrated it looks like this:
Input file: a.pdf
+--------+
| |
| a |
| |
+--------+
Input file: b.pdf
+--------+
| |
| b |
| |
+--------+
Desired output file: compare.pdf
+--------+--------+
| | |
| a | b |
| | |
+--------+--------+
So I need to make sure that the PDFs both have the same regular A4 size PDF and resolution before I combine them? I have tried so many codes and scripts, but can't figure this one out. How can I do that? The script needs to be bulletproof so that any PDFs can be used and compared. Even if they haven't got the same size.
My script looks like this now and works on some PDFs with the same size and resolution:
gswin64c.exe ^
-o c.pdf ^
-sDEVICE=pdfwrite ^
-g11690x8270 ^
-dFIXEDMEDIA ^
-dPDFSETTINGS=/prepress ^
-r300 ^
-c "<</PageOffset [0 0]>>setpagedevice" ^
-f a.pdf
This creates c.pdf, looking like this:
c.pdf
+--------+--------+
| | |
| a | (empty)|
| | |
+--------+--------+
Next command:
gswin64c.exe ^
-o left-side-outputs.pdf ^
-sDEVICE=pdfwrite ^
-g11690x8270 ^
-dPDFSETTINGS=/prepress ^
-c "<</PageOffset [0 0]>>setpagedevice" ^
-f b.pdf
This creates left-side-outputs.pdf, looking like this:
left-side-outputs.pdf
+--------+--------+
| | |
| b | (empty)|
| | |
+--------+--------+
Next command:
gswin64c.exe ^
-o right-side-outputs.pdf ^
-sDEVICE=pdfwrite ^
-g11690x8270 ^
-dPDFSETTINGS=/prepress ^
-c "<</PageOffset [596 0]>>setpagedevice" ^
-f c.pdf
This creates right-side-outputs.pdf, looking like this:
right-side-outputs.pdf
+--------+--------+
| | |
|(empty) | b |
| | |
+--------+--------+
Last command:
pdftk left-side-outputs.pdf multistamp right-side-outputs.pdf output compare.pdf
This creates the final result, compare.pdf:
Desired output file: compare.pdf
+--------+--------+
| | |
| a | b |
| | |
+--------+--------+
I hope some gurus out there can help me figure out how to handle PDF input files with different page sizes.
To your question...
So I need to make sure that the PDFs both have the same regular A4 size PDF and resolution before I combine them?
...the answer is 'Yes, regarding the page size -- No regarding the resolution (doesn't matter).'
A command to scale all pages of a mixed-size PDF to all-A4 is this:
gswin64c.exe ^
-o all-a4.pdf ^
-sDEVICE=pdfwrite ^
-g5950x8420 ^
-dPDFFitPage ^
-f input.pdf
This scales media sizes and contents likewise (tested with GS v9.10).
The parameter -dPDFFitPage will always keep the aspect ratio. It will automatically rotate the content to make the best fit. It does not allow 'stretching' the page in one direction only. This can however be achieved with the next method.
[Update: I think there is one point about this method that I did not get across clearly enough.
The thing is this: if the aspect ratio of the media in your input file is not already the same as your target media's, then -dPDFFitPage will not entirely cover your target media.
Assume your input uses a square page size of 500x500 points. If you process this with a target size of A4 (-g5950x8420), then -dPDFFitPage will keep the square aspect ratio and scale the content to 595x595 points (5950x5950 pixels) only.
But you cannot leave out -dPDFFitPage either. Otherwise your original 500x500 content does not get scaled at all, but is only placed into the lower left corner of the bigger 595x842 page.
End of update.]
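To make the arithmetic of the square-page example concrete, here is a minimal Python sketch of an aspect-preserving fit. It only models the behaviour described above; it is not Ghostscript's implementation, it ignores the automatic rotation, and the function name fit_page is made up for illustration:

```python
def fit_page(src_w, src_h, dst_w, dst_h):
    """Return (scale, scaled_w, scaled_h) for an aspect-preserving fit.

    All dimensions are in PostScript points. Rotation is ignored here.
    """
    # The smaller of the two ratios is the largest uniform scale
    # that still keeps the content inside the target page.
    scale = min(dst_w / src_w, dst_h / src_h)
    return scale, src_w * scale, src_h * scale


# Square 500x500 pt input, A4 target (595x842 pt).
scale, w, h = fit_page(500, 500, 595, 842)
print(f"scale={scale:.2f}, content={w:.0f}x{h:.0f} pt")
```

The width is the limiting dimension here, so the content ends up as wide as the A4 page but not as tall.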
A command to scale all PDF page contents to 50% of both their respective dimensions is this:
gswin64c.exe ^
-o 50pc.pdf ^
-sDEVICE=pdfwrite ^
-c "<</Install {.5 .5 scale}>> setpagedevice" ^
-f input.pdf
However, this will NOT scale the media boxes at the same time!
If you know that all pages in your PDF file are of the same size, you could use this to scale an A3 PDF to A4:
gswin64c.exe ^
-o A4-50pc.pdf ^
-g5950x8420 ^
-sDEVICE=pdfwrite ^
 -c "<</Install {.707 .707 scale} /AutoRotatePages /None>> setpagedevice" ^
-f A3.pdf
However, the first command in my answer will of course also work, and it is simpler to use!
For A5 -> A4 or A4 -> A3 use:
{1.415 1.415 scale}
For A3 -> A4 or A4 -> A5:
{ .707 .707 scale}
But here it gets more interesting now, because you can 'stretch' the contents as well! To scale horizontally to 75% and vertically to 66%, use
-c "<</Install {.75 .666 scale}>> setpagedevice"
For a kind of 'liquid' scaling between Letter and A4, you may use these (the first for A4 -> Letter, the second for Letter -> A4):
{1.028571 .940617 scale}
{ .972222 1.063131 scale}
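If you need factors for other size pairs, they can be derived from the page dimensions. This is a small Python sketch, assuming Letter = 612x792 pt and A4 = 595x842 pt; the helper name liquid_scale is hypothetical, and the last digit may differ from the values above because of rounding:

```python
import math

# Page sizes in PostScript points (width, height), portrait orientation.
LETTER = (612, 792)
A4 = (595, 842)


def liquid_scale(src, dst):
    """Independent x and y factors that stretch src exactly onto dst."""
    return dst[0] / src[0], dst[1] / src[1]


sx, sy = liquid_scale(LETTER, A4)
print(f"Letter -> A4: {{{sx:.6f} {sy:.6f} scale}}")

sx, sy = liquid_scale(A4, LETTER)
print(f"A4 -> Letter: {{{sx:.6f} {sy:.6f} scale}}")

# Neighbouring ISO A sizes differ by a linear factor of sqrt(2).
print(f"A5 -> A4: {math.sqrt(2):.3f}   A4 -> A5: {1 / math.sqrt(2):.3f}")
```

The printed strings have the form used inside the Install procedure, so they can be pasted into the -c argument.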
For all of the above you can give a -gNNNNxMMMM value (determining a fixed page size for the output PDF; the dimensions are in pixels at the default internal resolution of the pdfwrite device, which is 720 ppi, giving 10 pixels for 1 PostScript point).
If you do not give a -gNNNNxMMMM value, the original page sizes are used (even if they are of mixed values), but their content will be drawn upon these pages with the scaling factor you specified.
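The points-to-pixels conversion for the -g value can be sketched like this in Python. The 720 ppi default is taken from the text above; the helper name g_option is made up for illustration:

```python
PDFWRITE_PPI = 720     # default internal resolution of pdfwrite, as stated above
POINTS_PER_INCH = 72   # PostScript points per inch


def g_option(width_pt, height_pt, ppi=PDFWRITE_PPI):
    """Build a -gNNNNxMMMM argument from a page size given in points."""
    px_per_pt = ppi / POINTS_PER_INCH  # 10 pixels per point at 720 ppi
    return f"-g{round(width_pt * px_per_pt)}x{round(height_pt * px_per_pt)}"


print(g_option(595, 842))    # A4 portrait          -> -g5950x8420
print(g_option(1190, 842))   # two A4 side by side  -> -g11900x8420
```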
What I do not know right now: A method to 'liquid-scale' each individual page of a mixed sized PDF including the media sizes in one go...
Assuming you now want to compare an all-Letter sized PDF to one which is all-A5, and you want to scale both to A4 first, here is what you'd do:
gswin64c.exe ^
-o a4-1.pdf ^
-sDEVICE=pdfwrite ^
-g5950x8420 ^
-c "<</Install{.972222 1.063131 scale}>>setpagedevice" ^
-f letter.pdf
gswin64c.exe ^
-o a4-2.pdf ^
-sDEVICE=pdfwrite ^
-g5950x8420 ^
-c "<</Install{1.415 1.415 scale}>>setpagedevice" ^
-f a5.pdf
or, alternatively:
gswin64c.exe ^
-o a4-2.pdf ^
-sDEVICE=pdfwrite ^
-g5950x8420 ^
-dPDFFitPage ^
-f a5.pdf
And now compare both your A4 PDF files....
You can also save one step of the workflow as outlined in your question. Here is a better approach.
Assuming you have A4 input, and the final output should be A3:
gswin64c.exe ^
-o left-sides.pdf ^
-sDEVICE=pdfwrite ^
-g11900x8420 ^
-c "<</PageOffset [0 0]>>setpagedevice" ^
-f a.pdf
This creates:
left-sides.pdf
+--------+--------+ ^
| | | |
| | | |
| a |(empty) | 842 pt == 8420 pixels
| | | |
| | | |
+--------+--------+ v
<-----1190 pt----->
== 11900 pixels
gswin64c.exe ^
-o right-sides.pdf ^
-sDEVICE=pdfwrite ^
-g11900x8420 ^
-c "<</PageOffset [595 0]>>setpagedevice" ^
-f b.pdf
This creates:
right-sides.pdf
+--------+--------+ ^
| | | |
| | | |
|(empty) | b | 842 pt == 8420 pixels
| | | |
| | | |
+--------+--------+ v
<-----1190 pt----->
== 11900 pixels
Now combine both with pdftk:
pdftk right-sides.pdf multistamp left-sides.pdf output compare.pdf
or
pdftk left-sides.pdf multistamp right-sides.pdf output compare2.pdf
This creates:
compare.pdf
+--------+--------+ ^
| | | |
| | | |
| a | b | 842 pt == 8420 pixels
| | | |
| | | |
+--------+--------+ v
<-----1190 pt----->
== 11900 pixels
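The two Ghostscript calls above differ only in output name, input name and x offset, so they can be generated from the single-page size. Here is a hedged Python sketch that only builds the argument lists (it does not run Ghostscript); the function name side_by_side_commands is hypothetical, and it assumes both inputs were already brought to the same page size:

```python
def side_by_side_commands(page_w_pt, page_h_pt, left_pdf, right_pdf,
                          gs="gswin64c.exe"):
    """Return the two Ghostscript argument lists for a double-width canvas."""
    # Canvas is two pages wide; pdfwrite uses 10 pixels per point (720 ppi).
    g = f"-g{2 * page_w_pt * 10}x{page_h_pt * 10}"

    def cmd(out, offset_x_pt, src):
        return [gs, "-o", out, "-sDEVICE=pdfwrite", g,
                "-c", f"<</PageOffset [{offset_x_pt} 0]>>setpagedevice",
                "-f", src]

    # The left page stays at x=0, the right page is shifted by one page width.
    return [cmd("left-sides.pdf", 0, left_pdf),
            cmd("right-sides.pdf", page_w_pt, right_pdf)]


cmds = side_by_side_commands(595, 842, "a.pdf", "b.pdf")
for c in cmds:
    print(" ".join(c))
```

Each list could be handed to subprocess.run(); the printed form is just for checking the values against the commands above.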
One more thing.
Sometimes the above commands may not "seem" to work. The reason is that PDFs internally use not only the naively assumed "page size", but a more complex setup of MediaBox (what we usually regard as "page size"), as well as TrimBox, BleedBox, ArtBox and CropBox. See here for an exact description of these boxes...
To test your PDF files (inputs as well as results or intermediate results) for all these boxes' values, use the pdfinfo command:
pdfinfo -f 1 -l 5 -box a.pdf
pdfinfo -f 1 -l 5 -box b.pdf
pdfinfo -f 1 -l 5 -box right-sides.pdf
pdfinfo -f 1 -l 5 -box left-sides.pdf
pdfinfo -f 1 -l 5 -box compare.pdf
The CropBox, if it is defined differently from the MediaBox, can get in the way of the rescaling task: it makes PDF viewers (and printers) display (or print) only the part of the content that lies inside it. Ghostscript will not touch a CropBox if it sees one.
So it can happen that the file was processed successfully, but the viewer still shows you the same viewport onto the page.
In order to "disarm" the effect of these boxes, you can use a very crude trick: rename these strings within the PDF to all-lowercase names. Here is how to do it with sed on the command line (may not be available on Windows):
cat input.pdf \
| sed 's#CropBox#cropbox#g' \
| sed 's#TrimBox#trimbox#g' \
| sed 's#BleedBox#bleedbox#g' \
| sed 's#ArtBox#artbox#g' \
> disarmed.pdf
or, somewhat shorter, but not as easy to parse:
sed 's#CropB#cropb#g;s#TrimB#trimb#g;s#BleedB#bleedb#g;s#ArtB#artb#g' \
in.pdf > out.pdf
Since PDF is a binary file format, with some versions of sed you may encounter an error message saying:
sed: RE error: illegal byte sequence
In this case try a different flavor, like GNU sed (gsed)...
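Where no suitable sed is at hand (for example on Windows), the same crude rename can be sketched in Python, working on raw bytes so that no "illegal byte sequence" problem arises. This is only an illustration of the trick above, with made-up helper names; like the sed version, it can presumably only catch box names that are stored uncompressed in the file:

```python
from pathlib import Path

BOX_NAMES = (b"CropBox", b"TrimBox", b"BleedBox", b"ArtBox")


def disarm_boxes(data: bytes) -> bytes:
    """Lower-case the box names; the length is unchanged, so byte offsets stay valid."""
    for name in BOX_NAMES:
        data = data.replace(name, name.lower())
    return data


def disarm_file(src: str, dst: str) -> None:
    """Read src as bytes, rename the boxes, write the result to dst."""
    Path(dst).write_bytes(disarm_boxes(Path(src).read_bytes()))


# Small in-memory demonstration on a fragment of a page dictionary.
sample = b"<< /MediaBox [0 0 595 842] /CropBox [10 10 500 700] >>"
print(disarm_boxes(sample))
```

A call like disarm_file("input.pdf", "disarmed.pdf") would correspond to the sed pipeline above. MediaBox is deliberately left alone.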