I am having a lot of trouble getting ImageMagick's identify to, well, identify a PDF as CMYK.
Essentially, let's say I'm building this file, test.tex, with pdflatex:
\documentclass[a4paper,12pt]{article}
%% https://tex.stackexchange.com/questions/13071
\pdfcompresslevel=0
%% http://compgroups.net/comp.text.tex/Making-a-cmyk-PDF
%% ln -s /usr/share/color/icc/sRGB.icm .
% \immediate\pdfobj stream attr{/N 4} file{sRGB.icm}
% \pdfcatalog{%
% /OutputIntents [ <<
% /Type /OutputIntent
% /S/GTS_PDFA1
% /DestOutputProfile \the\pdflastobj\space 0 R
% /OutputConditionIdentifier (sRGB IEC61966-2.1)
% /Info(sRGB IEC61966-2.1)
% >> ]
% }
%% http://latex-my.blogspot.com/2010/02/cmyk-output-for-commercial-printing.html
%% https://tex.stackexchange.com/questions/9961
\usepackage[cmyk]{xcolor}
\begin{document}
Some text here...
\end{document}
If I then try to identify the resulting test.pdf file, it is reported as RGB, no matter what options I've tried (at least according to the links in the source); and yet the colors in it appear to be saved as CMYK. For the source above:
$ grep -ia 'cmyk\|rgb\| k' test.pdf
0 0 0 1 k 0 0 0 1 K
0 0 0 1 k 0 0 0 1 K
0 0 0 1 k 0 0 0 1 K
0 0 0 1 k 0 0 0 1 K
FontDirectory/CMR12 known{/CMR12 findfont dup/UniqueID known{dup
/PTEX.Fullbanner (This is pdfTeX, Version 3.1415926-1.40.11-2.2 (TeX Live 2010) kpathsea version 6.0.0)
$ identify -verbose 'test.pdf[0]'
...
Type: Palette
Endianess: Undefined
Colorspace: RGB
Depth: 16/8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
Red:
...
Green:
...
Blue:
...
Histogram:
5: (12593,11565,11822) #31312D2D2E2E rgb(49,45,46)
4: (16448,15420,15677) #40403C3C3D3D rgb(64,60,61)
9: (20303,19275,19532) #4F4F4B4B4C4C rgb(79,75,76)
25: (23901,23130,23387) #5D5D5A5A5B5B rgb(93,90,91)
...
Pretty much the same happens if I also uncomment the \immediate\pdfobj stream ... part. And yet, if there is only one color (black) in the document, I don't see where identify comes up with a histogram of RGB values (although, arguably, all of them are close to gray)?!
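The `k`/`K` lines found by grep above are PDF color operators, so the color spaces a file actually uses can be tallied without rasterizing it at all. Below is a minimal Python sketch of that idea. The function names are hypothetical, and it assumes colors are set with the device operators (`g`/`G`, `rg`/`RG`, `k`/`K`) in streams that are either uncompressed (as with \pdfcompresslevel=0) or plain Flate-compressed. It does not handle object streams, chained filters, named color spaces (`cs`/`scn`), or embedded images, and since it scans every stream, a font or image stream could in principle produce a false hit.

```python
import re
import zlib
from collections import Counter

# A PDF numeric operand such as 0, 1, 0.5, -.25
NUMBER = rb"[-+]?(?:\d+\.?\d*|\.\d+)"


def _operator_pattern(operand_count, names):
    """Match exactly `operand_count` numbers followed by one of the operators."""
    operands = b"(?:" + NUMBER + rb"\s+){" + str(operand_count).encode() + b"}"
    return re.compile(rb"(?<!\S)" + operands + b"(" + names + rb")(?!\S)")


# Lowercase operators set the fill color, uppercase ones the stroke color.
PATTERNS = [
    (_operator_pattern(1, rb"g|G"), "DeviceGray"),
    (_operator_pattern(3, rb"rg|RG"), "DeviceRGB"),
    (_operator_pattern(4, rb"k|K"), "DeviceCMYK"),
]

STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def iter_streams(pdf_bytes):
    """Yield each stream body, inflated when it is valid zlib data."""
    for match in STREAM.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            yield raw  # assume the stream was stored uncompressed


def count_color_operators(pdf_bytes):
    """Count device color operators per color space across all streams."""
    counts = Counter()
    for stream in iter_streams(pdf_bytes):
        for pattern, space in PATTERNS:
            hits = len(pattern.findall(stream))
            if hits:
                counts[space] += hits
    return dict(counts)
```

To try it on a real file, read it in binary mode and pass the bytes in, for example `count_color_operators(open("test.pdf", "rb").read())`. This only reports which operators occur; it says nothing about how a rasterizer such as identify will render them.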
Never mind that; I then thought I'd better try using ghostscript to convert test.pdf into a new PDF which would be recognized as CMYK by identify, but no luck even there:
$ gs -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile=test-gs.pdf -dUseCIEColor -sProcessColorModel=DeviceRGB -dProcessColorModel=/DeviceCMYK -sColorConversionStrategy=/CMYK test.pdf
GPL Ghostscript 9.01 (2011-02-07)
Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
$ identify -verbose 'test-gs.pdf[0]'
...
Type: Grayscale
Base type: Grayscale
Endianess: Undefined
Colorspace: RGB
Depth: 16/8-bit
...
So the only thing that identify perceived as a change is Type: Grayscale (previously Type: Palette); otherwise it still sees an RGB colorspace!
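Comparing the two identify reports by eye gets tedious when trying many gs option combinations. Here is a small Python sketch that pulls the Type and Colorspace lines out of `identify -verbose` text and shows which of them changed. The helper names are hypothetical, and it assumes the "Key: value" layout seen in the output quoted above.

```python
def parse_identify_fields(verbose_text, wanted=("Type", "Colorspace")):
    """Return the first value seen for each wanted 'Key: value' line."""
    found = {}
    for line in verbose_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted and key not in found:
            found[key] = value.strip()
    return found


def changed_fields(before_text, after_text, wanted=("Type", "Colorspace")):
    """Map each field whose value differs to a (before, after) pair."""
    before = parse_identify_fields(before_text, wanted)
    after = parse_identify_fields(after_text, wanted)
    return {
        key: (before.get(key), after.get(key))
        for key in wanted
        if before.get(key) != after.get(key)
    }
```

Fed the two reports quoted above, this would flag only Type as changed, which matches what is visible in the output: Colorspace stays RGB in both.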
Along with this, note that identify is capable of correctly reporting a CMYK PDF; see "CMYK poster example: fitting pdf page size to (bitmap) image size?" (#17843, TeX - LaTeX Stack Exchange) for a command line example of generating such a PDF file using convert and gs. In fact, we can execute:
convert test.pdf -depth 8 -colorspace cmyk -alpha Off test-c.pdf
... and this will result in a PDF that will be identified as CMYK; however, the PDF will also be rasterized (by default at 72 dpi).
EDIT: I have just discovered that if I create an .odp presentation in OpenOffice and export it to PDF, that PDF will by default be RGB; however, the following command (from "ghostscript Examples | Production Monkeys"):
# Color PDF to CMYK:
gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
-sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK \
-sOutputFile=output.pdf input.pdf
... actually will produce a CMYK PDF, reported as such by identify (although the black will be rich, not plain: it appears on all four channels). However, this command works only when the slide has an added image (apparently, that is what triggers the color conversion?!). Funnily, I cannot get the same effect from a pdflatex PDF.
So I guess my question can be asked two ways:
1. Is there a command line conversion that turns a pdflatex PDF into one that will be recognized as CMYK by identify (and will consequently build a correct histogram of CMYK colors)?
2. Is there a command line tool similar to identify, which would recognize use of CMYK colors correctly even in the original test.pdf from pdflatex (and possibly build a color histogram, based on an arbitrarily chosen PDF page, like identify is supposed to)?
Thanks in advance for any answers,
Cheers!
Is it for instance specified as "0 0 0 1 setcmykcolor"? Or possibly rather as "0 0 0 setrgbcolor"? In the latter case you would end up with a rich black for text, if DeviceRGB is remapped to a CIE-based color space in order to get RGB images color managed.
sdaau, the command you used for trying to convert your PDF to CMYK was not correct. Try this one instead:
gs \
-o test-cmyk.pdf \
-sDEVICE=pdfwrite \
-sProcessColorModel=DeviceCMYK \
-sColorConversionStrategy=CMYK \
-sColorConversionStrategyForImages=CMYK \
test.pdf
If color conversion does not work as desired and if you see a message like "Unable to convert color space to Gray, reverting strategy to LeaveColorUnchanged" then...
In this case, add -dOverrideICC to the command line and see if it changes the result as desired.
To avoid JPEG artifacts appearing in the images (where there were none before), add:
-dEncodeColorImages=false
to the command line.
(This is true for almost all GS PDF->PDF processing, not just for this case, because GS by default creates a completely new file with newly constructed objects and a new file structure when asked to produce PDF output; it doesn't simply re-use the previous objects, as a more "dumb" PDF processor like pdftk does {pdftk has other advantages though, don't misunderstand my statement!}. GS applies JPEG compression by default; look at the current Ps2pdf documentation and search for "ColorImageFilter" to learn more details...)
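For anyone scripting this, the command and its two optional switches can be assembled as follows. This is only a sketch: the function and parameter names are made up for illustration, and the flags are exactly the ones given above, nothing more. It assumes a `gs` executable on PATH.

```python
import shutil
import subprocess


def build_gs_cmyk_command(src, dst, override_icc=False, encode_color_images=True):
    """Assemble the gs argument list for a PDF -> CMYK PDF conversion."""
    cmd = [
        "gs",
        "-o", dst,
        "-sDEVICE=pdfwrite",
        "-sProcessColorModel=DeviceCMYK",
        "-sColorConversionStrategy=CMYK",
        "-sColorConversionStrategyForImages=CMYK",
    ]
    if override_icc:
        # Worth trying when GS reverts to LeaveColorUnchanged
        cmd.append("-dOverrideICC")
    if not encode_color_images:
        # Avoids newly introduced JPEG artifacts in color images
        cmd.append("-dEncodeColorImages=false")
    cmd.append(src)
    return cmd


def convert_to_cmyk(src, dst, **options):
    """Run the conversion; raises if gs is missing or exits with an error."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript ('gs') was not found on PATH")
    subprocess.run(build_gs_cmyk_command(src, dst, **options), check=True)
```

For example, `convert_to_cmyk("test.pdf", "test-cmyk.pdf", encode_color_images=False)` runs the command above with -dEncodeColorImages=false appended. Passing the arguments as a list avoids any shell quoting issues with file names.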