Can someone explain a particular PDF matrix transformation to me? I just can't figure it out. I have the following PDF content stream:
q
q
1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 cm
q
1 0 0 -1 0 841.8898 cm
1.0000 0 0 1.0000 0 0 cm
q
141.84 0 0 70.80 376.68 63.60 cm
/IM1 Do
Q
BT
/F1 15.96 Tf
-0.00 Tc
1. 0. 0. -1. 0. -0. Tm
56.64 -86.4 Td
(Rechnung) Tj
ET
Let's ignore the image that is placed at the beginning.
According to my calculations based on the Acrobat PDF Specification 1.7, the text "Rechnung" is rendered at the following user space position:
point (userspace) = (56.64, 928.2898);
The matrices before the Tj operator are as follows:
ctm = [1.0, 0.0, 0.0, -1.0, 0.0, 841.8898]
tm = [1.0, 0.0, 0.0, -1.0, 56.64, -86.4]
--> usm = tm * ctm = [1.0, 0.0, 0.0, 1.0, 56.64, 928.2898] // user space matrix
However, this user space point has y = 928, which lies outside the visible A4 page area (page height 841.8898). Yet Acrobat still renders the text on the A4 page.
I’m currently building a parser and want to determine exactly which text appears in a given region (e.g. [x=56.0, y=587.0, w=241.0, h=128.0] = DIN 5008 address window). I might want to reposition such elements.
With “normal” CTMs and TMs, everything works fine. But in this case, I can’t make sense of the rendering. Acrobat must be applying some kind of additional correction/conversion so that the y = 928 position ends up fitting on the A4 page, specifically at y = 754. Does anybody know what exactly is being done here?
For this special case, I could calculate:
y' = 841 - (y - 841) = 2 * 841 - y
y' = 754
(So the word “Rechnung” lies outside the address region.)
Or in general:
y' = 2 * mediaBox.height - y
That would be the position in the normal PDF coordinate system. I’d like to apply this method correctly for arbitrary CTMs, and I believe the correction above only needs to be applied for specific CTM configurations in which the coordinate system was transformed (flipped) beforehand.
The matrices before the Tj operator are as follows:
ctm = [1.0, 0.0, 0.0, -1.0, 0.0, 841.8898]
tm = [1.0, 0.0, 0.0, -1.0, 56.64, -86.4]
--> usm = tm * ctm = [1.0, 0.0, 0.0, 1.0, 56.64, 928.2898] // user space matrix
You made a mistake calculating Tm.
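Applying tx ty Td is specified as follows:
this operator shall perform these assignments:
Tm = Tlm = [1, 0, 0, 1, tx, ty] * Tlm
Here the translation matrix is written in the six-number form [a, b, c, d, e, f] used above. The translation is multiplied from the left, so the offset (tx, ty) is itself transformed by the current text line matrix.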
(ISO 32000-2 Table 106 — Text-positioning operators)
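In your case the text line matrix set by the Tm operator is [1, 0, 0, -1, 0, 0], so:
Tm = [1, 0, 0, 1, 56.64, -86.4] * [1, 0, 0, -1, 0, 0] = [1, 0, 0, -1, 56.64, 86.4]
I.e. the last entry in your tm is 86.4; it is not negative, because ty = -86.4 is multiplied by the -1 of the flipped text matrix.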
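The Td update can be checked with a short Python sketch. The helper names `matmul` and `apply_td` are made up for illustration and are not part of any PDF library; matrices are assumed to be in the six-number form [a, b, c, d, e, f] with the row-vector convention of the PDF specification.

```python
def matmul(m, n):
    """Return m * n for two PDF matrices in [a, b, c, d, e, f] form."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return [
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    ]


def apply_td(tlm, tx, ty):
    """Td: the translation is multiplied from the LEFT onto the text line matrix."""
    return matmul([1, 0, 0, 1, tx, ty], tlm)


# Text line matrix set by "1. 0. 0. -1. 0. -0. Tm" in the content stream.
tlm = [1.0, 0.0, 0.0, -1.0, 0.0, -0.0]

# "56.64 -86.4 Td"
tm = apply_td(tlm, 56.64, -86.4)
print(tm)  # the last entry comes out as +86.4, not -86.4
```

Simply adding (tx, ty) to the last two entries of the matrix only gives the right result when the text line matrix has no scaling, flipping, or rotation, which is why the "normal" cases worked.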
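The final matrix, therefore, is
usm = tm * ctm = [1, 0, 0, -1, 56.64, 86.4] * [1, 0, 0, -1, 0, 841.8898] = [1, 0, 0, 1, 56.64, 755.4898]
so the text origin is at (56.64, 755.4898), which lies on the A4 page. No additional correction by Acrobat is involved.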
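For the region test from the question, a minimal sketch might look like the following. It assumes the region is given as lower-left corner plus width and height in default user space, and it only tests the text origin, not the glyph extents. The names `concat`, `transform`, and `in_rect` are hypothetical helpers, not part of any PDF library.

```python
def concat(m, n):
    """Return m * n for two PDF matrices in [a, b, c, d, e, f] form."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return [
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    ]


def transform(point, m):
    """Map a point through a PDF matrix: [x y 1] * M."""
    x, y = point
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def in_rect(point, x, y, w, h):
    """True if point lies in the rectangle with lower-left corner (x, y)."""
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


ctm = [1.0, 0.0, 0.0, -1.0, 0.0, 841.8898]  # CTM in effect at the BT operator
tm = [1.0, 0.0, 0.0, -1.0, 56.64, 86.4]     # text matrix after the Td operator

usm = concat(tm, ctm)                  # text space -> user space
origin = transform((0.0, 0.0), usm)    # where the text origin lands on the page

# Address window from the question: x=56.0, y=587.0, w=241.0, h=128.0
inside = in_rect(origin, 56.0, 587.0, 241.0, 128.0)
print(origin, inside)
```

With these values the origin lands at roughly y = 755.49, above the top edge of the window at y = 715, so "Rechnung" is outside the address region, as expected.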