vb.net, itext7, text-extraction, information-extraction

Using iText7 for .NET, why are the coordinates returned by iText.Kernel.Geom.Point 10 times greater than those returned by TextRenderInfo's LineSegment.GetStartPoint()?


I have written a VB.Net program that intercepts the RENDER_PATH and RENDER_TEXT events generated by the iText7 module.

I have written a little code to find the location of TEXT:

    ' t is the TextRenderInfo received with the RENDER_TEXT event
    Dim ascent As LineSegment = t.GetAscentLine()
    Dim descent As LineSegment = t.GetDescentLine()

    Dim initX As Single = descent.GetStartPoint().Get(0)
    Dim initY As Single = descent.GetStartPoint().Get(1)
    Dim endX As Single = ascent.GetEndPoint().Get(0)
    Dim endY As Single = ascent.GetEndPoint().Get(1)

For a specific PDF page, all values returned by GetStartPoint() and GetEndPoint() are between 20 and 600.

To find the PATH values, I have written the following code:

    Private Sub RenderPath(render As PathRenderInfo)
        For Each sp As Subpath In render.GetPath().GetSubpaths()
            Console.WriteLine(render.GetPath().ToString())
            For Each segment In sp.GetSegments()
                Console.WriteLine("  " & segment.ToString())
                If TypeOf segment Is iText.Kernel.Geom.Line Then
                    ' For a line, the base points are its start and end point.
                    Dim oList As IList(Of Point) = segment.GetBasePoints()
                    For n = 0 To oList.Count - 1
                        Console.WriteLine("    p" & CStr(n) & ".x: " & CStr(oList(n).GetX()))
                        Console.WriteLine("    p" & CStr(n) & ".y: " & CStr(oList(n).GetY()))
                    Next
                    Console.WriteLine("    width: " & CStr(oList(0).GetX() - oList(1).GetX()))
                    Console.WriteLine("    height: " & CStr(oList(0).GetY() - oList(1).GetY()))
                End If
                ' iText.Kernel.Geom.BezierCurve and other segment types are not handled yet.
            Next
        Next
    End Sub

All coordinate values returned by the GetX() and GetY() functions are now between 200 and 6000!

Why do the PATH coordinate values seem to be 10 times greater than the TEXT coordinate values?

Is that normal, or is it a bug?

In iText7, what are the dimensions of TEXT locations and the dimensions of PATH segments?


Solution

  • In iText7, what are the dimensions of TEXT locations and the dimensions of PATH segments?

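    Indeed, the coordinates returned by TextRenderInfo and those returned by PathRenderInfo differ. TextRenderInfo returns coordinates that are already transformed into the default user space of the page. PathRenderInfo returns the path coordinates as they appear in the content stream, that is, in the user space that was current when the path was built; the current transformation matrix (CTM), available via PathRenderInfo.GetCtm(), has not yet been applied to them.

    That the different render info classes return coordinates in conceptually different coordinate systems probably isn't intuitive and should be made clearer.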

    In the case of your document page, the CTM appears to be a scaling transformation by a factor of 0.1, which explains why the path values are 10 times greater than the text values.
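
    To compare path points with text positions, the path points can be multiplied by the CTM. The following is only a minimal sketch: the helper name ToDefaultUserSpace is hypothetical (it is not part of iText), and it assumes the row-vector convention of the PDF specification together with the index constants I11 to I32 of iText.Kernel.Geom.Matrix.

    ```vbnet
    Imports System
    Imports iText.Kernel.Geom
    Imports iText.Kernel.Pdf.Canvas.Parser.Data

    Module PathCoordinates
        ' Hypothetical helper (not part of iText): maps a point from the coordinate
        ' system in which the path was defined to the default user space of the page.
        Function ToDefaultUserSpace(p As Point, ctm As Matrix) As Point
            ' PDF row-vector convention: [x' y' 1] = [x y 1] * CTM
            Dim x As Double = p.GetX() * ctm.Get(Matrix.I11) + p.GetY() * ctm.Get(Matrix.I21) + ctm.Get(Matrix.I31)
            Dim y As Double = p.GetX() * ctm.Get(Matrix.I12) + p.GetY() * ctm.Get(Matrix.I22) + ctm.Get(Matrix.I32)
            Return New Point(x, y)
        End Function

        ' Prints the base points of every path segment in default user space coordinates.
        Sub RenderPath(render As PathRenderInfo)
            Dim ctm As Matrix = render.GetCtm()
            For Each sp As Subpath In render.GetPath().GetSubpaths()
                For Each segment As IShape In sp.GetSegments()
                    For Each p As Point In segment.GetBasePoints()
                        Dim q As Point = ToDefaultUserSpace(p, ctm)
                        Console.WriteLine("    x: " & CStr(q.GetX()) & ", y: " & CStr(q.GetY()))
                    Next
                Next
            Next
        End Sub
    End Module
    ```

    If the CTM is a pure scaling by 0.1 with no translation, a path point at (200, 6000) is mapped to (20, 600), which falls into the same range as the text coordinates.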