I'm trying to extract the data from an image using pytesseract. This module has image_to_data and image_to_osd methods. These two methods provide lots of info (TextLineOrder, WritingDirection, ScriptDetection, Orientation, etc.) as output.
The image below is the output of the image_to_data method. What do the values of these columns (level, block_num, par_num, line_num, word_num) mean?
The output of image_to_osd looks as presented below. What is the meaning of each term in it?
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 16.47
Script: Latin
Script confidence: 4.00
I referred to the docs, but I did not find any info regarding these parameters.
Column level: Hierarchy level of the detected item (1 = page, 2 = block, 3 = paragraph, 4 = line, 5 = word)
Column block_num: Block number of the detected text or item
Column par_num: Paragraph number of the detected text or item
Column line_num: Line number of the detected text or item
Column word_num: Word number of the detected text or item
The four columns above (block_num, par_num, line_num, word_num) are interconnected. If an item comes from a new line, the word number starts counting again from 0; it doesn't continue from the last word number of the previous line. The same goes for line_num, par_num, and block_num at their own levels.
Check out the below image for reference.
1st column: block_num
2nd column: par_num
3rd column: line_num
4th column: word_num
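
To make the nesting of these counters concrete, here is a minimal sketch that rebuilds text lines from the columns. It assumes the data is a dict of parallel lists, which is the shape pytesseract.image_to_data returns with output_type=pytesseract.Output.DICT, and that rows with level 5 are individual words. The helper group_words_by_line and the sample values are made up for illustration; they are not real OCR output.

```python
from collections import defaultdict


def group_words_by_line(data):
    """Join words that share the same (block_num, par_num, line_num)."""
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        # Assumption: level 5 rows are words; lower levels are
        # container rows (page, block, paragraph, line) with empty text.
        if data["level"][i] != 5 or not text.strip():
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append((data["word_num"][i], text))
    # Order the words of each line by word_num, which restarts per line.
    return {
        key: " ".join(word for _, word in sorted(words))
        for key, words in lines.items()
    }


# Hand-made sample in the shape of the image_to_data dict output.
sample = {
    "level":     [1, 2, 3, 4, 5, 5, 4, 5],
    "block_num": [0, 1, 1, 1, 1, 1, 1, 1],
    "par_num":   [0, 0, 1, 1, 1, 1, 1, 1],
    "line_num":  [0, 0, 0, 1, 1, 1, 2, 2],
    "word_num":  [0, 0, 0, 0, 1, 2, 0, 1],
    "text":      ["", "", "", "", "Hello", "world", "", "again"],
}

result = group_words_by_line(sample)
print(result)
```

With a real image, the dict would come from something like pytesseract.image_to_data(Image.open("page.png"), output_type=pytesseract.Output.DICT), where "page.png" is a placeholder file name and Image is PIL's Image module.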