The resulting output: a txt file with empty lines.
The expected output: a txt file containing the words of the text "Привет Мир! Это я, обычный неработающий текст или рыба".
What am I doing wrong? I also tried nested xsl:for-each instructions, and they behave the same way.
I see 2 problems in your attempt:
Your instruction:
<xsl:for-each select="//div [@class='ocr_page'] /div [@class='ocr_carea'] / p [@class='ocr_par'] / span[@class='ocr_line'] / span [@class='ocrx_word']">
selects nothing, because your input XML puts all its elements in a namespace. To solve this, bind a prefix to that namespace in the stylesheet and use the prefix in every step of the path.
Once you have it working, this instruction will put you in the context of span. From this context, your next instruction:
<xsl:value-of select="normalize-space(span [@class='ocrx_word'])" disable-output-escaping="yes"/>
also selects nothing, because span is not a child of itself. It should be:
<xsl:value-of select="normalize-space(.)"/>
and I doubt you want to disable output escaping in a stylesheet producing an XML result.
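Putting both fixes together, a minimal sketch could look like the stylesheet below. It rests on assumptions not confirmed by your post: that the input's default namespace is the XHTML one (http://www.w3.org/1999/xhtml, which is typical for hOCR files), that the prefix x is free to use (the name is arbitrary), and that you want plain text with one word per line.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://www.w3.org/1999/xhtml">
  <!-- Assumption: x is bound to the namespace declared on the input's root
       element; replace the URI if your file declares a different one. -->

  <!-- Assumption: the desired result is a plain-text file, as in the question. -->
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <!-- Every element step carries the prefix; attributes such as @class
         are in no namespace and stay unprefixed. -->
    <xsl:for-each select="//x:div[@class='ocr_page']/x:div[@class='ocr_carea']/x:p[@class='ocr_par']/x:span[@class='ocr_line']/x:span[@class='ocrx_word']">
      <!-- The context node is already the word span, so "." is its text. -->
      <xsl:value-of select="normalize-space(.)"/>
      <!-- Line feed after each word. -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If the namespace URI in the stylesheet does not match the one in the input exactly, the path will again select nothing, so it is worth copying it straight from the xmlns attribute of your file.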