schematron

Schematron strip trailing period and space at the end of the text


Does anyone know how to remove only the trailing period and space at the end of a list item when the li has child elements (mixed content)?

Input XML:

   <ul>
      <li>example1. </li>
      <li>example2.</li>
      <li>xyz size. <ph>567</ph> 1. <ph>9</ph>mm.</li>
      <li>abc size. <ph>1234</ph> 1. <ph>9</ph>mm. </li>
      <li>def size.<ph>123</ph> 3.<ph>5</ph>mm.</li>
   </ul>

The code below doesn't work properly when the li contains child elements: it strips the period at the end of every text node, not just the last one.

Schematron:

    <sch:pattern>
        <sch:rule context="li//text()">
            <sch:report test="matches(., '(\w+)\.\s*$')" sqf:fix="listPeriod" role="warning">List
                should not end with a period</sch:report>
            <sqf:fix id="listPeriod" use-when="matches(., '(\w+)\.\s*$')">
                <sqf:description>
                    <sqf:title>Remove end period</sqf:title>
                </sqf:description>
                <sqf:stringReplace regex="(\w+)\.\s*$" select="'$1'"/>
            </sqf:fix>
        </sch:rule>
    </sch:pattern>

Actual output:

   <ul>
      <li>example1</li>
      <li>example2</li>
      <li>xyz size<ph>567</ph> 1<ph>9</ph>mm</li>
      <li>abc size<ph>1234</ph> 1<ph>9</ph>mm</li>
      <li>def size<ph>123</ph> 3<ph>5</ph>mm</li>
   </ul>

Desired output:

   <ul>
      <li>example1</li>
      <li>example2</li>
      <li>xyz size. <ph>567</ph> 1. <ph>9</ph>mm</li>
      <li>abc size. <ph>1234</ph> 1. <ph>9</ph>mm</li>
      <li>def size.<ph>123</ph> 3.<ph>5</ph>mm</li>
   </ul>

Thanks!!


Solution

  • Fixing mixed content is always tricky, but in your case it is enough to fix the last text node in each li element.

    First, use the li as the rule context, so that the test runs once against the whole content of the item rather than against each text node inside it:

    <sch:rule context="li">
    

    Then add a match attribute to sqf:stringReplace so that the fix is applied only to the last text node inside the li:

    <sqf:stringReplace match="(.//text())[last()]"/>
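
    To see why the parentheses in the match expression matter, here is how the text nodes of one of the sample items line up. This is only an illustration based on the input from the question, not part of the schema:

    ```xml
    <li>xyz size. <ph>567</ph> 1. <ph>9</ph>mm.</li>
    <!-- .//text() in document order:
           "xyz size. "   "567"   " 1. "   "9"   "mm."
         (.//text())[last()]  selects only "mm."
         .//text()[last()]    selects the last text child of every element:
                              "567", "9" and "mm." -->
    ```

    Without the parentheses, [last()] is evaluated per parent element, so the fix could touch text inside the ph elements as well.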
    

    The whole pattern then looks like this:

    <sch:pattern>
        <sch:rule context="li">
            <sch:report test="matches(., '(\w+)\.\s*$')" sqf:fix="listPeriod" role="warning">List
                should not end with a period</sch:report>
            <sqf:fix id="listPeriod">
                <sqf:description>
                    <sqf:title>Remove end period</sqf:title>
                </sqf:description>
                <sqf:stringReplace regex="(\w+)\.\s*$" match="(.//text())[last()]" select="'$1'"/>
            </sqf:fix>
        </sch:rule>
    </sch:pattern>
    

    Note: You can drop the use-when attribute, because the fix is offered only when the sch:report fires (that is, when its test matches) anyway.
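
    One caveat with the pattern above: the regex is applied to the last text node only, so it finds nothing if that node is just "." (for example after a closing ph) or is whitespace-only because of pretty-printing. A possible variant, as a sketch I have not tested against every case, is to target the last text node that is not whitespace-only and to replace just the period and trailing space with an empty string. It assumes the sch and sqf prefixes are declared on the enclosing schema, as in the question:

    ```xml
    <sch:pattern>
        <sch:rule context="li">
            <sch:report test="matches(., '(\w+)\.\s*$')" sqf:fix="listPeriod" role="warning">List
                should not end with a period</sch:report>
            <sqf:fix id="listPeriod">
                <sqf:description>
                    <sqf:title>Remove end period</sqf:title>
                </sqf:description>
                <!-- Assumption: skip whitespace-only text nodes, then strip only
                     the final period and any trailing space from the last one -->
                <sqf:stringReplace regex="\.\s*$" match="(.//text()[normalize-space()])[last()]" select="''"/>
            </sqf:fix>
        </sch:rule>
    </sch:pattern>
    ```

    The trade-off is that the replacement no longer checks for a word character directly before the period inside that text node; the report test on the whole li still does.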