xpath schematron

Schematron rule to find invalid records


I am trying to validate the following XML with a Schematron rule.

XML:

<?xml version="1.0" encoding="utf-8"?>
<Biotic><Maul><Number>1</Number>
 <Record><Code IDREF="a1"/>
   <Detail><ItemID>1</ItemID></Detail>
   <Detail><ItemID>3</ItemID></Detail>
 </Record>
 <Record><Code IDREF="b1"/>
   <Detail><ItemID>3</ItemID></Detail>
   <Detail><ItemID>4</ItemID></Detail>
 </Record>
 <Record><Code IDREF="b1"/>
   <Detail><ItemID>4</ItemID></Detail>
   <Detail><ItemID>6</ItemID></Detail>
 </Record>
 <Record><Code IDREF="c1"/>
   <Detail><ItemID>5</ItemID></Detail>
   <Detail><ItemID>5</ItemID></Detail>
 </Record>
</Maul></Biotic>

The check is: "ItemID should be unique for the given Code within the given Maul."

So, per this requirement, the Records with Code b1 are not valid because ItemID 4 exists in both of them.

Similarly, the Record with Code c1 is not valid because it has two Detail nodes with ItemID 5.

The Record with Code a1 is valid: ItemID 3 also appears in the next Record, but that Record has a different Code.

Schematron rule I tried:

<?xml version="1.0" encoding="utf-8" ?><schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<title>Schematron validation rule</title>
<pattern id="P1">
  <rule context="Maul/Record" id="R1">
   <let name="a" value="//Detail/[./ItemID, ../Code/@IDREF]"/>
   <let name="b" value="current()/Detail/[./ItemID, ../Code/@IDREF]"/>
   <assert test="count($a[. = $b]) = count($b)">              
    ItemID should be unique for the given Code within the given Maul.
   </assert>
 </rule>
</pattern>
</schema>

Solution

  • The two let values seem problematic. I'm not sure what the bracketed expression [./ItemID, ../Code/@IDREF] is meant to do. A predicate normally attaches directly to a step (Detail[...], with no slash before the bracket), and read as a predicate I think it would only filter: each let would return every Detail element (with all of its content, including attributes, child elements, and text nodes) that has either a child ItemID element or a sibling Code element with an @IDREF attribute, regardless of what the values of ItemID or @IDREF are.

    I think I would change the rule/@context to ItemID, so the assert would fail once for each ItemID that violates the constraint.

    Here are a rule and assert that work correctly:

    <?xml version="1.0" encoding="utf-8" ?><schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <title>Schematron validation rule</title>
    <pattern id="P1">
      <rule context="Maul/Record/Detail/ItemID" id="R1">
       <assert test="count(ancestor::Maul/Record[Code/@IDREF = current()/ancestor::Record/Code/@IDREF]/Detail/ItemID[. = current()]) = 1">
        ItemID should be unique for the given Code within the given Maul.
       </assert>
     </rule>
    </pattern>
    </schema>
    

    The assert test finds, within the ancestor Maul, any Record that has a Code/@IDREF that equals the Code/@IDREF of the Record that the current ItemID is in. At minimum, it will find one Record (the one that the current ItemID is in). Then it looks for any Detail/ItemID within those Records that is equal to the current ItemID. It will find at least one (the current ItemID). The count function counts how many ItemIDs are found. If more than one is found, the assert fails.
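
    The same steps can also be spelled out with let variables, which some people find easier to read. This is only a sketch of an equivalent rule, not a replacement for the one above: the variable names id, code, and matches are made up for illustration, and it assumes a processor that lets one let refer to an earlier let in the same rule and allows value-of inside an assert message.

    ```xml
    <?xml version="1.0" encoding="utf-8"?>
    <schema xmlns="http://purl.oclc.org/dsdl/schematron">
      <title>Schematron validation rule (let-variable sketch)</title>
      <pattern id="P1">
        <rule context="Maul/Record/Detail/ItemID" id="R1">
          <!-- Hypothetical names: the ItemID being checked -->
          <let name="id" value="."/>
          <!-- The Code/@IDREF of the Record that contains this ItemID -->
          <let name="code" value="ancestor::Record/Code/@IDREF"/>
          <!-- All ItemIDs with the same value in Records of this Maul that share that Code -->
          <let name="matches" value="ancestor::Maul/Record[Code/@IDREF = $code]/Detail/ItemID[. = $id]"/>
          <!-- Exactly one match means only the current ItemID itself was found -->
          <assert test="count($matches) = 1">
            ItemID <value-of select="$id"/> occurs <value-of select="count($matches)"/> times for Code <value-of select="$code"/>. ItemID should be unique for the given Code within the given Maul.
          </assert>
        </rule>
      </pattern>
    </schema>
    ```

    Against the sample XML in the question, either version of the rule should report four failures: the two ItemID 4 elements under Code b1 and the two ItemID 5 elements under Code c1. ItemID 3 passes in both places because its two occurrences belong to different Codes.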

    Thanks for the reference to https://www.liquid-technologies.com/online-schematron-validator! I wasn't aware of that tool.