xslt-1.0

Remove nodes on missing child reference


Thanks to SO, I already have some XSLT that filters and removes nodes (identity transform with xsl:copy) under clearly defined conditions. Now I'm facing more of a "wildcard" problem, where the thing to match could be anything, anywhere, so I'd like to ask, using an abstract example, what the basic approach is.

Fake Example: I have hundreds of factories, producing thousands of products. The unmodified XML looks like this:

<data ID="1">
    <type>factory.tires</type>
    <Country>a</Country>
</data>

<data ID="2">
    <type>factory.glass</type>
    <Country>a</Country>
</data>

<data ID="3">
    <type>factory.steel</type>
    <Country>b</Country>
</data>

<data ID="4">
    <type>factory.tires</type>
    <Country>c</Country>
</data>

<!-- xxxxxxxxxxxxxx -->

<data ID="11">
    <type>product.screws</type>
    <factory REF="3" />
</data>

<data ID="12">
    <type>product.screws</type>
    <factory REF="3" />
</data>

<data ID="13">
    <type>product.cups</type>
    <factory REF="2" />
</data>

<data ID="20">
    <type>product.wheel</type>
    <factory REF="1" />
</data>

<data ID="21">
    <type>product.wheel</type>
    <factory REF="4" />
</data>

Now we have closed factories 1 and 2, and the data for those factories has already been removed:

<data ID="3">
    <type>factory.steel</type>
    <Country>b</Country>
</data>

<data ID="4">
    <type>factory.tires</type>
    <Country>c</Country>
</data>

<!-- xxxxxxxxxxxxxx -->

<data ID="11">
    <type>product.screws</type>
    <factory REF="3" />
</data>

<data ID="12">
    <type>product.screws</type>
    <factory REF="3" />
</data>

<data ID="13">
    <type>product.cups</type>
    <factory REF="2" />
</data>

<data ID="20">
    <type>product.wheel</type>
    <factory REF="1" />
</data>

<data ID="21">
    <type>product.wheel</type>
    <factory REF="4" />
</data>

Challenge: All products that were made in the closed factories have to be removed. But I don't know which factories are closed - there are dozens ... So the idea is "

This should be the result - IDs 13 and 20 are removed:

<data ID="3">
    <type>factory.steel</type>
    <Country>b</Country>
</data>

<data ID="4">
    <type>factory.tires</type>
    <Country>c</Country>
</data>

<!-- xxxxxxxxxxxxxx -->

<data ID="11">
    <type>product.screws</type>
    <factory REF="3" />
</data>

<data ID="12">
    <type>product.screws</type>
    <factory REF="3" />
</data>


<data ID="21">
    <type>product.wheel</type>
    <factory REF="4" />
</data>

Any help will be appreciated very much! Thanks in advance.


Solution

  • If I understand your question correctly, you want to do:

    XSLT 1.0

    <xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    
    <!-- index every data element by its ID attribute -->
    <xsl:key name="data-by-ID" match="data" use="@ID" />
    
    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    
    <!-- remove products that do not have a related factory -->
    <xsl:template match="data[factory][not(key('data-by-ID', factory/@REF))]"/>
    
    </xsl:stylesheet>
    

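    This assumes that every data element that represents a product has a factory child; a data element without a factory child is never matched by the removing template and is copied unchanged.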
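
    If that assumption does not hold, one option is to tell products and factories apart by the prefix of their type instead. The following is only a sketch, assuming the type values always start with product. or factory. as in the sample; the key name factory-by-ID is made up for this illustration. Note that, unlike the stylesheet above, this would also remove a product that has no factory child at all.

    ```xml
    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- hypothetical key: index only elements whose type starts with 'factory.' -->
    <xsl:key name="factory-by-ID" match="data[starts-with(type, 'factory.')]" use="@ID" />

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- remove products (recognized by type prefix) whose REF matches no factory -->
    <xsl:template match="data[starts-with(type, 'product.')][not(key('factory-by-ID', factory/@REF))]"/>

    </xsl:stylesheet>
    ```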

    If every data element that represents a factory has a Country child, you could make this more efficient by restricting the key to factories, so that only they are indexed:

    <xsl:key name="data-by-ID" match="data[Country]" use="@ID" />
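
    For comparison, here is roughly what the key replaces: the same test written as a plain comparison against all factory IDs. This is a sketch only, assuming as above that factories are the data elements with a Country child. It should select the same elements for removal, but a processor may rescan the whole document for each product, which is the cost the key avoids.

    ```xml
    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- key-less variant: remove products whose REF equals no factory's ID -->
    <xsl:template match="data[factory][not(factory/@REF = //data[Country]/@ID)]"/>

    </xsl:stylesheet>
    ```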