java xslt apache-fop xalan xxe

TransformerFactory with FOP: issue when disabling ACCESS_EXTERNAL_DTD


I'm trying to secure a piece of code against XXE attacks. The code uses the FOP library, and mimeFormat is application/pdf.

The original code works well:

    protected static void transformTo(Result result, Source src, String mimeFormat, String sFileNameXsl)
            throws FOPException {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();

            File myXslFile = new File(sFileNameXsl);
            StreamSource xsltSource = new StreamSource(myXslFile);
            Transformer transformer = factory.newTransformer(xsltSource);
            transformer.setParameter("fop-output-format", mimeFormat);
            transformer.transform(src, result);
        } catch (Exception e) {
            throw new FOPException(e);
        }
    }

The application uses the Apache implementation from xalan-2.7.2, org.apache.xalan.processor.TransformerFactoryImpl.

When I tried to disable access to external DTDs and stylesheets, I was forced to switch implementations because of the error "not supported property accessExternalDTD".

So I changed the code to use the JDK 8 implementation, com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl:

    TransformerFactory factory = TransformerFactory.newInstance(
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl",
            ClassLoader.getSystemClassLoader());
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
    factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");

At this point, the properties were supported, but a new message appeared:

    FATAL ERROR: Cannot convert data type 'int' in 'node-set'.

The message is unexpected since I didn't change any structure or processing, just tried to secure the transformer with some properties.
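As an aside on the factory switch above, here is a minimal sketch of how one might probe whether a given TransformerFactory accepts the two access restrictions before committing to it. The class and method names (SecureFactorySketch, restrictExternalAccess) are hypothetical, and the sketch assumes that an implementation which does not know these attributes signals it with an IllegalArgumentException, as the TransformerFactory.setAttribute contract describes.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

public class SecureFactorySketch {

    /**
     * Tries to forbid access to external DTDs and external stylesheets.
     * Returns false when the implementation does not recognise the attributes.
     * (Hypothetical helper, not part of FOP or Xalan.)
     */
    static boolean restrictExternalAccess(TransformerFactory factory) {
        try {
            // An empty string means "no protocol is allowed".
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
            return true;
        } catch (IllegalArgumentException e) {
            // setAttribute rejects attributes the implementation does not support.
            return false;
        }
    }

    public static void main(String[] args) {
        // Uses whichever implementation the JAXP lookup finds on this classpath.
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println(factory.getClass().getName()
                + " accepts the restrictions: " + restrictExternalAccess(factory));
    }
}
```

Running this with and without Xalan on the classpath should show which implementation is picked up and whether it accepts the attributes.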

The XSLT is a modified version of this one: https://www.antennahouse.com/hubfs/uploads/XSL%20Sample/xhtml2fo.xsl?hsLang=en

The data type error comes from a misuse of the | operator, which only accepts node-sets as operands, in expressions such as:

    <xsl:variable name="numcolumns" select="count(./html:tr/*)|count(./html:TR/*)"/>

    <xsl:if test="ancestor::html:table[1]/@rules = 'cols'|ancestor::html:TABLE[1]/@rules = 'cols'">

I managed to make things work by giving up the uppercase matchers:

    <xsl:variable name="numcolumns" select="count(./html:tr/*)"/>

    <xsl:if test="ancestor::html:table[1]/@rules = 'cols'">

So ultimately the question is: how do I match both uppercase and lowercase element names in these expressions?


Solution

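  • The Apache Xalan processor tolerates a loose syntax for union expressions that the JDK processor rejects. In XPath 1.0 both operands of | must be node-sets, but here the operator is applied to the numbers returned by count() and to parts of string comparisons.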

    The following union expressions:

    <xsl:variable name="numcolumns" select="count(./html:tr/*)|count(./html:TR/*)"/>
    <xsl:if test="ancestor::html:table[1]/@rules = 'cols'|ancestor::html:TABLE[1]/@rules = 'cols'">

    had to be rewritten to:

    <xsl:variable name="numcolumns" select="count(./html:tr/*|./html:TR/*)"/>
    <xsl:if test="(ancestor::html:table|ancestor::html:TABLE)[1]/@rules = 'cols'">
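To check the rewritten count() expression outside the full stylesheet, here is a minimal sketch that runs it through whichever TransformerFactory the JAXP lookup returns. The class name, the inline stylesheet, and the sample table are made up for illustration; they are not taken from the original xhtml2fo.xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class UnionCountSketch {

    // Illustrative stylesheet: the union sits inside count(), so both operands are node-sets.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:html='http://www.w3.org/1999/xhtml'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//html:table|//html:TABLE'>"
        + "<xsl:value-of select='count(./html:tr/*|./html:TR/*)'/>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Sample input mixing lowercase and uppercase row elements: three cells in total.
    static final String XML =
          "<table xmlns='http://www.w3.org/1999/xhtml'>"
        + "<tr><td/><td/></tr>"
        + "<TR><TD/></TR>"
        + "</table>";

    /** Applies the stylesheet to the sample input and returns the text output. */
    static String countCells() {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cells counted: " + countCells());
    }
}
```

One caveat on the second rewrite: a predicate applied to a parenthesized expression counts in document order, so (ancestor::html:table|ancestor::html:TABLE)[1] picks the outermost matching ancestor, whereas ancestor::html:table[1] picks the nearest one. If nested tables matter, ancestor::*[self::html:table or self::html:TABLE][1]/@rules = 'cols' may be closer to the original intent.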