I'm trying to secure a piece of code against XXE attacks. The code uses the FOP library, and mimeFormat is application/pdf.
The original code works well:
protected static void transformTo(Result result, Source src, String mimeFormat, String sFileNameXsl)
        throws FOPException {
    try {
        TransformerFactory factory = TransformerFactory.newInstance();
        File myXslFile = new File(sFileNameXsl);
        StreamSource xsltSource = new StreamSource(myXslFile);
        Transformer transformer = factory.newTransformer(xsltSource);
        transformer.setParameter("fop-output-format", mimeFormat);
        transformer.transform(src, result);
    } catch (Exception e) {
        throw new FOPException(e);
    }
}
The application uses the Apache implementation from xalan-2.7.2, org.apache.xalan.processor.TransformerFactoryImpl.
When I tried to disable external DTDs and stylesheets, I was forced to switch implementations because of the error "not supported property accessExternalDTD".
So I changed the code to use the JDK 8 implementation, com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl:
TransformerFactory factory = TransformerFactory.newInstance("com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl", ClassLoader.getSystemClassLoader());
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
At this point, the properties were supported, but a new message appeared:
FATAL ERROR: Cannot convert data type 'int' in 'node-set'.
The message is unexpected since I didn't change any structure or processing, just tried to secure the transformer with some properties.
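To check which implementation is actually in play and whether it accepts the two access properties, a small probe like the following can help. This is only a sketch: the class name SecureFactoryProbe and the helper trySetAttribute are made up for illustration, and the result depends on which TransformerFactory is found on the classpath.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

class SecureFactoryProbe {

    /** Hypothetical helper: returns false if the implementation rejects the attribute. */
    static boolean trySetAttribute(TransformerFactory factory, String name, Object value) {
        try {
            factory.setAttribute(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            // Implementations that do not recognise the attribute throw this
            return false;
        }
    }

    public static void main(String[] args) {
        // Picks up whatever implementation the JAXP lookup finds first
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println("Implementation: " + factory.getClass().getName());
        System.out.println("accessExternalDTD accepted: "
                + trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_DTD, ""));
        System.out.println("accessExternalStylesheet accepted: "
                + trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_STYLESHEET, ""));
    }
}
```

With xalan-2.7.2 on the classpath I would expect both lines to report false, which matches the "not supported property" error above.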
The XSLT is a modified version of this one: https://www.antennahouse.com/hubfs/uploads/XSL%20Sample/xhtml2fo.xsl?hsLang=en
The data type error is caused by misuse of the | operator, as in:
<xsl:variable name="numcolumns" select="count(./html:tr/*)|count(./html:TR/*)"/>
<xsl:if test="ancestor::html:table[1]/@rules = 'cols'|ancestor::html:TABLE[1]/@rules = 'cols'">
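The type error can be reproduced outside the full stylesheet with a minimal sketch like the one below. It wraps a single XPath expression in a throwaway stylesheet and asks the factory to compile it. The class UnionTypeCheck and its helpers are hypothetical names, the html: prefix is dropped to keep the stylesheet short, and the outcome for the first expression depends on the implementation (it was rejected on my JDK 8 setup, but accepted by xalan-2.7.2).

```java
import java.io.StringReader;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

class UnionTypeCheck {

    /** Wraps an XPath expression in a one-variable stylesheet. */
    static String stylesheet(String select) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:template match='/'>"
             + "<xsl:variable name='n' select=\"" + select + "\"/>"
             + "<xsl:value-of select='$n'/>"
             + "</xsl:template></xsl:stylesheet>";
    }

    /** Returns true if the factory accepts a stylesheet using the expression. */
    static boolean compiles(TransformerFactory factory, String select) {
        try {
            factory.newTransformer(new StreamSource(new StringReader(stylesheet(select))));
            return true;
        } catch (TransformerConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println("Implementation: " + factory.getClass().getName());
        // Union of two numbers: the form that triggers the error in the question
        System.out.println(compiles(factory, "count(tr/*)|count(TR/*)"));
        // Plain count of a node-set: accepted
        System.out.println(compiles(factory, "count(tr/*)"));
    }
}
```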
I managed to make things work by giving up the uppercase matchers:
<xsl:variable name="numcolumns" select="count(./html:tr/*)"/>
<xsl:if test="ancestor::html:table[1]/@rules = 'cols'">
So ultimately the question is: how do I match both uppercase and lowercase element names in these expressions?
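The issue is that the Apache processor tolerates a loose syntax for union expressions that the JDK processor rejects. In XPath 1.0 the | operator is only defined for node-sets, but here its operands are numbers (the results of count()) or booleans (the results of the = comparisons).
The following union expressions: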
<xsl:variable name="numcolumns" select="count(./html:tr/*)|count(./html:TR/*)"/>
<xsl:if test="ancestor::html:table[1]/@rules = 'cols'|ancestor::html:TABLE[1]/@rules = 'cols'">
had to be rewritten as:
<xsl:variable name="numcolumns" select="count(./html:tr/*|./html:TR/*)"/>
<xsl:if test="(ancestor::html:table|ancestor::html:TABLE)[1]/@rules = 'cols'">
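One caveat with the second rewrite: a predicate applied to a parenthesised expression counts in document order, so (…)[1] picks the outermost matching ancestor, whereas ancestor::html:table[1] picks the nearest one. The two only differ when tables are nested. If the nearest table is what is wanted, ancestor::*[self::html:table or self::html:TABLE][1]/@rules = 'cols' should preserve the original meaning. The sketch below shows the difference with the JDK's javax.xml.xpath API; the class name, the sample XML, and the omission of the html: namespace are simplifications for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NearestTableDemo {

    // Hypothetical input: a lowercase table nested inside an uppercase one
    static final String XML =
        "<TABLE rules='cols'><TR><TD>"
      + "<table rules='rows'><tr><td id='cell'/></tr></table>"
      + "</TD></TR></TABLE>";

    /** Evaluates expr as a string, using the innermost td as the context node. */
    static String evalFromCell(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node cell = (Node) xpath.evaluate("//td[@id='cell']", doc, XPathConstants.NODE);
            return xpath.evaluate(expr, cell);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Document order: the outer TABLE comes first
        System.out.println(evalFromCell("(ancestor::table|ancestor::TABLE)[1]/@rules")); // prints "cols"
        // Reverse axis order: the nearest table comes first
        System.out.println(evalFromCell("ancestor::*[self::table or self::TABLE][1]/@rules")); // prints "rows"
    }
}
```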