java xml-parsing woodstox

Configure max attribute size with woodstox


For some bizarre reason, woodstox-core-asl seems to limit the maximum size of attribute values to 512KB, so XML parsing fails with this error (524288 below is the 512KB limit):

com.ctc.wstx.exc.WstxParsingException: Maximum attribute size (524288) exceeded
 at [row,col {unknown-source}]: [1,898330]
    at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:606)
    at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:479)
    at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:464)
    at com.ctc.wstx.sr.BasicStreamReader.parseAttrValue(BasicStreamReader.java:1959)
    at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3063)

Is there a way to configure this max attribute size or even completely disable it? (Why the attribute value has to be that large is a different argument, though -- and I have to see what I can do about it.)

I tried to look at the source code, but I only have limited access to it -- can't browse GitHub at work. There aren't any pointers in their docs either.

The version of the library I'm using is 4.2.0. Upgrading is possible, but the newer versions seem to have the same constraint.


Solution

  • Yes, there is a way to change that. The error message really ought to mention it, but since it does not... let's see. The constants are defined in WstxInputProperties (for the Woodstox-specific ones, not the standard StAX ones), and the property you need is P_MAX_ATTRIBUTE_SIZE. To effectively disable the check, use a value of Integer.MAX_VALUE. The value is changed by calling the XMLInputFactory.setProperty method.

    These limits were added to guard against various Denial-of-Service (DoS) attacks. There are a few such limits; you can see the ones available in WstxInputProperties. The default settings are quite conservative, but it may still make sense to check whether you really need to process 512kB attribute values... :)
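
    A minimal sketch of the setProperty call described above. The class and method names (MaxAttributeSizeExample, newFactory, firstAttributeValue) are made up for illustration. The property name is written as a string literal that, to my knowledge, is the value of WstxInputProperties.P_MAX_ATTRIBUTE_SIZE; if Woodstox is on your compile classpath, reference the constant itself instead. The isPropertySupported guard is an addition so the sketch also runs when a different StAX implementation gets picked up.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class MaxAttributeSizeExample {

    // Assumption: this is the string behind WstxInputProperties.P_MAX_ATTRIBUTE_SIZE.
    // Prefer the constant when Woodstox is available at compile time.
    static final String MAX_ATTRIBUTE_SIZE = "com.ctc.wstx.maxAttributeSize";

    // Builds an input factory and raises the attribute size limit if the
    // underlying StAX implementation recognizes the property (Woodstox should).
    static XMLInputFactory newFactory(int maxAttributeSize) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        if (factory.isPropertySupported(MAX_ATTRIBUTE_SIZE)) {
            factory.setProperty(MAX_ATTRIBUTE_SIZE, maxAttributeSize);
        }
        return factory;
    }

    // Returns the value of the first attribute found in the document,
    // or null if no element carries an attribute.
    static String firstAttributeValue(XMLInputFactory factory, String xml) {
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getAttributeCount() > 0) {
                        return reader.getAttributeValue(0);
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Integer.MAX_VALUE effectively disables the check.
        XMLInputFactory factory = newFactory(Integer.MAX_VALUE);
        String value = firstAttributeValue(factory, "<doc payload=\"abc\"/>");
        System.out.println("attribute length: " + value.length());
    }
}
```

    Set the property on the factory before creating any readers from it. If you do need large attributes, a bounded value that fits your data (say, a few megabytes) keeps some of the DoS protection instead of switching it off entirely.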