stack-overflow denial-of-service cve

Why is a StackOverflowError worth a CVE?


Vulnerability reports have recently been accumulating against (Java) libraries on the grounds that the library offers a recursive function that can exhaust the available stack and cause a StackOverflowError on "malicious" input. The newest example is CVE-2023-1370, which complains that a JSON parser may throw a StackOverflowError when asked to parse a nested object structure deeper than the available stack allows.

The vulnerability report claims that this behavior enables a denial-of-service attack against software using the library. But isn't a StackOverflowError a regular exception that aborts the execution of the running request? When running in, say, a web container, the request that caused the StackOverflowError is aborted, the failure is signaled with an error code (normally an HTTP 500 for such an "internal" error), and the thread that caused the error goes on to serve the next request.

The "fix" for CVE-2023-1370 instead stops execution with a ParseException once a fixed depth of 400 is reached. Depending on the stack size and the size of the parser's stack frames, this stops execution somewhat earlier, but the effect is more or less the same: the request with the "malicious" input is terminated, the error is signaled (here with a perhaps somewhat "better" error message), and operation continues.
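To make the kind of fix described above concrete, here is a minimal sketch of a depth-limited recursive parser. It is not the library's actual code: the class `DepthLimitedParser`, its methods, and the bracket-only input grammar are made up for illustration. Only the limit of 400 and the use of ParseException come from the description above.

```java
import java.text.ParseException;

// Hypothetical sketch, not the real library: parses nested '[' ... ']' pairs
// recursively and gives up with a ParseException beyond MAX_DEPTH.
class DepthLimitedParser {
    static final int MAX_DEPTH = 400; // the fixed depth mentioned for the fix

    private final String input;
    private int pos = 0;

    private DepthLimitedParser(String input) {
        this.input = input;
    }

    // Returns the deepest nesting level found in the input.
    static int maxNesting(String input) throws ParseException {
        DepthLimitedParser p = new DepthLimitedParser(input);
        int deepest = p.parseArray(1);
        if (p.pos != input.length()) {
            throw new ParseException("Trailing characters", p.pos);
        }
        return deepest;
    }

    // Helper for experiments: builds n opening brackets followed by n closing ones.
    static String nested(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append('[');
        for (int i = 0; i < n; i++) sb.append(']');
        return sb.toString();
    }

    private int parseArray(int depth) throws ParseException {
        // The guard: fail with a checked exception before the stack can run out.
        if (depth > MAX_DEPTH) {
            throw new ParseException("Nesting deeper than " + MAX_DEPTH, pos);
        }
        expect('[');
        int deepest = depth;
        while (pos < input.length() && input.charAt(pos) == '[') {
            deepest = Math.max(deepest, parseArray(depth + 1));
        }
        expect(']');
        return deepest;
    }

    private void expect(char c) throws ParseException {
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new ParseException("Expected '" + c + "'", pos);
        }
        pos++;
    }
}
```

With this guard, `DepthLimitedParser.maxNesting(DepthLimitedParser.nested(401))` fails with a ParseException, while an input nested 400 levels deep is still accepted.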

So why is it worth filing vulnerability reports against libraries for (almost all of) their recursive functions?


Solution

  • On a common-sense level: any code using this library can reasonably be expected to handle a ParseException, but not a StackOverflowError, so this problem will make the code crash harder and give it less chance of recovering.

    On a formalistic level:

    Why is it a vulnerability: https://www.cve.org/ResourcesSupport/AllResources/CNARules#section_7_assignment_rules

    Why is it high: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator

    You can experiment with the metrics; I think a network attack vector, low attack complexity, and a high availability impact will do it.
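To illustrate the common-sense point, here is a small sketch of how a typical handler treats the two failures differently. Everything in it is hypothetical (the `HandlerDemo` class, the `Parser` interface, and the status strings). The one fact it relies on is that StackOverflowError is a subclass of Error, not of Exception, so a `catch (Exception e)` clause does not catch it.

```java
import java.text.ParseException;

// Hypothetical request handler, for illustration only.
class HandlerDemo {
    interface Parser {
        void parse(String body) throws ParseException;
    }

    // Mimics ordinary application code: it anticipates exceptions, not Errors.
    static String handle(Parser parser, String body) {
        try {
            parser.parse(body);
            return "200 OK";
        } catch (ParseException e) {
            return "400 Bad Request";            // the anticipated failure
        } catch (Exception e) {
            return "500 Internal Server Error";  // other exceptions
        }
        // A StackOverflowError matches neither clause and propagates to the caller.
    }

    // Unbounded recursion standing in for a parser without a depth limit.
    static int recurse(int depth) {
        return 1 + recurse(depth + 1);
    }

    public static void main(String[] args) {
        System.out.println(handle(b -> { throw new ParseException("too deep", 0); }, "[[[["));
        try {
            handle(b -> recurse(0), "[[[[");
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError escaped the handler");
        }
    }
}
```

Whether the escaped error is fatal depends on what sits further up the call stack. A container that catches Throwable may well recover, but code that only prepared for exceptions gets no chance to clean up or respond in its own way.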