java io object-design

Justification for the design of the public interface of ByteArrayOutputStream?


Many Java standard and third-party libraries expose methods in their public API for writing to or reading from a stream. One example is javax.imageio.ImageIO.write(), which takes an OutputStream and writes the content of a processed image to it. Another is the iText PDF processing library, which takes an OutputStream and writes the resulting PDF to it. A third is the Amazon S3 Java API, which takes an InputStream, reads it, and creates a file in their S3 storage.

The problem arises when you want to combine two of these. For example, I have an image as a BufferedImage, and I have to use ImageIO.write to push the result into an OutputStream. But there is no direct way to push it to Amazon S3, because S3 requires an InputStream.
There are a few ways to work around this, but the subject of this question is the use of ByteArrayOutputStream.

The idea behind ByteArrayOutputStream is to use an intermediate byte array wrapped in input/output streams, so that whoever wants to write to an output stream writes to the array, and whoever wants to read reads from the array.
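To make the hand-off concrete, here is a minimal sketch that uses only java.io. The class name CopyingBridge and the produce/consume methods are hypothetical stand-ins for a writer such as ImageIO.write and a reader such as an S3 upload; they are not part of any of the libraries mentioned.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class CopyingBridge {
    // Hypothetical producer: stands in for an API that only writes to an OutputStream.
    static void produce(OutputStream out) throws IOException {
        out.write("hello".getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical consumer: stands in for an API that only reads from an InputStream.
    static String consume(InputStream in) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int n;
        while ((n = in.read(chunk)) != -1) {
            sink.write(chunk, 0, n);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    // Bridges the two through an in-memory buffer.
    static String bridge() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        produce(buffer);
        // toByteArray() allocates a new array and copies the written bytes into it.
        byte[] copy = buffer.toByteArray();
        return consume(new ByteArrayInputStream(copy));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(bridge());
    }
}
```

The only public route from the written bytes to an InputStream here goes through toByteArray().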

What I wonder is why ByteArrayOutputStream does not allow any access to the byte array without copying it, for example by providing an InputStream that has direct access to it. The only way to access it is to call toByteArray(), which makes a copy of the internal array (in the standard implementation). This means that, in my image example, I will have three copies of the image in memory: the BufferedImage itself, the internal array of the ByteArrayOutputStream, and the copy returned by toByteArray().

How is this design justified?

Moreover, there is a second flavor of ByteArrayOutputStream, provided by Apache's commons-io library (which has a different internal implementation). But both have exactly the same public interface, which provides no way to access the byte array without copying it.


Solution

  • Luckily, the internal array (the buf field, together with the count of valid bytes) is protected, so you can subclass ByteArrayOutputStream and wrap a ByteArrayInputStream around the array without any copying.
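
A minimal sketch of that subclassing approach follows. The class name ExposedByteArrayOutputStream and the method toInputStream() are made up for illustration; buf and count are the protected fields of java.io.ByteArrayOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical subclass: exposes the internal buffer as an input stream without copying it.
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    // Wraps the internal array directly; only the first `count` bytes are valid data.
    synchronized ByteArrayInputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }
}

class ExposedDemo {
    // Writes three bytes, then reads them back through the shared array.
    static int[] readBack() {
        ExposedByteArrayOutputStream out = new ExposedByteArrayOutputStream();
        out.write(1);
        out.write(2);
        out.write(3);
        ByteArrayInputStream in = out.toInputStream();
        // The fourth read hits end of stream and returns -1.
        return new int[] { in.read(), in.read(), in.read(), in.read() };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(readBack()));
    }
}
```

One caveat with this sketch: the input stream shares the array with the output stream, so it is safest to finish writing before wrapping. Later writes may replace the internal array when it grows, and the input stream only covers the bytes that were present when it was created.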