java groovy inputstream aws-java-sdk

writeTo PipedOutputStream just hangs


My goal is to:

  1. read a file from S3,
  2. change its metadata,
  3. push it back to S3.

The AWS Java SDK does not accept an OutputStream for uploads, so I have to convert the OutputStream produced in step 2 into an InputStream. For this I decided to use PipedInputStream.

However, my code just hangs at the writeTo(out) step. The code runs in a Grails application. While it hangs, CPU usage stays low:

import org.apache.commons.imaging.formats.jpeg.xmp.JpegXmpRewriter;

AmazonS3Client client = nfile.getS3Client() //get S3 client
S3Object object1 = client.getObject(
                  new GetObjectRequest("test-bucket", "myfile.jpg")) //get the object. 

InputStream isNew1 = object1.getObjectContent(); //create input stream
ByteArrayOutputStream os = new ByteArrayOutputStream();
PipedInputStream inpipe = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(inpipe);

try {
   String xmpXml = "<x:xmpmeta>" +
    "\n<Lifeshare>" +
    "\n\t<Date>"+"some date"+"</Date>" +
    "\n</Lifeshare>" +
    "\n</x:xmpmeta>";
   JpegXmpRewriter rewriter = new JpegXmpRewriter();
   rewriter.updateXmpXml(isNew1,os, xmpXml); //This is step2

   try {
         new Thread(new Runnable() {
             public void run () {
                 try {
                     // write the original OutputStream to the PipedOutputStream
                     println "starting writeto"
                     os.writeTo(out);
                     println "ending writeto"
                 } catch (IOException e) {
                     // logging and exception handling should go here
                 }
             }
         }).start();

         ObjectMetadata metadata = new ObjectMetadata();
         metadata.setContentLength(1024); //just testing
         client.putObject(new PutObjectRequest("test-bucket", "myfile_copy.jpg", inpipe, metadata));
         os.writeTo(out);

         os.close();
         out.close();
   } catch (IOException e) {
         // logging and exception handling should go here
   }

}
finally {
   isNew1.close()
   os.close()
   out.close()
}

The above code prints "starting writeto" and then hangs; it never prints "ending writeto".

Update: After moving writeTo into a separate thread, the file is now written to S3, but only 1024 bytes of it are written, so the file is incomplete. How can I write the entire contents of the output stream to S3?


Solution

  • When you call os.writeTo(out), it tries to write the entire buffer to out. Since nothing is reading from the other end (i.e. inpipe) yet, the pipe's internal buffer (1024 bytes by default) fills up and the writing thread blocks.

    You have to set up the reader before you write the data, and also make sure that the writer and the reader run in separate threads (see the Javadoc for PipedOutputStream).
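
    The pattern described above can be sketched with plain java.io, without any S3 calls. This is a minimal illustration, not the asker's actual code: the class name PipeDemo, the method copyThroughPipe and the 100,000-byte payload are made up for the example, and the read loop stands in for whatever consumes the stream (client.putObject in the question). The writer runs in its own thread and closes the pipe when done, so the reader sees end-of-stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

final class PipeDemo {

    // Streams the buffer's contents through a pipe and returns what the reader received.
    // (Hypothetical helper for illustration; the read loop stands in for the S3 upload.)
    static byte[] copyThroughPipe(ByteArrayOutputStream source)
            throws IOException, InterruptedException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        // The writer gets its own thread: writeTo blocks whenever the pipe is full,
        // and only continues once the reader has drained some bytes.
        Thread writer = new Thread(() -> {
            try (PipedOutputStream o = out) {
                source.writeTo(o);
            } catch (IOException e) {
                e.printStackTrace();
            }
            // Closing the pipe (via try-with-resources) signals end-of-stream to the reader.
        });
        writer.start();

        // The reader runs on the calling thread and consumes until end-of-stream.
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        int n;
        try (InputStream reader = in) {
            while ((n = reader.read(chunk)) != -1) {
                received.write(chunk, 0, n);
            }
        }
        writer.join();
        return received.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        os.write(new byte[100_000]); // far larger than the pipe's default buffer
        byte[] result = copyThroughPipe(os);
        System.out.println("buffered=" + os.size() + " received=" + result.length);
    }
}
```

    Regarding the update: the upload is most likely cut off at 1024 bytes because metadata.setContentLength(1024) declares the stream to be exactly that long. Passing os.size() instead should let the whole file through. Also, since the rewritten JPEG is already fully buffered in os, wrapping os.toByteArray() in a ByteArrayInputStream would probably be simpler than a pipe and needs no extra thread.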