java performance bufferedinputstream bufferedoutputstream

Speed of file reading / writing with BufferedInputStream / BufferedOutputStream


I have two questions.

  1. What does the program actually do if I write bis.read() instead of bis.read(bys)? (It still runs, though much more slowly.)

  2. Why is bos.write(bys) much faster than bos.write(bys, 0, len)? (I expected both to run at the same speed.)

Thanks!

import java.io.*;

public class CopyFileBfdBytes {

    public static void main(String[] args) throws IOException {

        FileInputStream fis = new FileInputStream("d:/Test1/M1.MP3");
        BufferedInputStream bis = new BufferedInputStream(fis);

        FileOutputStream fos = new FileOutputStream("d:/Test2/M2.mp3");
        BufferedOutputStream bos = new BufferedOutputStream(fos);

        byte[] bys = new byte[8192];
        int len;
        while ((len = bis.read(bys)) != -1){
//        while ((len = bis.read()) != -1){  // 1. Why does it still work when bys in bis.read() is missing?
            bos.write(bys);
//            bos.write(bys, 0, len);     // 2. Why is this slower than bos.write(bys)?
            bos.flush();
        }
        fis.close();
        bis.close();
        fos.close();
        bos.close();
    }
}

Solution

  • First of all, it seems like you simply want to copy a file as-is. There are much simpler (and possibly even more performant) approaches to do this.

    Other approaches to copy data

    Copying files

    If all you need is to copy actual files, as in your example, you can simply use:

    package example;
    
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class SO66024231 {
    
        public static void main(String[] args) throws IOException {
            Files.copy(Paths.get("d:/Test1/M1.MP3"), Paths.get("d:/Test2/M2.mp3"));
        }
    }
    

    This is most likely the most descriptive approach (other developers can see at a glance what you want to do), and the underlying system can carry it out very efficiently.
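
One caveat worth knowing: without further options, Files.copy throws a FileAlreadyExistsException if the target already exists. The following is a minimal sketch of how that can be handled with StandardCopyOption.REPLACE_EXISTING. The class name, the helper copyReplacing and the temporary files are assumptions made for illustration; they are not part of your program.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class CopyWithReplace {

    // Hypothetical helper: copies source to target and overwrites target if it already exists.
    static void copyReplacing(Path source, Path target) throws IOException {
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real source and destination paths.
        Path source = Files.createTempFile("copy-src", ".bin");
        Path target = Files.createTempFile("copy-dst", ".bin"); // exists already, on purpose
        Files.write(source, new byte[]{1, 2, 3});

        copyReplacing(source, target);
        System.out.println(Files.size(target)); // prints: 3
    }
}
```

Whether silently overwriting the target is what you want depends on your use case; leaving the option out gives you the exception instead.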

    Copying data from any source to any destination (InputStream to OutputStream)

    If you need to transfer data from any InputStream to any OutputStream, you can use the method InputStream#transferTo(OutputStream) (available since Java 9):

    package example;
    
    import java.io.*;
    
    public class SO66024231 {
    
        public static void main(String[] args) throws IOException {
            try (InputStream fis = new FileInputStream("d:/Test1/M1.MP3")) {
                try (OutputStream fos = new FileOutputStream("d:/Test2/M2.mp3")) {
                    fis.transferTo(fos);
                }
            }
        }
    }
    
    

    Answering your question in depth

    Note: I will talk about InputStreams and OutputStreams in general. You used BufferedInputStream and BufferedOutputStream. Those are specific implementations that internally buffer data. This internal buffering has nothing to do with the buffering I will talk about next!
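
To make the internal buffering a bit more tangible, here is a small sketch. It wraps a source stream in a counting wrapper and then reads it byte by byte through a BufferedInputStream. CountingInputStream and drainByteWise are hypothetical names invented for this illustration. The point is that the caller's many read() calls are served from the internal buffer, so the wrapped stream is asked for data far less often, even though each byte still costs the caller one method call.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class InternalBufferDemo {

    // Hypothetical wrapper that counts how often the wrapped stream is asked for data.
    static final class CountingInputStream extends FilterInputStream {
        int readCalls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            readCalls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            readCalls++;
            return super.read(b, off, len);
        }
    }

    // Reads the stream to its end one byte at a time and returns the number of bytes seen.
    static int drainByteWise(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(new byte[1000]));
        int bytesSeen = drainByteWise(new BufferedInputStream(source));

        System.out.println(bytesSeen); // prints: 1000
        // Far fewer than 1000 calls: the BufferedInputStream fetched the data in bulk.
        System.out.println(source.readCalls);
    }
}
```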

    InputStream

    There is a fundamental difference between InputStream#read() and InputStream#read(byte[]).

    InputStream#read() reads one byte from the InputStream and returns it. The returned value is an int in the range of 0-255 or -1 if the Stream is exhausted (there is no more data).

    package example;
    
    import java.io.*;
    
    public class SO66024231 {
    
        public static void main(String[] args) throws IOException {
            final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
            printAllBytes(new ByteArrayInputStream(myBytes));
        }
    
        public static void printAllBytes(InputStream in) throws IOException {
            int currByte;
            while ((currByte = in.read()) != -1) {
                System.out.println((byte) currByte);// note the cast to byte!
            }
            
            // prints: -1, 0, 3, 4, 5, 6, 7, 8, 127
        }
    }
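
The reason read() returns an int rather than a byte can be seen in a short sketch like the following (the class and method names are made up for illustration). A byte with the value -1 (0xFF) comes back as 255, so it cannot be confused with the end-of-stream marker -1; only the cast to byte turns it back into -1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadReturnsInt {

    // Reads a single byte and returns the raw int that read() produced.
    static int readRaw(InputStream in) throws IOException {
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[]{-1});
        int first = readRaw(in);  // the byte 0xFF arrives as 255, not as -1
        int second = readRaw(in); // the stream is exhausted now

        System.out.println(first);        // prints: 255
        System.out.println((byte) first); // prints: -1
        System.out.println(second);       // prints: -1
    }
}
```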
    

    InputStream#read(byte[]), however, is completely different. It takes a byte[] as a parameter, which is used as a buffer. It then (internally) tries to fill the given buffer with as many bytes as it can obtain at the moment and returns the number of bytes it actually filled, or -1 if the stream is exhausted.

    Example:

    package example;
    
    import java.io.*;
    
    public class SO66024231 {
    
        public static void main(String[] args) throws IOException {
            final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
            printAllBytes(new ByteArrayInputStream(myBytes));
        }
    
        public static void printAllBytes(InputStream in) throws IOException {
            final byte[] buffer = new byte[2];// do not use this small buffer size. This is just for the example
            int bytesRead;
    
            while ((bytesRead = in.read(buffer)) != -1) {
                // loop from 0 to bytesRead, !NOT! to buffer.length!!!
                for (int i = 0; i < bytesRead; i++) {
                    System.out.println(buffer[i]);
                }
            }
    
            // prints: -1, 0, 3, 4, 5, 6, 7, 8, 127
        }
    }
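
To see the return values themselves, a sketch like this one records what each read(byte[]) call returns for the same 9 bytes and a 2-byte buffer. The class and method names are hypothetical. With a ByteArrayInputStream the sequence is deterministic; other streams (files, sockets) may return fewer bytes than the buffer holds at any point, which is one more reason to always use the returned count.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ReadCounts {

    // Records the value returned by each read(byte[]) call, including the final -1.
    // Assumes bufferSize > 0: with a zero-length buffer read(byte[]) returns 0 and the loop would never end.
    static List<Integer> readCounts(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        List<Integer> counts = new ArrayList<>();
        int bytesRead;
        do {
            bytesRead = in.read(buffer);
            counts.add(bytesRead);
        } while (bytesRead != -1);
        return counts;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[9]);
        System.out.println(readCounts(in, 2)); // prints: [2, 2, 2, 2, 1, -1]
    }
}
```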
    
    

    Bad example: Now a bad example. The following code contains programming errors, so don't use it!

    We now loop from 0 to buffer.length, but our input data contains exactly 9 bytes. That means that in the last iteration only the first byte of our buffer is filled. The second byte in our buffer is not touched and still holds the value from the previous iteration.

    package example;
    
    import java.io.*;
    
    public class SO66024231 {
    
        /**
         * ERRONEOUS EXAMPLE!!! DO NOT USE
         */
        public static void main(String[] args) throws IOException {
            final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
            printAllBytes(new ByteArrayInputStream(myBytes));
        }
    
        public static void printAllBytes(InputStream in) throws IOException {
            final byte[] buffer = new byte[2];// do not use this small buffer size. This is just for the example
            int bytesRead;
    
            while ((bytesRead = in.read(buffer)) != -1) {
                for (int i = 0; i < buffer.length; i++) {
                    System.out.println(buffer[i]);
                }
            }
    
            // prints: -1, 0, 3, 4, 5, 6, 7, 8, 127, 8 <-- see; the 8 is printed because we ignored the bytesRead value in our for loop; the 8 is still in our buffer from the previous iteration
        }
    }
    
    

    OutputStream

    Now that I have described the differences in reading, I'll describe the differences in writing.

    First, the correct example (using OutputStream.write(byte[], int, int)):

    package example;
    
    import java.io.*;
    import java.util.Arrays;
    
    public class SO66024231 {
    
        public static void main(String[] args) throws IOException {
            final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
            final byte[] copied = copyAllBytes(new ByteArrayInputStream(myBytes));
    
            System.out.println(Arrays.toString(copied));// prints: [-1, 0, 3, 4, 5, 6, 7, 8, 127]
        }
    
        public static byte[] copyAllBytes(InputStream in) throws IOException {
            final ByteArrayOutputStream bos = new ByteArrayOutputStream();
            final byte[] buffer = new byte[2];
            int bytesRead;
    
            while ((bytesRead = in.read(buffer)) != -1) {
                bos.write(buffer, 0, bytesRead);
            }
    
            return bos.toByteArray();
        }
    }
    
    

    And the bad example:

    package example;
    
    import java.io.*;
    import java.util.Arrays;
    
    public class SO66024231 {
    
        /*
        ERRONEOUS EXAMPLE!!!!
         */
        public static void main(String[] args) throws IOException {
            final byte[] myBytes = new byte[]{-1, 0, 3, 4, 5, 6, 7, 8, 127};
            final byte[] copied = copyAllBytes(new ByteArrayInputStream(myBytes));
    
            System.out.println(Arrays.toString(copied));// prints: [-1, 0, 3, 4, 5, 6, 7, 8, 127, 8] <-- see; the 8 is here again
        }
    
        public static byte[] copyAllBytes(InputStream in) throws IOException {
            final ByteArrayOutputStream bos = new ByteArrayOutputStream();
            final byte[] buffer = new byte[2];
            int bytesRead;
    
            while ((bytesRead = in.read(buffer)) != -1) {
                bos.write(buffer);
            }
    
            return bos.toByteArray();
        }
    }
    

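    This happens for the same reason as in the InputStream examples: if we ignore bytesRead, we write a value to our OutputStream that we don't want, namely the byte 8 left over from the previous iteration. Internally, OutputStream#write(byte[]) is (in most implementations) just a shortcut for OutputStream#write(buffer, 0, buffer.length). That means it always writes the whole buffer to the OutputStream, no matter how many bytes the last read actually filled. Applied to your code: bos.write(bys) always writes all 8192 bytes, so the copy can end with stale bytes that are not in the original file; bos.write(bys, 0, len) is the correct call.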
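
Putting the pieces together, a copy loop along the lines of your program could look like the sketch below. It is only one possible shape, not the definitive one: the class name, the copy helper and the temporary files are assumptions for illustration. It uses write(buffer, 0, bytesRead), relies on try-with-resources to close (and thereby flush) the streams, and does not call flush() in every iteration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class CopyLoop {

    // Hypothetical helper: copies everything from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead); // only the bytes that were actually read
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real source and destination.
        Path source = Files.createTempFile("loop-src", ".bin");
        Path target = Files.createTempFile("loop-dst", ".bin");
        byte[] data = new byte[20000]; // deliberately not a multiple of 8192
        Arrays.fill(data, (byte) 7);
        Files.write(source, data);

        try (InputStream in = new BufferedInputStream(new FileInputStream(source.toFile()));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target.toFile()))) {
            System.out.println(copy(in, out)); // prints: 20000
        } // closing the BufferedOutputStream flushes it

        System.out.println(Files.size(target)); // prints: 20000
    }
}
```

With out.write(buffer) in place of out.write(buffer, 0, bytesRead), the target file in this sketch would be longer than the source, because every iteration would write all 8192 bytes.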