java file serialization file-io text-files

Proper Java classes for reading and writing files?


While reading about file I/O in Java, I learned that there is more than one alternative for input and output operations.

These are:

Which of these is the best alternative for handling text files? Which is best for serialization? What does Java NIO have to say about it?


Solution

  • Two kinds of data

    Generally speaking there are two "worlds":

    When it's a file (or a socket, or a BLOB in a DB, or ...), then it's always binary data first.

    Some of that binary data can be treated as text data (which involves something called an "encoding" or "character encoding").

    Binary Data

    Whenever you want to handle the binary data then you need to use the InputStream/OutputStream classes (generally, everything that contains Stream in its name).

    That's why there's a FileInputStream and a FileOutputStream: those read from and write to files and they handle binary data.
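
    As a minimal sketch of the binary side, the following copies a file byte by byte with FileInputStream and FileOutputStream. The class name BinaryCopyDemo and the method copy are made up for this illustration, and the temporary files in main only exist to make the example self-contained.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: copies raw bytes, no character decoding involved.
class BinaryCopyDemo {

    // Copies all bytes from sourcePath to targetPath and returns the byte count.
    static long copy(String sourcePath, String targetPath) throws IOException {
        long total = 0;
        try (InputStream in = new FileInputStream(sourcePath);
             OutputStream out = new FileOutputStream(targetPath)) {
            byte[] buffer = new byte[8192];
            int read;
            // read() returns -1 once the end of the file is reached
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files, used only so the example runs anywhere.
        Path source = Files.createTempFile("source", ".bin");
        Path target = Files.createTempFile("target", ".bin");
        Files.write(source, new byte[] {0, 1, 2, (byte) 0xFF});
        System.out.println(copy(source.toString(), target.toString()) + " bytes copied");
    }
}
```

    Because nothing here interprets the bytes as characters, the same code works for images, archives, or text files alike.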

    Text Data

    Whenever you want to handle text data, then you need to use the Reader/Writer classes.

    Whenever you need to convert binary data to text (or vice versa), then you need some kind of encoding (common ones are UTF-8, UTF-16, ISO-8859-1 (and related ones) and the good old US-ASCII). "Luckily" the Java platform also has something called the "default platform encoding" which it will use whenever it needs one but the code doesn't specify one.
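
    To make the role of the encoding concrete, here is a small sketch that converts the same text to bytes with two different encodings and then decodes it with the wrong one. The class and method names (EncodingDemo, encode, decode) are invented for this example; the charsets come from java.nio.charset.StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: shows that the byte form of text depends on the encoding.
class EncodingDemo {

    // Text -> binary data, using an explicitly chosen encoding.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Binary data -> text, using an explicitly chosen encoding.
    static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character e with acute accent

        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);

        System.out.println(Arrays.toString(utf8));   // prints [-61, -87]
        System.out.println(Arrays.toString(latin1)); // prints [-23]

        // Decoding with the wrong encoding silently produces different text.
        String garbled = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(text));    // prints false
    }
}
```

    Passing the charset explicitly, as done here, is what keeps the code independent of the platform default.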

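    The platform default encoding is a double-edged sword, however: it is convenient when the data was produced on the same system that reads it, but it makes the same code behave differently on systems with a different default.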

    For reading, we should also mention the BufferedReader which can be wrapped around any other Reader and adds the ability to handle whole lines at once.
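
    A short sketch of line-by-line reading: BufferedReader is wrapped around an arbitrary Reader (a StringReader here, so the example needs no file). LineReadingDemo and readLines are names chosen only for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reads whole lines from any Reader.
class LineReadingDemo {

    // Wraps the given Reader in a BufferedReader and collects all lines.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() strips the line terminator and returns null at end of input
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Reader source = new StringReader("first\nsecond\r\nthird");
        System.out.println(readLines(source)); // prints [first, second, third]
    }
}
```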

    Scanner is a special class that's meant to parse text input into tokens. It's most useful for structured text but often used on System.in to provide a very simple way to read data from stdin (i.e. from what the user inputs on the keyboard).
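
    As a rough illustration of token-based parsing, the sketch below sums the integer tokens in some text and skips everything else. ScannerDemo and sumIntegers are hypothetical names; the Scanner is built from a String here, but one built with new Scanner(System.in) could be passed in the same way.

```java
import java.util.Scanner;

// Hypothetical demo class: parses whitespace-separated tokens with Scanner.
class ScannerDemo {

    // Adds up every token that parses as an int; other tokens are skipped.
    static int sumIntegers(Scanner scanner) {
        int sum = 0;
        while (scanner.hasNext()) {
            if (scanner.hasNextInt()) {
                sum += scanner.nextInt();
            } else {
                scanner.next(); // consume and ignore a non-integer token
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        try (Scanner scanner = new Scanner("10 20 abc 30")) {
            System.out.println(sumIntegers(scanner)); // prints 60
        }
    }
}
```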

    Bridging the gap

    Now, confusingly enough, there are classes that bridge those two worlds, and they generally have both parts in their names: InputStreamReader and OutputStreamWriter.
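
    The bridge can be sketched with the JDK classes InputStreamReader and OutputStreamWriter, which wrap a binary stream and apply an encoding. BridgeDemo, writeText and readText are names made up for this example, and in-memory byte streams stand in for files or sockets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: converts between the text world and the binary world.
class BridgeDemo {

    // Writer -> OutputStream: characters are encoded into bytes.
    static byte[] writeText(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } // closing the writer flushes any buffered characters
        return bytes.toByteArray();
    }

    // InputStream -> Reader: bytes are decoded into characters.
    static String readText(byte[] data, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "Gr\u00fc\u00dfe"; // five characters, two of them non-ASCII
        byte[] encoded = writeText(original, StandardCharsets.UTF_8);
        System.out.println(encoded.length); // prints 7
        System.out.println(readText(encoded, StandardCharsets.UTF_8).equals(original)); // prints true
    }
}
```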

    And then there are "shortcut classes" that combine two other classes that are often used together, such as FileReader and FileWriter.

    Note that FileReader and FileWriter used to have a major drawback compared to their more complicated "hand-built" alternative: they used the platform default encoding, which might not be what you want! In Java 11 they finally got two-argument constructors that take a Charset, so they can also be used when you need to specify an encoding.
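
    For comparison, here is a sketch that reads the same file once through the "hand-built" chain and once through the shortcut class with an explicit charset. It assumes Java 11 or later for the FileReader and FileWriter constructors that accept a Charset; FileTextDemo and its method names are invented for this illustration.

```java
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: shortcut classes versus the hand-built equivalent.
class FileTextDemo {

    // Shortcut class with an explicit encoding (constructor added in Java 11).
    static void writeText(String path, String text) throws IOException {
        try (Writer writer = new FileWriter(path, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    // Hand-built alternative: binary stream plus bridge class.
    static String readHandBuilt(String path) throws IOException {
        try (Reader reader = new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8)) {
            return readAll(reader);
        }
    }

    // Shortcut class with an explicit encoding (constructor added in Java 11).
    static String readShortcut(String path) throws IOException {
        try (Reader reader = new FileReader(path, StandardCharsets.UTF_8)) {
            return readAll(reader);
        }
    }

    // Drains a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            text.append(buffer, 0, read);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Temporary file, used only so the example runs anywhere.
        Path file = Files.createTempFile("text", ".txt");
        writeText(file.toString(), "Gr\u00fc\u00dfe\n");
        System.out.println(readHandBuilt(file.toString()).equals(readShortcut(file.toString()))); // prints true
    }
}
```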

    What about serialization?

    ObjectOutputStream and ObjectInputStream are special streams used for serialization.

    As the names of the classes imply, serialization involves only binary data (even when serializing String objects), so you'll want to use the *Stream classes exclusively. As long as you avoid any Reader/Writer classes, you should be fine.
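
    A minimal round-trip sketch with ObjectOutputStream and ObjectInputStream follows. The class SerializationDemo, its nested Point class, and the method names are assumptions made for this example; in-memory byte streams are used in place of a FileOutputStream or FileInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class: serializes an object to bytes and back.
class SerializationDemo {

    // A simple serializable value type, invented for this example.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        final String label;

        Point(int x, int y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    // Object -> binary data. Only *Stream classes are involved.
    static byte[] serialize(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    // Binary data -> object.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        byte[] data = serialize(new Point(3, 4, "corner"));
        Point copy = (Point) deserialize(data);
        System.out.println(copy.x + "," + copy.y + "," + copy.label); // prints 3,4,corner
    }
}
```

    Note that the String field is written as part of the binary stream as well, so no Reader or Writer is needed at any point.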

    Further resources