I am getting this error:
io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence
The solution is to read and write a file in UTF-8.
My code is:
InputStream input = null;
OutputStream output = null;
OutputStreamWriter bufferedWriter = new OutputStreamWriter( output, "UTF8");
input = new URL(url).openStream();
output = new FileOutputStream("DirectionResponse.xml");
byte[] buffer = new byte[1024];
for (int length = 0; (length = input.read(buffer)) > 0;) {
    output.write(buffer, 0, length);
}
BufferedReader br = new BufferedReader(new FileReader("DirectionResponse.xml" ));
FileWriter fstream = new FileWriter("ppre_DirectionResponse.xml");
BufferedWriter out = new BufferedWriter(fstream);
I'm reading a URL and writing its content to a file, DirectionResponse.xml. Then I read DirectionResponse.xml and write the same content to ppre_DirectionResponse.xml for processing.
How do I change this so that reading and writing is done in UTF-8?
First, you need to call output.close() (or at least call output.flush()) before you reopen the file for input. That's probably the main cause of your problems.
Then, you shouldn't use FileReader or FileWriter for this, because they always use the platform-default encoding (which is often not UTF-8). From the docs for FileReader:
The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate.
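To illustrate why the default encoding matters, here is a minimal sketch (the class and method names are made up for this example) that decodes the same UTF-8 bytes with two different charsets. If the platform default happens to be something like ISO-8859-1, every non-ASCII character is misread in the same way:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original code.
final class DefaultCharsetDemo {

    // Decode raw bytes with an explicitly chosen charset.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "\u00e9" is e-acute; in UTF-8 it is the two bytes 0xC3 0xA9.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // This is the charset FileReader/FileWriter pick up implicitly.
        System.out.println("Platform default: " + Charset.defaultCharset());

        // Correct decoding gives back the single original character.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8));

        // Wrong decoding turns the two bytes into two unrelated characters.
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

The reverse mistake (writing text with a non-UTF-8 default and then parsing it as UTF-8) is the kind of mismatch that can produce invalid UTF-8 byte sequences.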
You have the same problem when using a FileWriter. Replace this:
BufferedReader br = new BufferedReader(new FileReader("DirectionResponse.xml" ));
with something like this:
BufferedReader br = new BufferedReader(new InputStreamReader(
new FileInputStream("DirectionResponse.xml"), "UTF-8"));
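and similarly for fstream: instead of a FileWriter, wrap a FileOutputStream in an OutputStreamWriter and pass "UTF-8" as the encoding.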
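Putting both points together, a minimal sketch might look like the following. The class name Utf8Copy and the two method names are made up for illustration, and it assumes Java 7 or later (try-with-resources and StandardCharsets). The first method takes any InputStream, so you could pass it new URL(url).openStream() as in your code.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; names are illustrative only.
final class Utf8Copy {

    // Step 1: save the raw bytes unchanged. try-with-resources closes
    // (and therefore flushes) the output before the file is reopened.
    public static void saveBytes(InputStream input, Path target) throws IOException {
        try (InputStream in = input;
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[1024];
            int length;
            while ((length = in.read(buffer)) != -1) {
                out.write(buffer, 0, length);
            }
        }
    }

    // Step 2: read and write text with an explicit UTF-8 charset instead
    // of relying on FileReader/FileWriter and the platform default.
    public static void copyTextUtf8(Path source, Path target) throws IOException {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                     new FileInputStream(source.toFile()), StandardCharsets.UTF_8));
             BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(target.toFile()), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Any per-line processing would go here.
                out.write(line);
                out.newLine();
            }
        }
    }
}
```

In your case you would call saveBytes(new URL(url).openStream(), Paths.get("DirectionResponse.xml")) and then copyTextUtf8 with the two file names. Note that this only helps if the server actually sends UTF-8; if the response is in another encoding, the reader's charset has to match it.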