I am trying to figure out how to delete ^M characters from a text file that is generated by the following Java code.
public StringBuilder toCsv(Table table) {
    StringBuilder stringBuilder = new StringBuilder();
    String csv = new String();
    for (Column cName : table.getColumns()) {
        csv += QUOT;
        csv += cName.getName();
        csv += QUOT;
        csv += CSV_SEPERATOR;
    }
    csv += "\n";
    stringBuilder.append(csv);
    for (Row row : table) {
        Collection<Object> values = row.values();
        String csvString = "";
        if (values.size() == 10) {
            String ep = QUOT + CSV_SEPERATOR + QUOT;
            csvString = StringUtils.join(row.values(), ep);
            csvString.replaceAll("\'", "");
            csvString = QUOT + csvString + QUOT;
            logger.info("line ++++ " + csvString);
        }
        stringBuilder.append(csvString);
        stringBuilder.append("\n");
    }
    return stringBuilder;
}
Then I use the following method to write the data to a file:
public void writeCsv(String data, String path, String fileName) throws IOException {
    String completePath = path + "/" + fileName;
    Writer out = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream(completePath)));
    try {
        out.write(data);
    } finally {
        out.close();
    }
}
Context
I am generating CSV files from a Microsoft Access (.mdb) file using http://jackcess.sourceforge.net/. When I generate the CSV and open it in vim, I see lots of ^M in the middle of lines. NOTE: I am on macOS.
I have tried the following to remove ^M (which I believe is an MS Windows CARRIAGE_RETURN) before writing to the CSV:
csvLine.replaceAll("\n\r", "");
AND
csvLine.replaceAll("\r\n", "");
AND
csvLine.replaceAll("\\r", "");
Generated CSV
'10773.0';'';'';'';'Thu Jul 14 00:00:00 CEST 2016';'By Cash';'';'10000.0';'';'2102.0'
'10001.0';'';'';'';'Thu Jul 14 00:00:00 CEST 2016';'Pet Soup cash';'087470^M
^M
^M
087470';'-45000.0';'';'2102.0'
'10360.0';'';'';'';'Thu Jul 14 00:00:00 CEST 2016';'By Cash';'';'37000.0';'';'2101.0'
'10444.0';'';'';'';'Thu Jul 14 00:00:00 CEST 2016';'By Cash';'';'2000.0';'';'2101.0'
As you can see, one line in the above CSV is broken by ^M, which is not desired. I need to programmatically remove such characters from the file.
Expected output after removing ^M and the line breaks that follow it
'10001.0';'';'';'';'Thu Jul 14 00:00:00 CEST 2016';'Pet Soup cash';'087470087470';'-45000.0';'';'2102.0'
Any help will be appreciated.
Strings are immutable, so the .replaceAll method does not change the value of the existing String; it performs the replacement and returns a new String value. So,
String csvString = "123,foo,234";
csvString.replaceAll("foo", "");
System.out.println(csvString);
prints
123,foo,234
showing that the string is unchanged. What you want to do is
String csvString = "123,foo,234";
csvString = csvString.replaceAll("foo", ""); // save the new value
System.out.println(csvString);
which prints
123,,234
In your particular case, it looks like you want to do
csvString = csvString.replaceAll("\r\n", ""); // save the new value
since you want to remove both the carriage_return (which appears as ^M) and the new_line (which starts a new line in the text file).
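To make this concrete, here is a minimal sketch applied to a fragment of the broken row from the question. The class name CsvLineCleaner and the method stripLineBreaks are hypothetical names chosen for illustration. The character class [\r\n]+ is a slightly broader pattern than "\r\n": it is an assumption on my part that you would also want to drop a lone carriage return or a lone newline inside a field, should one occur.

```java
class CsvLineCleaner {

    // Hypothetical helper: removes every carriage return (\r) and line feed (\n),
    // in any combination. replaceAll returns a NEW String, so the result must be
    // returned or assigned; the original String is never modified.
    static String stripLineBreaks(String value) {
        return value.replaceAll("[\\r\\n]+", "");
    }

    public static void main(String[] args) {
        // A fragment of the broken row from the question: the field contains
        // three \r\n pairs between the two halves of the value.
        String broken = "'Pet Soup cash';'087470\r\n\r\n\r\n087470';'-45000.0'";

        String cleaned = stripLineBreaks(broken); // keep the returned value

        System.out.println(cleaned);
        // prints 'Pet Soup cash';'087470087470';'-45000.0'
    }
}
```

In toCsv, a call like this would presumably go on csvString inside the row loop, before stringBuilder.append("\n"), so that the line terminator you add yourself is not stripped along with the unwanted ones.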