I am new to CSV parsing. I have a CSV file where the 3rd column (a description field) may contain one or more 6-digit numbers along with other values. I need to extract those numbers and write them to the adjacent column of the same row.
E.g.:
3rd column                     4th column
=============================  ====================
123456adjfghviu77              123456
shgdasd234567                  234567
123456abc:de234567:c567890d    123456-234567-567890
12654352474
Please help. This is what I have done so far.
String strFile = "D:/Input.csv";
CSVReader reader = new CSVReader(new FileReader(strFile));
String[] nextline;
//int lineNumber=0;
String str = "^[\\d|\\s]{5}$";
String regex = "[^\\d]+";
FileWriter fw = new FileWriter("D:/Output.csv");
PrintWriter pw = new PrintWriter(fw);
while ((nextline = reader.readNext()) != null) {
    //lineNumber++;
    //System.out.println("Line : "+lineNumber);
    if (nextline[2].toString().matches(str)) {
        pw.print(nextline[1]);
        pw.append('\n');
        System.out.println(nextline[2]);
    }
}
pw.flush();
I suggest just matching the 6-digit chunks and building a new string as you collect the matches:
// Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
String s = "123456abc:de234567:c567890d";
StringBuilder result = new StringBuilder();
// Match 6-digit chunks that are not preceded or followed by another digit
Pattern pattern = Pattern.compile("(?<!\\d)\\d{6}(?!\\d)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    if (result.length() == 0) {                       // first match: no delimiter needed
        result.append(matcher.group(0));              // add the 6-digit chunk
    } else {
        result.append("-").append(matcher.group(0));  // later matches: delimiter, then the chunk
    }
}
System.out.println(result.toString()); // Demo only; write this value to your new column instead
See the Java demo
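To connect the matching loop above to the row-by-row processing in the question, here is a minimal sketch. The class name SixDigitColumn and the helpers extractChunks and withExtractedColumn are hypothetical names chosen for illustration, and the sketch assumes each row arrives as a String[] such as the one returned by reader.readNext().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SixDigitColumn {
    // Six digits that are neither preceded nor followed by another digit.
    private static final Pattern SIX_DIGITS = Pattern.compile("(?<!\\d)\\d{6}(?!\\d)");

    // Hypothetical helper: all 6-digit chunks in text joined with '-', or "" if there are none.
    static String extractChunks(String text) {
        List<String> chunks = new ArrayList<>();
        Matcher matcher = SIX_DIGITS.matcher(text);
        while (matcher.find()) {
            chunks.add(matcher.group());
        }
        return String.join("-", chunks);
    }

    // Hypothetical helper: copies the row and appends the extracted chunks as a new last column.
    // Rows too short to contain the description column get an empty value.
    static String[] withExtractedColumn(String[] row, int descriptionIndex) {
        String[] out = Arrays.copyOf(row, row.length + 1);
        out[row.length] = row.length > descriptionIndex ? extractChunks(row[descriptionIndex]) : "";
        return out;
    }

    public static void main(String[] args) {
        String[] row = {"1", "x", "123456abc:de234567:c567890d"};
        // Prints: 1,x,123456abc:de234567:c567890d,123456-234567-567890
        System.out.println(String.join(",", withExtractedColumn(row, 2)));
    }
}
```

Inside the question's while loop you could call withExtractedColumn(nextline, 2) and pass the result to a CSV writer (for example OpenCSV's CSVWriter.writeNext) rather than printing fields by hand, so that quoting and delimiters are handled for you.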
UPDATE: I have changed the pattern from "\\d{6}" to "(?<!\\d)\\d{6}(?!\\d)" so that we only match 6-digit chunks that are not preceded or followed by another digit.
See the regex demo
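To see why the lookarounds matter, here is a small comparison of the two patterns on the 11-digit value from the question's last row. The class name LookaroundDemo and the helper findAll are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: returns every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "12654352474"; // 11 digits, so it contains no standalone 6-digit number

        // Plain pattern: takes the first six digits of the longer run.
        System.out.println(findAll("\\d{6}", input));               // [126543]

        // With lookarounds: a chunk must not touch any other digit, so nothing matches.
        System.out.println(findAll("(?<!\\d)\\d{6}(?!\\d)", input)); // []
    }
}
```

With the plain pattern that row would get 126543 in the new column. With the lookaround version the column stays empty, which matches the expected output in the question.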