I have a huge List<String[]> (about 500k elements), and validating it takes too long: 35-40 seconds. The validation looks like this:
Iterator<String[]> iterator = parser.iterate(request.getInputStream()).iterator();
List<String[]> list = new ArrayList<>();
List<NotValidRow> badList = new ArrayList<>();
while (iterator.hasNext()) {
    var tmp = iterator.next();
    if (tmp.length != 2) continue;
    if (tmp[0] == null || !SKIP_PATTERN.matcher(tmp[0]).matches()) {
        badList.add(new NotValidRow(tmp[0], tmp[1], NotValidRowReason.NOT_VALID_EMAIL));
    }
    if (tmp[1] == null || tmp[1].isBlank()) {
        badList.add(new NotValidRow(tmp[0], tmp[1], NotValidRowReason.EMPTY_NAME));
    }
    list.add(tmp);
}
I think it's possible to do this faster with a fork/join pool, but I don't know how. Could you help me with that?
You can use parallel Stream processing; however, you'll have to collect the bad list in a thread-safe manner. For example:
var spliterator = Spliterators.spliteratorUnknownSize(iterator, 0);
// Thread-safe sink for invalid rows, since the filter runs on multiple threads.
var badQueue = new ConcurrentLinkedQueue<NotValidRow>();
List<String[]> list = StreamSupport.stream(spliterator, true)
        .filter(tmp -> {
            if (tmp.length != 2) {
                return false;
            }
            // Note: unlike the original loop, a row is reported for the
            // first failing check only, not for every failing check.
            if (tmp[0] == null || !SKIP_PATTERN.matcher(tmp[0]).matches()) {
                badQueue.offer(new NotValidRow(tmp[0], tmp[1], NotValidRowReason.NOT_VALID_EMAIL));
                return false;
            }
            if (tmp[1] == null || tmp[1].isBlank()) {
                badQueue.offer(new NotValidRow(tmp[0], tmp[1], NotValidRowReason.EMPTY_NAME));
                return false;
            }
            return true;
        })
        .toList();
List<NotValidRow> badList = new ArrayList<>(badQueue);
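This pipeline runs on the common ForkJoinPool under the hood, so you normally don't need to create one yourself. If you do want to control the parallelism (for instance, to keep this job from starving other parallel work), a known trick is to start the terminal operation from inside your own pool. A minimal sketch, relying on the long-standing (though not formally specified) behavior that a parallel stream launched from a ForkJoinPool task runs in that pool; the parallelism level of 8 is an arbitrary example:

import java.util.concurrent.ForkJoinPool;

// Run the same parallel pipeline inside a dedicated pool instead of
// the common ForkJoinPool.
ForkJoinPool pool = new ForkJoinPool(8);
try {
    List<String[]> list = pool.submit(() ->
            StreamSupport.stream(spliterator, true)
                    .filter(tmp -> tmp.length == 2) // plus the same null/regex checks as above
                    .toList()
    ).join();
} finally {
    pool.shutdown();
}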
Edit: Apparently, the OP didn't mean to include the bad entries in the good list, so I've updated the answer to filter them out.
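One more caveat: Spliterators.spliteratorUnknownSize produces a spliterator that splits poorly, so the stream above may not parallelize as well as you'd hope. A sketch of an alternative, assuming it's acceptable to hold all rows in memory at once (the original loop builds a 500k-element list anyway): read everything into an ArrayList first, whose spliterator knows its size and splits into balanced halves, then validate with parallelStream():

// Materialize the rows up front so the parallel stages get balanced chunks.
List<String[]> rows = new ArrayList<>(500_000);
parser.iterate(request.getInputStream()).iterator().forEachRemaining(rows::add);

List<String[]> list = rows.parallelStream()
        .filter(tmp -> tmp.length == 2) // plus the same null/regex checks as above
        .toList();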