
Remove repeated sequence of elements from a List


I need to use the Java Stream API to process a stream of events from a system and clean it up by removing repeated events. The goal is to collapse runs of the same event repeated consecutively, not to produce a list of distinct events. Most Java Stream API examples available online focus on producing distinct output from a given input.

For example, for the input stream

[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]

the output List or Stream should be

[a, b, c, a, d, c, e, f]

My current implementation (not using the Stream API) looks like this:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        String fileName = "src/main/resources/test.log";
        try {
            List<String> list = Files.readAllLines(Paths.get(fileName));
            LinkedList<String> acc = new LinkedList<>();

            for (String line : list) {
                // Keep a line only if it differs from the last line kept so far.
                if (acc.isEmpty() || !line.equals(acc.getLast())) {
                    acc.add(line);
                }
            }

            System.out.println(list);
            System.out.println(acc);

        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}

Output:

[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]
[a, b, c, a, d, c, e, f]

I've tried various examples with reduce, groupingBy, etc., without success. I can't seem to find a way to compare the current stream element with the last element in my accumulator, if that is even possible.


Solution

  • You can use IntStream to iterate over the index positions of the List and compare each element with its neighbour, as follows:

    List<String> acc = IntStream
                .range(0, list.size())
                .filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1)))
                        || i == list.size() - 1))
                .mapToObj(i -> list.get(i))
                .collect(Collectors.toList());
    System.out.println(acc);
    

    Explanation

    1. IntStream.range(0,list.size()) : Returns a sequence of primitive int-valued elements which will be used as the index positions to access the list.
    2. filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1))) || i == list.size() - 1)) : Keeps an index only if the element at that position differs from the element at the next position, or if it is the last index. Each run of repeated elements is therefore represented by its final occurrence.
    3. mapToObj(i -> list.get(i)) : Maps each remaining index back to its element, converting the IntStream to a Stream<String>.
    4. collect(Collectors.toList()) : Collect the results in a List.
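
For reference, the steps above can be put together into a small self-contained program. This is only a sketch: the class name ConsecutiveDedup and the helper removeConsecutiveDuplicates are hypothetical names chosen for illustration, and the input is hard-coded rather than read from the log file. It assumes the list contains no null elements and supports fast index access (for example an ArrayList), since list.get(i) is called repeatedly.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ConsecutiveDedup {

    // Hypothetical helper: keeps one element per run of equal neighbours.
    // Assumes no null elements and a list with cheap get(int).
    static List<String> removeConsecutiveDuplicates(List<String> list) {
        return IntStream.range(0, list.size())
                // Keep index i if it is the last index or the next element differs.
                .filter(i -> i == list.size() - 1 || !list.get(i).equals(list.get(i + 1)))
                // Map each surviving index back to its element.
                .mapToObj(list::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d", "c", "c",
                "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, a, d, c, e, f]
        System.out.println(removeConsecutiveDuplicates(events));
    }
}
```

The filter here puts the last-index check first, so list.get(i + 1) is never evaluated out of bounds; it is logically equivalent to the condition in the answer. An empty list simply yields an empty result, because the index range is empty.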