Java - Stream Filter to get Last Day of several months in a list of dates (not calendar last day)


Query: Filter the data below to find the last date of each month that appears in the list. Note that, in this context, the last date of a month in the data may or may not match the last day of the calendar month. The expected output is shown in the second list.

Research:

I hope the issue is clear. Please point me in the direction of how this can be done with streams, as I don't want to use a for loop.

Sample Data:

Date Model Start End
27-11-1995 ABC 241 621
27-11-1995 XYZ 3456 7878
28-11-1995 ABC 242 624
28-11-1995 XYZ 3457 7879
29-11-1995 ABC 243 627
29-11-1995 XYZ 3458 7880
30-11-1995 ABC 244 630
30-11-1995 XYZ 3459 7881
01-12-1995 ABC 245 633
01-12-1995 XYZ 3460 7882
04-12-1995 ABC 246 636
04-12-1995 XYZ 3461 7883
27-12-1995 ABC 247 639
27-12-1995 XYZ 3462 7884
28-12-1995 ABC 248 642
28-12-1995 XYZ 3463 7885
29-12-1995 ABC 249 645
29-12-1995 XYZ 3464 7886
01-01-1996 ABC 250 648
01-01-1996 XYZ 3465 7887
02-01-1996 ABC 251 651
02-01-1996 XYZ 3466 7888
29-01-1996 ABC 252 654
29-01-1996 XYZ 3467 7889
30-01-1996 ABC 253 657
30-01-1996 XYZ 3468 7890
31-01-1996 ABC 254 660
31-01-1996 XYZ 3469 7891

Output required:

Date Model Start End
30-11-1995 ABC 244 630
30-11-1995 XYZ 3459 7881
29-12-1995 ABC 249 645
29-12-1995 XYZ 3464 7886
31-01-1996 ABC 254 660
31-01-1996 XYZ 3469 7891


Solution

  • Well, a combination of groupingBy and maxBy will probably do.

    I assume each row of the table is represented by an Event record:

    record Event(LocalDate date, String model, int start, int end) { }
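
    If the rows arrive as text in the format shown in the question, they could be turned into Event values along these lines. This is only a sketch: EventParser and parseRow are hypothetical names, and it assumes whitespace-separated columns with dates written as day-month-year.

    ```java
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;

    // Same shape as the Event record assumed in this answer.
    record Event(LocalDate date, String model, int start, int end) { }

    class EventParser {

        // The sample data writes dates as day-month-year, e.g. 27-11-1995.
        private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("dd-MM-uuuu");

        // Hypothetical helper: converts one whitespace-separated row into an Event.
        static Event parseRow(String row) {
            String[] cols = row.trim().split("\\s+");
            return new Event(
                LocalDate.parse(cols[0], DATE_FORMAT),
                cols[1],
                Integer.parseInt(cols[2]),
                Integer.parseInt(cols[3]));
        }

        public static void main(String[] args) {
            // Parses the first data row from the question and prints the record.
            System.out.println(parseRow("27-11-1995 ABC 241 621"));
        }
    }
    ```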
    

    To get the last day of each month that is present in the table, we could utilize groupingBy. To group the events, we first need a grouping key type. Below, I created an EventGrouping record¹, with a static method to convert an Event to an EventGrouping. Your desired output suggests that you want one result per combination of year-month and model, so the key holds just those two properties:

    public record EventGrouping(YearMonth yearMonth, String model) {
            
        public static EventGrouping fromEvent(Event event) {
            return new EventGrouping(YearMonth.from(event.date()), event.model());
        }
    }
    

    Then, we could get our desired result like this:

    events.stream()
        .collect(Collectors.groupingBy(
            EventGrouping::fromEvent,
            Collectors.maxBy(Comparator.comparing(Event::date))
        ));
    

    What happens here is that all stream elements are grouped by our EventGrouping, and then the "maximum value" of each group is picked. Since the events are compared by date, the maximum is the event with the latest date within that month for that model.

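    Note that maxBy returns an Optional, because a collector may in general receive no elements at all. Within groupingBy, a group only exists once it contains at least one element, so the Optional is never empty here. Also note that the resulting Map makes no guarantee about iteration order.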

    We could fix both of these issues by using collectingAndThen and a map factory respectively:

    Map<EventGrouping, Event> map = events.stream()
        .collect(groupingBy(
            EventGrouping::fromEvent,
            () -> new TreeMap<>(Comparator.comparing(EventGrouping::yearMonth)
                .thenComparing(EventGrouping::model)),
            collectingAndThen(maxBy(Comparator.comparing(Event::date)), Optional::get)
        ));
    

    Note: groupingBy, collectingAndThen and maxBy are all static imports from java.util.stream.Collectors.
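
    For completeness, the pieces above can be put together into one runnable sketch. The class name LastDayPerMonthDemo and the method name lastPerMonth are placeholders of mine, and the input is only a subset of the sample rows from the question:

    ```java
    import java.time.LocalDate;
    import java.time.YearMonth;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Map;
    import java.util.Optional;
    import java.util.TreeMap;

    import static java.util.stream.Collectors.collectingAndThen;
    import static java.util.stream.Collectors.groupingBy;
    import static java.util.stream.Collectors.maxBy;

    // Same shape as the Event record assumed in this answer.
    record Event(LocalDate date, String model, int start, int end) { }

    // Grouping key: one group per (year-month, model) combination.
    record EventGrouping(YearMonth yearMonth, String model) {
        static EventGrouping fromEvent(Event event) {
            return new EventGrouping(YearMonth.from(event.date()), event.model());
        }
    }

    class LastDayPerMonthDemo {

        // Returns the latest event of each group, ordered by year-month, then model.
        static Map<EventGrouping, Event> lastPerMonth(List<Event> events) {
            return events.stream()
                .collect(groupingBy(
                    EventGrouping::fromEvent,
                    () -> new TreeMap<>(Comparator.comparing(EventGrouping::yearMonth)
                        .thenComparing(EventGrouping::model)),
                    // Groups are never empty, so unwrapping the Optional is safe.
                    collectingAndThen(maxBy(Comparator.comparing(Event::date)), Optional::get)
                ));
        }

        public static void main(String[] args) {
            // Subset of the sample rows from the question.
            List<Event> events = List.of(
                new Event(LocalDate.of(1995, 11, 29), "ABC", 243, 627),
                new Event(LocalDate.of(1995, 11, 29), "XYZ", 3458, 7880),
                new Event(LocalDate.of(1995, 11, 30), "ABC", 244, 630),
                new Event(LocalDate.of(1995, 11, 30), "XYZ", 3459, 7881),
                new Event(LocalDate.of(1995, 12, 28), "ABC", 248, 642),
                new Event(LocalDate.of(1995, 12, 28), "XYZ", 3463, 7885),
                new Event(LocalDate.of(1995, 12, 29), "ABC", 249, 645),
                new Event(LocalDate.of(1995, 12, 29), "XYZ", 3464, 7886),
                new Event(LocalDate.of(1996, 1, 30), "ABC", 253, 657),
                new Event(LocalDate.of(1996, 1, 30), "XYZ", 3468, 7890),
                new Event(LocalDate.of(1996, 1, 31), "ABC", 254, 660),
                new Event(LocalDate.of(1996, 1, 31), "XYZ", 3469, 7891));

            // Prints one event per (year-month, model) group in key order.
            lastPerMonth(events).values().forEach(System.out::println);
        }
    }
    ```

    With this input, the printed events should correspond to the six rows listed under "Output required" in the question. Records and this compact form need Java 16 or later.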


    ¹ Instead of writing a custom type, you could also use an existing class that holds two arbitrary values, such as a Map.Entry, a Pair or even a List<Object>.
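
    As a rough illustration of the footnote, here is how the grouping might look with a Map.Entry key instead of a custom record. EntryKeyDemo and lastPerMonth are hypothetical names. The sketch uses the default, unordered result map and sorts the surviving events afterwards instead of supplying a map factory:

    ```java
    import java.time.LocalDate;
    import java.time.YearMonth;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Map;
    import java.util.Optional;
    import java.util.function.Function;

    import static java.util.stream.Collectors.collectingAndThen;
    import static java.util.stream.Collectors.groupingBy;
    import static java.util.stream.Collectors.maxBy;

    // Same shape as the Event record assumed in this answer.
    record Event(LocalDate date, String model, int start, int end) { }

    class EntryKeyDemo {

        // Returns the latest event per (year-month, model), sorted by date, then model.
        static List<Event> lastPerMonth(List<Event> events) {
            // Map.entry(...) creates an immutable key with value-based equals/hashCode.
            Function<Event, Map.Entry<YearMonth, String>> key =
                e -> Map.entry(YearMonth.from(e.date()), e.model());

            Map<Map.Entry<YearMonth, String>, Event> latest = events.stream()
                .collect(groupingBy(
                    key,
                    collectingAndThen(maxBy(Comparator.comparing(Event::date)), Optional::get)));

            // The default result map is unordered, so sort the values explicitly.
            return latest.values().stream()
                .sorted(Comparator.comparing(Event::date).thenComparing(Event::model))
                .toList();
        }

        public static void main(String[] args) {
            List<Event> events = List.of(
                new Event(LocalDate.of(1996, 1, 30), "ABC", 253, 657),
                new Event(LocalDate.of(1996, 1, 31), "ABC", 254, 660),
                new Event(LocalDate.of(1996, 1, 30), "XYZ", 3468, 7890),
                new Event(LocalDate.of(1996, 1, 31), "XYZ", 3469, 7891));
            lastPerMonth(events).forEach(System.out::println);
        }
    }
    ```

    The record key from the answer reads more clearly at the call site. The Map.Entry variant mainly saves one type declaration.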