Tags: java, java-stream, declarative-programming, groupingby

Java Stream: grouping by multiple fields individually in a declarative way in a single loop


I searched for this, but I mostly found examples that group by aggregated fields or that transform the result of the stream, not the scenario below:

I have a class User with fields category and marketingChannel.

I have to write a method in the declarative style that accepts a list of users and counts them by category and, separately, by marketingChannel (i.e. not groupingBy(..., groupingBy(...))).

I am unable to do it in a single loop, which is what I need to achieve.

I wrote a few methods as follows:

import java.util.*;
import java.util.stream.*;
public class Main
{
    public static void main(String[] args) {
        List<User> users = User.createDemoList();
        imperative(users);
        declerativeMultipleLoop(users);
        declerativeMultipleColumn(users);
    }
    
    public static void imperative(List<User> users){
        Map<String, Integer> categoryMap = new HashMap<>();
        Map<String, Integer> channelMap = new HashMap<>();
        for(User user : users){
           Integer  value = categoryMap.getOrDefault(user.getCategory(), 0);
           categoryMap.put(user.getCategory(), value+1);
           value = channelMap.getOrDefault(user.getMarketingChannel(), 0);
           channelMap.put(user.getMarketingChannel(), value+1);
        }
        System.out.println("imperative");
        System.out.println(categoryMap);
        System.out.println(channelMap);
    }
    
    public static void declerativeMultipleLoop(List<User> users){
        Map<String, Long> categoryMap = users.stream()
        .collect(Collectors.groupingBy(
            User::getCategory, Collectors.counting()));
        Map<String, Long> channelMap = users.stream()
        .collect(Collectors.groupingBy(
            User::getMarketingChannel, Collectors.counting()));
        System.out.println("declerativeMultipleLoop");
        System.out.println(categoryMap);
        System.out.println(channelMap);
    }
    
    public static void declerativeMultipleColumn(List<User> users){
        Map<String, Map<String, Long>> map = users.stream()
        .collect(Collectors.groupingBy(
            User::getCategory,
            Collectors.groupingBy(User::getMarketingChannel, 
            Collectors.counting())));
       
        System.out.println("declerativeMultipleColumn");
        System.out.println("groupingBy category and marketChannel");
        System.out.println(map);
        
        Map<String, Long> categoryMap = new HashMap<>();
        Map<String, Long> channelMap = new HashMap<>();
        
        for (Map.Entry<String, Map<String, Long>> entry: map.entrySet()) {
            String category = entry.getKey();
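            // Sum the per-channel counts; size() would only count distinct channels.
            long count = entry.getValue().values().stream().mapToLong(Long::longValue).sum();
            Long value = categoryMap.getOrDefault(category,0L);
            categoryMap.put(category, value+count);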
            for (Map.Entry<String, Long> channelEntry : entry.getValue().entrySet()){
                String channel = channelEntry.getKey();
                Long channelCount = channelEntry.getValue();
                Long channelValue = channelMap.getOrDefault(channel,0L);
                channelMap.put(channel, channelValue+channelCount);
            }
        }
        System.out.println("After Implerative Loop on above.");
        System.out.println(categoryMap);
        System.out.println(channelMap);
    }
}
class User{
    private String name;
    private String category;
    private String marketChannel;
    
    public User(String name, String category, String marketChannel){
        this.name = name;
        this.category = category;
        this.marketChannel = marketChannel;
    }
    public String getName(){
        return this.name;
    }
    public String getCategory(){
        return this.category;
    }
    public String getMarketingChannel(){
        return this.marketChannel;
    }
    
     @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        User user = (User) o;
        return Objects.equals(name, user.name) &&
                Objects.equals(category, user.category) &&
                Objects.equals(marketChannel, user.marketChannel);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, category, marketChannel);
    }
    public static List<User> createDemoList(){
        return Arrays.asList(
            new User("a", "student","google"),
            new User("b", "student","bing"),
            new User("c", "business","google"),
            new User("d", "business", "direct")
            );
    }
}

The method declerativeMultipleLoop is declarative, but it runs a separate loop for each field. Complexity: O(number of fields * number of users).

The problem is in the declerativeMultipleColumn method, where I end up writing imperative code and multiple loops.

I want to write the above method completely declaratively and as efficiently as possible, i.e. with a single pass over the users. Complexity: O(number of users).

Sample output:

imperative
{business=2, student=2}
{direct=1, google=2, bing=1}
declerativeMultipleLoop
{business=2, student=2}
{direct=1, google=2, bing=1}
declerativeMultipleColumn
groupingBy category and marketChannel
{business={direct=1, google=1}, student={google=1, bing=1}}
After Implerative Loop on above.
{business=2, student=2}
{direct=1, google=2, bing=1}


Solution

  • If I understand your requirement correctly, you want a single stream operation that produces two separate maps. That requires a structure to hold the maps and a collector to build that structure. Something like the following:

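    import java.util.HashMap;
    import java.util.Map;
    import java.util.stream.Collector;
    import static java.util.stream.Collector.Characteristics.UNORDERED;

    class Counts {
        public final Map<String, Integer> categoryCounts = new HashMap<>();
        public final Map<String, Integer> channelCounts = new HashMap<>();
    
        // Not CONCURRENT: HashMap is not thread-safe, so each thread must get its
        // own Counts instance, and the partial results are merged by combine().
        public static Collector<User,Counts,Counts> countsCollector() {
            return Collector.of(Counts::new, Counts::accept, Counts::combine, UNORDERED);
        }
    
        private Counts() { }
    
        // Accumulator: each user updates both maps in the same pass.
        private void accept(User user) {
            categoryCounts.merge(user.getCategory(), 1, Integer::sum);
            channelCounts.merge(user.getMarketingChannel(), 1, Integer::sum);
        }
    
        // Combiner: folds another partial result into this one (used by parallel streams).
        private Counts combine(Counts other) {
            other.categoryCounts.forEach((c, v) -> categoryCounts.merge(c, v, Integer::sum));
            other.channelCounts.forEach((c, v) -> channelCounts.merge(c, v, Integer::sum));
            return this;
        }
    }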
    

    That can then be used as a collector:

    Counts counts = users.stream().collect(Counts.countsCollector());
    counts.categoryCounts.get("student")...
    

    (Opinion only: the distinction between imperative and declarative is pretty arbitrary in this case. Defining stream operations feels pretty procedural to me (as opposed to the equivalent in, say, Haskell)).
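
As an alternative sketch, assuming Java 12 or later is available: `Collectors.teeing` passes each element to two downstream collectors during one traversal, so the two-field case may not need a hand-written collector at all. The names `TeeingCounts` and `Counts` and the trimmed-down `User` class below are hypothetical stand-ins for illustration, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TeeingCounts {
    // Simplified, hypothetical stand-in for the User class from the question.
    static final class User {
        private final String category;
        private final String marketingChannel;

        User(String category, String marketingChannel) {
            this.category = category;
            this.marketingChannel = marketingChannel;
        }

        String getCategory() { return category; }
        String getMarketingChannel() { return marketingChannel; }
    }

    // Hypothetical holder for the two resulting maps.
    static final class Counts {
        final Map<String, Long> byCategory;
        final Map<String, Long> byChannel;

        Counts(Map<String, Long> byCategory, Map<String, Long> byChannel) {
            this.byCategory = byCategory;
            this.byChannel = byChannel;
        }
    }

    // One stream traversal; each user is handed to both groupingBy collectors.
    static Counts count(List<User> users) {
        return users.stream().collect(Collectors.teeing(
            Collectors.groupingBy(User::getCategory, Collectors.counting()),
            Collectors.groupingBy(User::getMarketingChannel, Collectors.counting()),
            Counts::new)); // merger: wraps the two finished maps
    }

    // Same category/channel pairs as the demo list in the question.
    static List<User> demoUsers() {
        return Arrays.asList(
            new User("student", "google"),
            new User("student", "bing"),
            new User("business", "google"),
            new User("business", "direct"));
    }

    public static void main(String[] args) {
        Counts counts = count(demoUsers());
        System.out.println(counts.byCategory);
        System.out.println(counts.byChannel);
    }
}
```

The merger (`Counts::new` here) runs only once, after both downstream collectors have finished. For more than two fields the `teeing` calls would have to be nested, at which point a dedicated collector like the one above is probably easier to read.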