javaasynchronousjava-stream

how to get the intersection of a list of lists in Java using asynchronous and using multi-thread


I have a Person class like this:

public class Person {
    public String name;
    public List<String> hobbies;


    public Person(String name, List<String> hobbies) {
        this.name = name;
        this.hobbies = hobbies;
    } 
    //setter and getter
}

Now suppose I have a list of Person objects I want to get the intersection of hobbies asynchronously. I know how to get the intersection of two lists normally but I want to do this asynchronous and multi-threaded. I


Solution

  • You can do this:

    List<Person> persons = new ArrayList<>();
    persons.add(new Person("hatef", Arrays.asList("a", "b", "c", "t")));
    persons.add(new Person("hatef", Arrays.asList("a", "b", "t")));
    persons.add(new Person("hatef", Arrays.asList("t", "a", "b")));
    persons.add(new Person("hatef", Arrays.asList("a", "b", "f", "t")));
    persons.add(new Person("hatef", Arrays.asList("a", "t", "b", "z")));
    

    then use CompletableFuture

    Map<String, Integer> all = new ConcurrentHashMap<>();
    ExecutorService executorService = Executors.newFixedThreadPool(Math.max(10, persons.size()));
    persons
            .stream()
            .map(person -> CompletableFuture.supplyAsync(() -> person, executorService))
            .map(p -> p.thenApply(Person::getHobbies))
            .map(p -> p.thenAccept(hobbies -> hobbies.forEach(hobby -> all.compute(hobby, (key, value) -> value == null ? 1 : ++value))))
            .map(CompletableFuture::join)
            .forEach(x -> {}); // just dummy consumer to make stream triggers
    
    int n = persons.size();
    Set<String> result = all
                .entrySet()
                .stream()
                .filter(entry -> entry.getValue() == n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    System.out.println(result);
    

    which results in:

    [a, b, t]
    

    If you have order in hobby list I think you can use the idea of Longest Common Prefix problem and it is efficient

    I agree with @JoachimSauer's comment:

    Can you explain what exactly you mean by asynchronously? Usually that only applies to any kind of I/O operation (reading from disk, network requests, ...) where the CPU isn't hard at work but you still spend a lot of time waiting for something. This kind of work here is purely CPU/memory stuff that doesn't actually involve any I/O, so it's very unclear what you mean by "asynchronously" in this case.