javalistmethodsintersectionlinkedhashset

Two unsorted lists Intersection returned as a list


My question is if its possible to get 2 unsorted lists, and get the intersection of both lists according to the order it was in the first list "List1".

public static List intersection(List A, List B) {

    List outcome = null;

    try {
        outcome = A.getClass().newInstance();
    } catch (Exception e) {};

    LinkedList<Integer> temp = new LinkedList<>();

    LinkedHashSet<Integer> ALinkedSet = new LinkedHashSet<>(A); 
    LinkedHashSet<Integer> BLinkedSet = new LinkedHashSet<>(B);

    // first filter elements into temp
    while (ALinkedSet.size() > 0) { 
        int v = ALinkedSet.removeFirst(); 
        if (BLinkedSet.contains(v)) { 
            temp.addLast(v);      
        }
    }

    // add filtered values back to L1
    while (temp.size() > 0) {     
        outcome.addLast(temp.removeFirst()); 
    }

    return outcome;
}

I am looking for a way to make this work, and perhaps turn it to O(n).

Here is the simple way figured. Would there be a better way to turn big O into linear? I am pretty sure this is O(n*n) at the very least.

public static List Intersection(List A, List B) {

    List outcome = null;

    try {
        tulos = A.getClass().newInstance();
    } catch (Exception e) {};

    LinkedHashSet<Integer> AHashSet = new LinkedHashSet<>(A); 
    LinkedHashSet<Integer> BHashSet = new LinkedHashSet<>(B);

    for(Integer Aitem : AHashSet){
        for(Integer Bitem : BHashSet){
            if(Aitem==Bitem) {
                outcome.add(Aitem);
            }
        }
    }

    return outcome;
}

Would the following come out as linear ?

public static List Intersection(List A, List B) {

 List outcome = null;

   try {
        tulos = A.getClass().newInstance();
    } catch (Exception e) {};

 LinkedHashSet<Integer> BHashSet = new LinkedHashSet<>(B);

    for(Object Aitem : A) {
        if(BHashSet.contains(Aitem) && !outcome.contains(Aitem)){
         outcome.add(Aitem);

           }
    }
    return outcome;

}

Solution

  • Update: Since we can only use LinkedHashMaps...

    ...and the lists can have duplicates:

    1. Create a ListEntry class that has the number and the count of total times the number is duplicated in the list. So if 2 occurs twice, we will create a new ListEntry(number: 2, count: 2); in the LinkedHashMap (LHM);
    2. Populate a LHM of the ListEntry objects, by using the numbers present in the first list;
    3. Now, iterate over the second list, going through its numbers one by one:
      1. If the second list's number is not found, continue to next number;
      2. For every number found, bring its entry in the LHM to the front, store its "seen" count in a hashmap (numberSeen), and update another counter (intersectionCount) that tracks total intersecting numbers seen so far;
    4. After the iteration over the second list is done, you have the intersecting numbers in the front of the LHM;
    5. Now it is trivial to use numberSeen and intersectionCount and create your final list.

    The run-time complexity is again O(m + n) and space complexity is O(n).


    Original Response:

    Assuming the size of the first list is n and the size of the second list is m, this is a simple and straightforward solution that works in O(m + n) time and O(m + n) space.

    1. Take the second list and put all its elements in a hash map mapping Integer to Integer. This is to keep track of duplicates in the list. If there are no duplicates in your lists, simply use a hash set;
    2. Now iterate over the first list and for each element in the list:
      1. If the element exists in the map and has a positive count, put this element in a new list and decrease its count in the map;
      2. If the element does not exist in the map or has a count of 0, skip to next element in the list.

    In the end, your new list will contain the intersection of the two lists, in the order in which they appeared in the first list.


    Total space = m for the hashmap + n for the new list.

    Total time = m for putting elements in the hashmap + n for iterating over the first list.

    Thus, O(m + n) time and O(m + n) space.