pythonsortingradix-sortcounting-sort

How to sort a text file to find anagrams in O(MN) time complexity where M is the max number of characters and N is the number of words?


Hi I'm new to Python and I want to find groups of anagrams in a file in linear time.Anagrams are basically two or more words which consists of the same alphabet but in different arrangements.

First I read all the words into a list. Apparently, I can use radix sort to sort each word by column and the radix sort should use a stable counting sort. But I don't know exactly what my counting sort should do? Should I write a counting sort function that takes a word and sorts it by alphabetical order? Then call it in radix sort?

Can someone give me a clearer idea on how to approach this problem?Any help will be appreciated thanks!


Solution

  • Hope this helps you. It too is in O(mn)

    public List<List<String>> groupAnagrams(String[] strs){
        List<List<String>> result = new ArrayList<List<String>>();
        
        HashMap<ArrayList<Integer>, ArrayList<String>> map = new HashMap<ArrayList<Integer>, HashMap<ArrayList<String>>;
        
        for(String str : strs){
            ArrayList<Integer> arr = new ArrayList<Integer>();
            for(int i=0; i<str.length(); i++){
                arr[str.charAt(i)- 'a']++;
            }
            
            if(map.containsKey(arr)){
                map.get(arr).add(str);
            } else {
                ArrayList<String> al = new ArrayList<String>();
                al.add(str);
                map.put(arr, al);
            }
        }
        
        result.addAll(map.values());
        
        return result;
    }