code-golf anagram

Code golf: find all anagrams


A word is an anagram if the letters in that word can be re-arranged to form a different word.

Task:

Input:

A list of words from stdin, one word per line.

e.g.

A
A's
AOL
AOL's
Aachen
Aachen's
Aaliyah
Aaliyah's
Aaron
Aaron's
Abbas
Abbasid
Abbasid's

Output:

All sets of anagrams, one set per line.

Example run:

./anagram < words
marcos caroms macros
lump's plum's
dewar's wader's
postman tampons
dent tend
macho mocha
stoker's stroke's
hops posh shop
chasity scythia
...
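
For reference, the task can be sketched without golfing. This is a minimal Python illustration, not one of the submitted solutions: the function name `anagram_sets` is hypothetical, and comparing letters case-insensitively is an assumption the task text above does not state.

```python
from collections import defaultdict

def anagram_sets(words):
    """Group words that use exactly the same letters; keep groups of 2+."""
    groups = defaultdict(list)
    for word in words:
        # Assumption: letters are compared case-insensitively.
        key = "".join(sorted(word.lower()))
        groups[key].append(word)
    # A word with no partner is not part of any anagram set.
    return [group for group in groups.values() if len(group) > 1]

sample = ["marcos", "caroms", "macros", "dent", "tend", "Aaron"]
for group in anagram_sets(sample):
    print(" ".join(group))
# prints:
# marcos caroms macros
# dent tend
```

To read the word list from stdin instead of the in-memory sample, one option is to pass `sys.stdin.read().splitlines()` to the function. Sets come out in first-seen order here, which need not match the order in the example run.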

I have a 149-char Perl solution, which I'll post as soon as a few more people have posted :)

Have fun!

EDIT: Clarifications

EDIT2: More Clarifications


Solution

  • PowerShell, 83 chars (previously 104, 97, 91, 86)

    $k=@{};$input|%{$k["$([char[]]$_|%{$_+0}|sort)"]+=@($_)}
    $k.Values|?{$_[1]}|%{"$_"}
    

    Update for the new requirement (+8 chars):

    To exclude words that differ only in capitalization, we can remove the duplicates (case-insensitively) from the input list with $input|sort -u, where -u stands for -unique. sort is case-insensitive by default:

    $k=@{};$input|sort -u|%{$k["$([char[]]$_|%{$_+0}|sort)"]+=@($_)} 
    $k.Values|?{$_[1]}|%{"$_"} 
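
The deduplication step can be illustrated outside PowerShell. The following is a rough Python analogue, not a translation of `sort -u`: the helper name `unique_ignoring_case` is hypothetical, and keeping the first spelling seen while preserving input order is an assumption of this sketch (`sort -u` also sorts its input).

```python
def unique_ignoring_case(words):
    """Keep the first spelling of each word; case variants count as duplicates."""
    seen = set()
    unique = []
    for word in words:
        folded = word.lower()
        if folded not in seen:
            seen.add(folded)
            unique.append(word)
    return unique

print(unique_ignoring_case(["Polish", "polish", "shop", "Shop", "posh"]))
# prints ['Polish', 'shop', 'posh']
```

Without this step, a pair such as Polish and polish would share a case-insensitive key and be reported as an anagram set.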
    

    Explanation of the [char[]]$_|%{$_+0}|sort part

    It's a key for the hashtable entry under which anagrams of a word are stored. My initial solution was: $_.ToLower().ToCharArray()|sort. Then I discovered I didn't need ToLower() for the key, as hashtable lookups are case-insensitive.

    [char[]]$_|sort would be ideal, but the chars for the key need to be sorted case-insensitively (otherwise Cab and abc would be stored under different keys). Unfortunately, sort is not case-insensitive for chars (only for strings).

    What we need is [string[]][char[]]$_|sort, but I found a shorter way of converting each char to a string: concatenate something else to it, in this case the integer 0, hence [char[]]$_|%{$_+0}|sort. This doesn't affect the sorting order, and the actual key ends up being something like: d0 o0 r0 w0. It's not pretty, but it does the job :)
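
The Cab/abc problem described above can be reproduced in a small Python sketch. It is only an analogy: Python sorts characters by code point, which stands in here for the case-sensitive char sort, and `key=str.lower` stands in for the case-insensitive string sort. Both function names are hypothetical.

```python
def case_sensitive_key(word):
    # Sorts by code point, so every uppercase letter sorts before lowercase.
    return " ".join(sorted(word))

def case_insensitive_key(word):
    # Sorts letters ignoring case, as a case-insensitive string sort would.
    return " ".join(sorted(word, key=str.lower))

for word in ("Cab", "abc"):
    print(word, "|", case_sensitive_key(word), "|", case_insensitive_key(word))
# prints:
# Cab | C a b | a b C
# abc | a b c | a b c
```

Under a case-insensitive lookup, "a b C" and "a b c" land on the same entry, while "C a b" and "a b c" do not, because the letters sit in different positions.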