Forgive me if my code is off. I've still got my head in Ruby on Rails which seems to have subtle differences that are coming out as I learn more "just Ruby," although to be fair I'm not sure my code would pass muster in a Ruby on Rails format. I digress.
I am trying to compare two arrays that contain a set of strings. I want to do a couple things. 1) Ensure that the arrays are the same number of words, otherwise the exercise is moot. 2) Compare the first word in an array with only the first word in the second array. In other words, I never want to compare word 1 in array "a" with word 4 in array "b". I'm struggling to find a solution that reorders the characters in any given word, compares it to the reordered characters in the corresponding word in the second array, and prints 1 if it's an anagram (once sorted, the idea is that the two words would be equivalent) or 0 if they do not match.
In the example below, what I want it to print is:
0
0
1
1
...but that is not happening. Thoughts? I'm afraid this has to do with local variable issues, but I am not be sure.
a = ['hello', 'goodbye', 'pants', 'baa']
b = ['helio', 'godbye', 'spant', 'aba']
x = a.length
y = b.length
z = 0
x = y? do
while z < x do
if a.find(z).chars.sort.join == b.find(z).chars.sort.join
puts 1
else
puts 0
end
z += 1
end
end
[Edit: I've edited my answer to incorporate an efficiency improvement suggested by @raph in a comment on the question (the method anagram?
below). That may not be necessary, but I thought it was such a good idea that it should get some exposure. I've also given a detailed explanation, as the OP is new to Ruby, as might be other readers.]
You might consider doing it as follows.
Code
def anagrams(a, b)
return nil unless a.size == b.size
a.zip(b).map { |aw,bw| anagram?(aw,bw) ? 1 : 0 }
end
def anagram?(aw, bw)
return false unless aw.size == bw.size
counts = aw.downcase.each_char.with_object(Hash.new(0)) { |c,h| h[c] += 1 }
bw.downcase.each_char do |c|
return false unless counts[c] > 0
counts[c] -= 1
end
true
end
Example
a = ['hello', 'goodbye', 'pants', 'baa']
b = ['helio', 'godbye', 'Spant', 'aba']
anagrams(a, b)
#=> [0, 0, 1, 1]
Explanation
anagrams
method
For the example above,
a.size #=> 4
b.size #=> 4
so we don't return nil
in the first line of anagrams
.
Next,
c = a.zip(b)
#=> [["hello", "helio"], ["goodbye", "godbye"],
# ["pants", "Spant"], ["baa", "aba"]]
Assuming for a moment that anagram?
works as desired:
c.map { |e| anagram?(e.first, e.last) ? 1 : 0 }
#=> [0, 0, 1, 1]
Enumerable#map passes each element of c
(a two-element array) into the block.1. It is clearer, however, to decompose (or "disambiguate") those arrays and assign each of the two words they comprise to a block variable2:
c.map { |aw,bw| anagram?(aw,bw) ? 1 : 0 }
#=> [0, 0, 1, 1]
The first element passed in is ["hello", "helio"]
, so
aw => "hello"
bw #=> "helio"
and we execute
anagram?("hello", "helio") ? 1 : 0
#=> 0
which is shorthand for
if anagram?("hello", "helio")
1
else
0
end
#=> 0
anagram?
method
So now let's move on to anagram?
, with
aw = "hello"
bw = "helio"
Since
aw.size == bw.size #=> true
we don't return.
Count frequency of letters in the first word
Let me now write the next few lines of anagram?
slightly differently:
counts = Hash.new(0)
#=> {}
aw_down = aw.downcase
#=> "hello"
aw_down.each_char { |c| counts[c] += 1 }
#=> "hello"
counts
#=> {"h"=>1, "e"=>1, "l"=>2, "o"=>1}
(The last line is there just to show the value of the hash.)
In the first line we create a hash counts
with a default value of zero. All this means is that if counts
does not contain the key k
, counts[k]
will return the default value. Very important: doing so does not change the hash!3
String#each_char4 passes each character of "hello"
into the block and assigns it to the block variable c
. Initially, c='h'
and h={}
. We then execute
counts['h'] += 1
which is shorthand for
counts['h'] = counts['h'] + 1
Since counts
does not yet have a key 'h'
, counts['h']
on the right returns the default value:
counts['h'] = 0 + 1 #=> 1
counts #=> {"h"=>1}
Similarly, after 'e'
and the first 'l'
are passed to the block, we have:
counts #=> {"h"=>1, "e"=>1, "l"=>1}
However, when we pass the second 'l'
, we execute
counts['l'] = counts['l'] + 1
#=> 1 + 1
#=> 2
and we finish up with
counts #=> {"h"=>1, "e"=>1, "l"=>2, "o"=>1}
The method Enumerable#each_with_object will become a good friend
This method is used merely to save some steps. It allows us to write:
counts = Hash.new(0)
aw_down.each_char { |c| counts[c] += 1 }
as
counts = aw_down.each_with_object(Hash.new(0)) { |c,h| h[c] += 1 }
and we can also get rid of the line
aw_down = aw.downcase
by writing
counts = aw.downcase.each_char.with_object(Hash.new(0)) { |c,h| h[c] += 1 }
This may seem like a small saving, but there are many other situations where the use of each_with_object
and other Enumerable
class methods permit the chaining of methods, which is extremely useful.
Decrementing letter counts for letters in the second word
Recall
counts #=> {"h"=>1, "e"=>1, "l"=>2, "o"=>1}
We now execute
bw_down = bw.downcase
#=> "helio"
"helio".each_char do |c|
return false unless counts[c] > 0
counts[c] -= 1
end
First, 'h'
is passed into the block. As counts['h'] #=> 1
, we execute counts['h'] -= 1
, so now
counts #=> {"h"=>0, "e"=>1, "l"=>2, "o"=>1}`.
After passing 'e'
and 'l'
to the block,
counts #=> {"h"=>0, "e"=>0, "l"=>1, "o"=>1}
but when we pass 'i'
, we find
counts['i'] #=> 0
(i.e., the default value of zero is returned, and we don't want to set counts['i']
to -1
) so we return false
, having concluded that the two words are not anagrams. (Had the second word been "heeio"
, we would have returned false
when the second 'e'
was passed to the block.)
Do we have an anagram?
Since two two words have the same length, if we are able to process all characters of the second word without returning false
, we must end up with
counts #=> {"h"=>0, "e"=>0, "l"=>0, "o"=>0}
(no need to check!), meaning the two words are anagrams, so in this case we would return true
to anagrams
.5 Hence, the last line of anagram?
.
Notes
1 Under the hood, this is what's happening:
enum = c.map
#=> #<Enumerator: [["hello", "helio"], ["goodbye", "godbye"],
# ["pants", "Spant"], ["baa", "aba"]]:map>
Here we can see what elements the enumerator will pass into the block, but sometimes you need to convert the enumerator to an array to get that information:
enum.to_a
#=> [["hello", "helio"], ["goodbye", "godbye"],
# ["pants", "Spant"], ["baa", "aba"]]
It is actually the method Array#each that passes the elements of enum
into the block:
enum.each { |aw,bw| anagram?(aw,bw) ? 1 : 0 }
#=> [0, 0, 1, 1]
2 If we pass [[1,2],3]
into a block, and the block variables are written |(a,b),c|
, then a=>1
, b=>2
, c=>3
. This is quite handy. Cool, eh?.
3
h = Hash.new('pig')
h['dog'] = 7 #=> 7
h #=> {"dog"=>7}
h[0] #=> "pig"
h['cat'] #=> "pig"
h[{:a=>1}] #=> "pig"
h #=> {"dog"=>7}
Note there is a form of Hash#new that takes block, which allows keys not in the hash to be added when they are referenced.
4 Instead of aw_down.each_char
we could have written aw_down.chars.each
, but aw_down.chars
creates an unnecessary intermediate array. each_char
, an enumerator, merely passes values as they are required.
5 We could return 0
rather than false
and 1
rather than true
, in which case we could write
a.zip(b).map { |aw,bw| anagram?(aw,bw) }
in anagrams
, but wouldn't it be clearer to have anagrams
return an array whose values are true
or false
, rather than 0
or 1
?