I wrote and then rewrote a function because the initial behavior was super bizarre and I didn't know enough about Haskell to understand it. I fixed the behavior with the second version, but I am curious about what was going on with the first one.
First version of the function:
smallerPowers :: Int -> [Int]
smallerPowers k = [10^i | i <- [0,1..], 10^i < k]
I loaded the module I had written this function in into ghci and ran
smallerPowers 5
and got an infinite list being generated, the tail of which is all zeros. So I did
take 5 (smallerPowers 5)
-- output: [1,-8446744073709551616,-2537764290115403776,-6930898827444486144,-4570789518076018688]
which was really weird.
This second version fixed everything, e.g. if I run on 5
I get [1]
:
smallerPowers :: Int -> [Int]
smallerPowers k = [10^i | i <- [0,1.. k], 10^i < k]
Can anyone explain the output I got from the first version of this function?
The list comprehension [10^i | i <- [0,1..], 10^i < k]
basically translates into English as:
Make a list of elements of the form 10^i
, where i
is an element of [0,1..]
, filtering to only include elements where 10^i < k
.
So right off the bat we can tell that list is going to be infinite1; you're generating its elements by selecting i
from an infinite list!
I imagine you were thinking that using the properties of mathematical integers 10^i < k
would be false once i
gets large enough and would always be false for any larger i
, so the list comprehension would "stop at that point". But that's not actually what list comprehension syntax means; it's defined by a very simple translation to basic list operations that makes no attempt to do mathematical proofs to find out if there is a cut-off point in your generators after which no more elements would pass your filters. It just explores the entire generator space [0,1..]
, applying the filter to every single one. If there's a point after which the filter never passes any elements, then trying to enumerate the list comprehension past that point won't stop and find the end of the list, it'll just keep testing and discarding elements from the infinite generator forever.
The reason you don't see this behaviour (which would look like hanging if you tried to print the list past a certain point) is that you've also been bitten by the issue that Int
is not actually a type representing mathematical integers. If you want that you should use Integer
instead. Int
only represents integers that can fit in a fixed number of bits (usually 32 or 64 bits, but it depends on they platform you're on). The maximum value of a signed 64-bit int is 9,223,372,036,854,775,807. That looks like a very big number, but it's only 19 digits. So i
doesn't have to get very large at all for the "true" value of 10^i
to be too large to be stored in an Int
. But arithmetic on Int
s is defined to overflow when that happens, effectively wrapping around and counting up from the minimum value when the result would otherwise exceed the maximum value.
That means for certain values of i
, 10^i
is a negative number, which of course is less than 5
so it gets included in the output of smallerPowers 5
. The infinite tail of 0s would be because at some point the (extremely large) "true" number happens to be all-zero in the part that can actually be represented as an Int
; continuing to multiply that number by 10 only produces more numbers that are 0 (in Int
arithmetic), so there will only be zeroes after that point. But your code as written still says they should be included in the result of smallerPowers 5
(they are Int
s of the form 10^i
that are less than 5), with no reason to stop generating and terminate the list.
1 Technically it's not infinite, because the expression [0,1..]
is also using Int
, not Integer
. It will actually stop once i
hits the maximum bound of an Int
. But actually reaching the end of this list takes an impractically long time, so we don't really need to worry about this particular consequence of the finite range of Int
.