I was working in Python and created an example of how we can generalize the usual sumList
function to sum any arbitrary list of lists:
def sumList(xs):
if xs==[]:
return 0
else:
return xs[0]+sumList(xs[1:])
And in Haskell:
sumList [] = 0
sumList (x:xs) = x + sumList xs
def sumList2(xs):
if xs==[]:
return 0
elif isinstance(xs,int):
return xs
else:
return sumList2(xs[0])+sumList2(xs[1:])
But the following Haskell code throws a compiler error in the type inference:
sumList [] = 0
sumList (x:xs) = sumList x + sumList xs
sumList x = x
<interactive>:2:38: error:
• Couldn't match expected type ‘t’ with actual type ‘[t]’
‘t’ is a rigid type variable bound by
the inferred type of sumList :: t -> [t]
at <interactive>:(1,1)-(3,13)
• In the first argument of ‘sumList’, namely ‘xs’
In the second argument of ‘(+)’, namely ‘sumList xs’
In the expression: sumList x + sumList xs
• Relevant bindings include
xs :: [t] (bound at <interactive>:2:12)
x :: t (bound at <interactive>:2:10)
sumList :: t -> [t] (bound at <interactive>:1:1)
I am using IHaskell, for what I understand, when using sumlist [] = 0
as pattern matching, Haskell inferrs that sumList :: [a] -> number
, but then when I pass sumList
x
in the next line, x
is the head of the list, so sumList :: a -> number
, so it gets an absurd. What I figured is that I need to match the case where the input is a list and do the 2 first lines, and then the case where the input is an integer and do the last line, but I cannot figure out how to do it.
I am trying to create a sum type of Either Int [a]
, but I don't know if that is the better answer, because I want to let the compiler figure out everything related to types (ultimately I will write the type definition, but my initial thesis is that contrary to popular belief, it is possible to do quick experimentation in Haskell, and so that means to not be concerned with types in the first stage of programing). Note the shortcomings of the for loop, it would be very difficult or impossible to generalize the sum of the list using for loops, because of the linearity of the loop, I think it is a very compelling argument for the scaling of machine learning software.
Int
, [Int]
, [[Int]]
, and so on are different types, and the type of a function is chosen at the moment it's called. If we expect our function to return an Int
, and we have the two clauses
sumList (x:xs) = {- does not matter what goes here -}
sumList x = x
then we're guaranteed that no matter which input type we choose, we have a type error. If we choose Int
, then the first clause is an error because x:xs
can only match lists; if we choose any list type, then the second clause is an error because it does not return an Int
.
However, not all is lost. One thing we can do is split these two clauses into separate functions that happen to share a name using type classes.
class Sums a where sums :: a -> Int
instance Sums Int where sums x = x
instance Sums a => Sums [a] where
sums [] = 0
sums (x:xs) = sums x + sums xs
Now when we call sums
, we are never calling a function that has both clauses; instead, we are asking the compiler to call one or the other of these functions depending on which concrete type was chosen. The line
sums (x:xs) = sums x + sums xs
may look suspicious, since we know that x
and xs
do not have the same type. But we are rescued by the fact that these are not calling the same sums
function! Again, the type of a function is chosen at the moment it's called, and here there are two different calls that may choose two different types, at which point the compiler will choose different functions to call.
The other option we have is to make a new type. The trouble with Int
, [Int]
, [[Int]]
, and so on being different types is that this means we must statically know how deeply nested a list is. If we don't know that, then we need a type that can express the fact that its nesting depth is only known dynamically, and which tracks the data it needs to know to determine how deep that nesting is. One way to do so would be:
data DynList a = NotNested a | Nested [DynList a]
sumDynList :: DynList Int -> Int
sumDynList (NotNested x) = x
sumDynList (Nested []) = 0
sumDynList (Nested (x:xs)) = sumDynList x + sumDynList (Nested xs)
I've written this to match as closely as possible your original code, but in reality I'd probably write the nested case this way:
sumDynList (NotNested x) = x
sumDynList (Nested xs) = sum (map sumDynList xs)
This particular type allows different amounts of nesting at different points in the data structure. If the intent is instead that there is a single nesting depth, and all list elements have that same depth, then we need a slightly more advanced technique.
data SharedDepthList a = Depth0 a | Deeper (SharedDepthList [a])
sumSharedDepthList :: SharedDepthList Int -> Int
sumSharedDepthList = go id where
go :: (a -> Int) -> SharedDepthList a -> Int
go f (Depth0 x) = f x
go f (Deeper xs) = go (sum . map f) xs
I think mostly I do not recommend this. Data structures with this much precision about the allowed possible values get quite unwieldy in practice.