Apologies if this is too specific, I am new here and not exactly sure what is reasonable. I have been bashing my head against this problem for hours with nothing to show for it. The following code is my implementation of a competitive programming problem.
module Main where
import Data.List (foldl', groupBy)
import Debug.Trace
type Case = (Int, [(Int, Int)])
type Soln = Int
main = interact handle
handle :: String -> String
handle = fmt . solve . parse
fmt :: Soln -> String
fmt s = (show s) ++ "\n"
parse :: String -> Case
parse s = (l, fs)
where
(l:_:fs') = (map read) $ words s
fs = pairs fs'
pairs :: [a] -> [(a, a)]
pairs [] = []
pairs (a:b:s) = (a, b):(pairs s)
solve :: Case -> Soln
solve c@(l, fs) = last $ foldl' run [0..l] f
where
f = concat $ map rep $ map combine $ groupBy samev fs
samev a b = (snd a) == (snd b)
combine a = (sum $ map fst $ a, snd $ head $ a)
rep (n, v) = replicate (min n (l `div` v)) v
run :: [Int] -> Int -> [Int]
run b v = (take v b) ++ (zipWith min b (drop v b))
-- run b v = (take v b) ++ (zipMin b (drop v b))
zipMin :: [Int] -> [Int] -> [Int]
zipMin [] _ = []
zipMin _ [] = []
zipMin (a:as) (b:bs) = (min a b):(zipMin as bs)
The intent is that this works like a bottom-up dynamic programming solution generating each row of the DP table from the previous using the fold in solve. In theory GHC should be able to optimize out all the old rows of the table. However, running this program on a moderately large input with approximately l = 2000
and length f = 5000
, I get this:
> time ./E < E.in
0
1.98user 0.12system 0:02.10elapsed 99%CPU (0avgtext+0avgdata 878488maxresident)k
0inputs+0outputs (0major+219024minor)pagefaults 0swaps
That's using 878 MB of memory at peak! The table I am generating is only 10,000 Ints, and in theory I only need one row at a time! It seems obvious that this is some form of thunk leak or other space leak. Profiling reveals that run
is consuming 99% of total runtime and memory. Digging further indicated that 98% of that was in the zipWith
call. Interestingly, replacing the call to zipWith min
with my custom zipMin
function produces a significant improvement:
> time ./E < E.in
0
1.39user 0.08system 0:01.48elapsed 99%CPU (0avgtext+0avgdata 531400maxresident)k
0inputs+0outputs (0major+132239minor)pagefaults 0swaps
That's just 70% the run time, and 60% the memory! I tried all sorts to make this work. I know (++)
is generally a bad idea, so I replaced the lists in run
with Data.Sequence.Seq Int
... and it got slower and used more memory. I am not particularly experienced with Haskell, but I am at my wit's end here. I am sure the answer to this problem exists somehwere on SO, but I am too new to Haskell to be able to find it, it seems.
Any help any of you can offer is very much appreciated. I would love an explanation of exactly what I have done wrong, how to diagnose it in future, and how to fix it.
Thanks in advance.
EDIT:
After following Steven's Excellent advice and replacing my lists with unboxed vectors the performance is... uh... signficantly improved:
> time ./E < E.in
0
0.01user 0.00system 0:00.02elapsed 80%CPU (0avgtext+0avgdata 5000maxresident)k
24inputs+0outputs (0major+512minor)pagefaults 0swaps
So, by using foldl'
you have ensured that the accumulator will be in WHNF. Putting a list in WHNF only evaluates the first element of the list. The remainder of the list exists as a thunk, and will be passed around as a thunk to the subsequent calls of your fold. Since you are interested in multiple positions in the list at once (that is, you are dropping some parts of it in the zipWith
) large portions of the lists are being kept from previous iterations.
The structure you need here is an unboxed vector. An unboxed vector will ensure that everything is maximally strict, and will run in far less memory.