I'm debugging a Haskell program using GHC's debugger (`ghci`, invoked via `stack ghci`), specifically with the Haskell GHCi Debug Adapter (Phoityne extension) in Visual Studio Code, and I'm encountering a major issue when stopping the debugger: the system becomes unresponsive, and I have to use `htop` to identify which process is consuming resources. I suspect that some resources (e.g., threads, memory allocations, or file handles) are not being properly released when stopping the debugger.
```haskell
readMSet :: FilePath -> IO (MSet String)
readMSet fileName = do
  content <- readFile fileName
  let mset = processMSet content
  return mset

processMSet :: String -> MSet String
processMSet content =
  let ls = lines content
  in foldl add (MS []) ls -- Create the multiset

main = do
  m1 <- readMSet "path_to_file"
```
In particular, the debugger crashes in `main`, probably due to issues related to lazy evaluation. I suspect I might need to use `deepseq` to force evaluation, but that's not the principal issue for this topic.
Any insights or workarounds would be greatly appreciated! Is there a specific setup or practice for handling such situations in Haskell debugging?

Is there a way to set a timer or timeout to automatically clean up (free) resources if a computation gets stuck in a loop or runs for too long?

Can I configure the debugger (or `ghci`) to safely terminate processes and free up system resources when I quit using the stop button?
I've tried manually interrupting with Ctrl+C, but the system still becomes unresponsive after stopping.

One way to prevent accidental memory exhaustion in GHCi is specifying a heap size limit for GHC. Since you are invoking GHCi via Stack and the Phoityne extension, that would be done by adding `--ghc-options="+RTS -M1G"` (or whatever size is reasonable instead of `1G`) to the `stack ghci` invocation specified by `ghciCmd` in the Phoityne configuration.
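For illustration, a sketch of what that could look like in the VS Code `launch.json` used by Phoityne. The exact default `ghciCmd` varies between extension versions, so treat the command string here as an assumption and splice the `--ghc-options` flag into whatever your configuration already contains:

```json
{
    "type": "ghc",
    "request": "launch",
    "name": "haskell(stack)",
    "ghciCmd": "stack ghci --with-ghc=phoityne-vscode --ghc-options=\"+RTS -M1G\" --test --no-load --no-build --main-is TARGET"
}
```

With the `-M` limit in place, a runaway computation aborts with a heap-overflow exception instead of exhausting system memory.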
> I've tried manually interrupting with Ctrl+C, but the system still becomes unresponsive after stopping.
My unproven theory is that this behaviour has to do with garbage collection. Since GC itself requires additional heap space, if the Ctrl+C is done too close to memory exhaustion, GHC's very attempt to reclaim memory might push it over the edge. If my hunch is correct, GHC issue #24398, in which a heap overflow in GHCi with a heap size limit is followed by a second overflow that brings down GHCi, could well have the same underlying cause. (In fact, it could be worth reporting the observations here in that issue! I'll look into doing that.)
> In particular, my debugger crashes in the main function, probably due to issues related to lazy evaluation. I suspect I might need to use DeepSeq to force evaluation, but that's not the principal issue for this topic.
The immediate problem is almost certainly the use of `foldl`, as discussed in foldl versus foldr behavior with infinite lists. When dealing with lists, the rule of thumb is to use `foldr` if the fold result will be consumed lazily, and `foldl'` (the strict left fold) if it won't. In your case, building up a multiset requires evaluating the elements being inserted and their multiplicities anyway, so `foldl'` should be the better choice. One important caveat, though, is that effective use of `foldl'` calls for strictness in the result being built up. That being so, you'll probably find it advantageous to implement your multiset in terms of something less lazy than vanilla lists; a map from `Data.Map.Strict` could be a good option. Those changes will likely make it unnecessary to reach for `deepseq`. For further discussion of this matter, see Tom Ellis' Make invalid laziness unrepresentable.
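To illustrate, here is a minimal sketch of that suggestion, assuming a multiset represented as a strict `Map` from elements to multiplicities. The `MSet`, `add`, and `processMSet` names mirror your code, but this representation is an assumption, not your actual `MS` constructor:

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Hypothetical multiset: a strict map from element to multiplicity.
type MSet a = Map.Map a Int

-- Insert one occurrence; insertWith on a strict map forces the
-- updated multiplicity, so no chain of (+1) thunks builds up.
add :: Ord a => MSet a -> a -> MSet a
add ms x = Map.insertWith (+) x 1 ms

-- Strict left fold over the lines of the input.
processMSet :: String -> MSet String
processMSet content = foldl' add Map.empty (lines content)

main :: IO ()
main = print (processMSet "a\nb\na")
-- prints: fromList [("a",2),("b",1)]
```

Because `Data.Map.Strict.insertWith` evaluates each new multiplicity as it is inserted and `foldl'` forces the accumulator at every step, the fold runs in space proportional to the number of distinct elements rather than accumulating deferred computations over the whole file.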