The problem: compute the length of the longest matched/valid substring of parentheses given a string composed only of (
and )
characters.
Here's a naive solution:
isValid :: String -> Bool
isValid = go' (0 :: Int)
where
go' = memo2 go
go 0 [] = True
go 0 (')' : _ ) = False
go n (')' : cs) = go' (n-1) cs
go n ('(' : cs) = go' (n+1) cs
go _ _ = False
solution :: String -> Int
solution = longest . valid . substrings
where
substrings = tails >=> inits
valid = filter isValid
longest = maximum . map length
Now I find I can come up with these inefficient, simple solutions to programming problems quite easily, but the efficient versions are more elusive. The whole idea of Bird-Meertens formalism is that if you can come up with a slow version, the calculus will give you a fast version, and so I was hopeful that it might be of some use.
Additionally, this looks superficially similar to the example on Wikipedia, since we are computing all possible subarrays, mapping something, and taking a maximum at the end.
Can Bird-Meertens help us here?
I attempted to inline definitions and got here:
solution = maximum . map length . concat . map (filter isValid . tails) . inits
But it's not obvious to me that any of the relevant rules are applicable from this point.
I believe this can be done in linear time with quite a compact algorithm. At each position, we will compute a [Int]
whose head is the length of the longest valid string ending at that position and whose remaining elements represent lengths of earlier valid substrings separated by open parens. The answer is then just the largest head. With this intuition in mind, it should be relatively easy to read:
solution :: String -> Int
solution = maximum . map head . scanl cons [0] where
cons ns = \case
')' -> case ns of
n:n':nt -> n+n'+2:nt
_ -> [0]
'(' -> 0:ns
_ -> [0]
Unfortunately, this solution was created not via Birds-Meerten-style calculation, but rather via the Feynman algorithm:
I suspect with some effort, I could find some calculation that transforms this into an inefficient but readable specification (perhaps not the exact specification you started with), and then I could present that by reading the calculation backwards to show how to move from spec to optimized version, but is it really worth it?