swiftalgorithmdynamic-programmingmemoizationlcs

Issue with implementation of longest common subsequence


I'm doing this challenge: https://leetcode.com/problems/longest-common-subsequence/

I've tried a number of ways to debug my code, but I can't figure out what's wrong.

Like in my head the way I visualize the traversal, starts from top left corner 0,0 with a value of 0

   o a f c
 d 0 0 0 0 
 a 0 1 1 1
 f 0 1 2 2
 c 0 1 2 3
class Solution {
    func longestCommonSubsequence(_ text1: String, _ text2: String) -> Int {
        var t1: [Character] = Array(text1)
        var t2: [Character] = Array(text2)

        var cache: [Stage: Int] = [:]

        func helper(s: Stage, current: Int) ->  Int { 
            guard s.s1 < t1.count && s.s2 < t2.count else {            
                return current
            }
        
            guard cache[s] == nil else {
                return cache[s]!
            }

            if t1[s.s1] == t2[s.s2] {
                let stage = Stage(s.s1 + 1, s.s2 + 1)
                cache[s] = helper(s: stage, current: current + 1)
                return cache[s]!                 

            } else {
                let stage1 = Stage(s.s1, s.s2 + 1)
                cache[stage1] = helper(s: stage1, current: current)
                let stage2 = Stage(s.s1 + 1, s.s2)
                cache[stage2] = helper(s: stage2, current: current)
                cache[s] = max(cache[stage1]!, cache[stage2]!, current) 

                return cache[s]!
            }
        }
        helper(s: Stage(0,0), current: 0)

        return cache.map{$1}.max()!
    }
}

struct Stage: Hashable {
    var s1: Int
    var s2: Int
    
    init(_ s1: Int, _ s2: Int) {
        self.s1 = s1
        self.s2 = s2
    }
}

let s = Solution()
assert(s.longestCommonSubsequence("ae", "be") == 1)
assert(s.longestCommonSubsequence("fabc", "dafc") == 2)
assert(s.longestCommonSubsequence("abc", "dafc") == 2)
assert(s.longestCommonSubsequence("afc", "dafc") == 3)
assert(s.longestCommonSubsequence("oafc", "dafc") == 3) // FAILURE: it's equal to 1

Solution

  • The problem is that your logic makes no sense. You shouldn't even be passing current along at all. To see why, let's go back to the definition of the longest common subsequence:

    Define lcs(i, j) as the longest common subsequence of strings s[i..] and t[j..] (i.e. the LCS of the suffixes starting at the i-th and j-th character of s and t, respectively). Then lcs(0, 0) is the answer to the problem.

    We can define lcs(i, j) recursively as follows (in pseudo-code):

    lcs(i, j)
      | if i = |s| or j = |t| -> 0
      | if s[i] = s[j] -> 1 + lcs(i + 1, j + 1)
      | otherwise -> max(lcs(i + 1, j), lcs(i, j + 1))
    

    The first rule says that if either s[i..] or t[j..] is empty, then the LCS is 0, which makes sense because you can't have any characters in common with an empty string.

    The second rule says that if the first characters of s[i..] and t[j..] are equal, then you should keep that shared character (hence the 1 + ), and continue to solve the problem recursively for s[i+1..] and t[j+1..].

    The third rule says that if the first characters of s[i..] and t[j..] are not equal, then obviously you must delete the first character either s or t (or possibly both, but that case is handled recursively).

    Now, to implement this in Swift you can translate this logic directly into your helper:

    class Solution {
        func longestCommonSubsequence(_ text1: String, _ text2: String) -> Int {
            var t1: [Character] = Array(text1)
            var t2: [Character] = Array(text2)
    
            func helper(s: Stage) -> Int {
                guard s.s1 < t1.count && s.s2 < t2.count else {
                    return 0
                }
                if t1[s.s1] == t2[s.s2] {
                    return helper(s: Stage(s.s1 + 1, s.s2 + 1)) + 1
                }
                return max(helper(s: Stage(s.s1 + 1, s.s2)), helper(s: Stage(s.s1, s.s2 + 1)))
            }
    
            return helper(s: Stage(0, 0))
        }
    }
    

    This is actually an important step! Whenever you have a solution that uses recursion with memoization, and it doesn't work correctly, first implement the recursive solution without memoization. You shouldn't have posted a question here that uses memoization when you haven't verified that the recursive solution works.

    The above code actually gives the correct answer! But it's woefully inefficient. To fix that, simply add the memo without changing the computation logic:

    class Solution {
        func longestCommonSubsequence(_ text1: String, _ text2: String) -> Int {
            var t1: [Character] = Array(text1)
            var t2: [Character] = Array(text2)
    
            var cache: [Stage: Int] = [:]
    
            func helper(s: Stage) -> Int {
                guard s.s1 < t1.count && s.s2 < t2.count else {
                    return 0
                }
                guard cache[s] == nil else {
                    return cache[s]!
                }
                if t1[s.s1] == t2[s.s2] {
                    cache[s] = helper(s: Stage(s.s1 + 1, s.s2 + 1)) + 1
                } else {
                    cache[s] = max(helper(s: Stage(s.s1 + 1, s.s2)), helper(s: Stage(s.s1, s.s2 + 1)))
                }
                return cache[s]!
            }
            return helper(s: Stage(0, 0))
        }
    }