algorithmlanguage-agnosticdynamic-programming

Number of Common sub sequences of two strings


Revisiting Longest Common Subsequence, I was wondering what is the number of common subsequences of 2 strings. I tried to formulate a recurrent relation : DP[i][j] represents the number of subsequences ending at max i in first string and j in second. Recurrent relation being :

DP[i][j]= DP[i-1][j-1] + DP[i][j-1] + DP[i-1][j] when A[i]==B[j] (A is my first string while B is the second)

and

DP[i][j]=DP[i-1][j]+DP[i][j-1] other wise

It is not giving correct results!

Explanation if A[i]==B[j], then , We can add A[i] to every common subsequence till i-1 and j-1 to form DP[i-1][j-1] new subsequences. Also We have to add the subsequences formed by dropping the last character from each string.

other wise we can only add the ways by dropping last characters!

If someone can correct this recurrence? Morover , I will love to see a formal proof.(Just getting into the habit of seeing a formal proof(You can provide a hint to prove it myself too.))

EDIT: I forgot to mention the base cases I was considering

DP[i][0]=1 for all i in length of A

and DP[0][j]=1 for all j in length of B

and also

DP[0][0]=1

(denoting empty subsequence)


Solution

  • DP[i-1][j] and DP[i][j-1] shall have DP[i-1][j-1] common sub-sequences which will be double counted.

    Change your recurrence to:

    DP[i][j]= DP[i][j-1] + DP[i-1][j] when A[i]==B[j]

    and

    DP[i][j]=DP[i-1][j]+DP[i][j-1]-DP[i-1][j-1] other wise

    Explanation:
    In your original relations, I have just subtracted a term DP[i-1][j-1]. This is because DP[i][j-1] and DP[i-1][j] both include DP[i-1][j-1]. Since we are adding these two terms, the term DP[i-1][j-1] is getting double counted, so we need to subtract it once.