algorithmgraphdepth-first-search

Why can memoisation inside the DFS lead to wrong value?


The problem is from leetcode 2976. Minimum Cost to Convert String I

The problem is that we have to convert source to target. We can use original[i] and replace it with changed[i]. Any pair of original[k] and changed[k] can be repeated at i with cost[i] != cost[k].

source = "abcd",
target = "acbe",
original = ["a","b","c","c","e","d"],
changed = ["b","c","b","e","b","e"],
cost = [ 2, 5, 5, 1, 2, 20]   (minimum cost: 28)

class Solution {
public:
    long long dfs(int source , int target,vector<vector<pair<int,int>>>& adjMatrix,vector<bool>& currentVisited,vector<vector<long long>>& dp ){
        if(source==target){
            return dp[source][target] = 0;
        }
        if(dp[source][target]!=-1){
            return dp[source][target];
        }
        if(currentVisited[source]){
            return LLONG_MAX;
        }
        long long totalCost = LLONG_MAX;
        currentVisited[source] = true;
        for(int i=0;i<adjMatrix[source].size();i++){
            int child = adjMatrix[source][i].first;
            int cost = adjMatrix[source][i].second;
            auto tempres = dfs(child,target,adjMatrix,currentVisited,dp);
            if(tempres!=LLONG_MAX){
                tempres+=cost;
            }
            totalCost = min(totalCost,tempres);
        }
        currentVisited[source] = false;
        return dp[source][target] = totalCost;
    }
    
    long long minimumCost(string source, string target, vector<char>& original, vector<char>& changed, vector<int>& cost) {
        vector<vector<pair<int,int>>> adjMatrix(26,vector<pair<int,int>>());
        unordered_map<string,int> ump;
        // storing only unique combination 
        for(int i=0;i<original.size();i++){
            string key = "";
            key.push_back(original[i]);
            key.push_back(changed[i]);
            if(ump.count(key)>0){
                ump[key]  = min(ump[key],cost[i]);
            }else{
                ump[key] = cost[i];
            }
        }
        for(auto key: ump){
            adjMatrix[key.first[0]-'a'].push_back({key.first[1]-'a',key.second});
        }
        long long count = 0;
        vector<vector<long long>> dp(26,vector<long long>(26,-1));
        for(int i=0;i<26;i++){
            for(int j=0;j<26;j++){
                vector<bool> currentVisited(26,0);
               dfs(i,j,adjMatrix,currentVisited,dp);
            }
        }
        for(int i=0;i<source.size();i++){
            if(source[i]!=target[i]){
                int cost = dp[source[i]-'a'][target[i]-'a'];
                if(cost==LLONG_MAX){
                    return -1;
                }
                count+=cost;
            }
        }
        return count;
    }
};

Is dp[source][target] = totalCost; a premature assignment?
My understanding was for each recursive source since we are traversing through all its links it will be correctly assigned for each recursive calls.

I understand the given approach is not optimal but I am more interested in understanding why this memo will go wrong.

Was expecting it to work fine but it seems
dp[source][target] = totalCost; is giving incorrect result.
But if I just return totalCost and instead store dp[i][j] = dfs(i,j,adjMatrix,currentVisited,dp); it's working normally.

Not able to understand why first case might fail and not able to come with a (minimal) example graph where this will cause failure.


Solution

  • Found why this might go wrong . 
    if we start function in search for a->b
    lets say graph is 
    a->c 7
    c->a 2 
    a->b 1
    starting call f(a,b) find path from a to b . and a is marked visited
    f(a,b) : can go to(c or b)
        f(c,b): can go to(a)
            f(a,b):
                here a is already visited so LLONG_MAX is returned making it seem that f(c,b) is impossible . But thats not true .
                Issue happened because a was visited already while .So incorrect memo because visited is not allowing to navigate path from c->a->b
                return LLONG_MAX causing f(c,b) to store LLONG_MAX which is incorrect
        f(a,b):
            // will memorise properly f(a,b)