pythonalgorithmdirected-acyclic-graphslongest-pathweighted-graph

Function takes a long time


im currently working on trying to get the the number of unique paths from node 1 .. N of maximum length for a weighted directed acyclic graph, i have worked out getting the max length but i am stuck on getting the NUMBER of paths of that given max length...

Data is inputted like this:

91 120  # Number of nodes, number of edges

1 2 34

1 3 15

2 4 10

.... As Node 1-> Node 2 with a weight of 34,

I input my data using a diction so my dict looks like:
_distance = {}
_distance = {1: [(2, 34), (3, 15)], 2: [(4, 10)], 3: [(4, 17)], 4: [(5, 36), (6, 22)], 5: [(7, 8)],...ect

I have worked out how to achieve the longest length of the paths using this:

first i make a list of vertices

class Vertice:
    def __init__(self,name,weight=0,visted=False):
        self._n = name
        self._w = weight
        self._visited = visted
        self.pathTo

for i in range(numberOfNodes): # List of vertices (0-n-1)
  _V = Vertice(i) 
  _nodes.append(_V)

next i iterate through my dictionary setting each node to the maximum weight it can be

        for vert, neighbors in _distance.iteritems():
        _vert = _nodes[vert-1] # Current vertice array starts at 0, so n-1


        for x,y in neighbors:  # neighbores,y = weight of neighbors
            _v = _nodes[x-1]   # Node #1 will be will be array[0]

            if _v._visited == True:
                if _v._w > _vert._w+y:
                    _v._w = _v._w
                else:
                    _v._w = y + _vert._w

            else:

                _v._w = y + _vert._w
                _v._visited = True

with this done, the last node will have a weight of the maximum so i can just call

max = _nodes[-1]._w

to get the max weight. This seems to perform fast and has no trouble finding the max length path even when performed on the bigger data set, i then take my max value and run it into this function:

#  Start from first node in dictionary, distances is our dict{}
#  Target is the last node in the list of nodes, or the total number of nodes.
numLongestPaths(currentLocation=1,target=_numNodes,distances=_distance,maxlength=max)

def numLongestPaths(currentLocation,maxlength, target, sum=0, distances={}):


    _count = 0

    if currentLocation == target:
        if sum == maxlength:
                _count += 1

    else:
        for vert, weight in distances[currentLocation]:
            newSum = sum + weight
            currentLocation = vert
            _count += numLongestPaths(currentLocation,maxlength,target,newSum,distances)

    return _count

I simply check once we have hit the end node if our current sum is the max, if it is, add one to our count, if not pass.

This works instantly for the inputs such as 8 nodes and longest path is 20, finding 3 paths, and for inputs such as 100 nodes, longest length of 149 and only 1 unique path of that length, but when i try to do a data set with 91 nodes such as longest path 1338 and number of unique paths are 32, the function takes extremely LONG, it works but is very slow.

Can someone give me some tips on what is wrong with my function to cause it to take so long finding the # of paths length X from 1..N? i'm assuming its getting an exponential run time but i'm unsure how to fix it

Thank you for your help!

EDIT: Okay i was overthinking this and going about this the wrong way, i restructured my approach and my code is now as follows:

# BEGIN SEARCH.
    for vert, neighbors in _distance.iteritems():
        _vert = _nodes[vert-1] # Current vertice array starts at 0, so n-1


        for x,y in neighbors:  # neighbores

            _v = _nodes[x-1]   # Node #1 will be will be array[0]

            if _v._visited == True:
                if _v._w > _vert._w+y:
                    _v._w = _v._w
                elif _v._w == _vert._w+y:
                        _v.pathsTo += _vert.pathsTo
                else:
                    _v.pathsTo = _vert.pathsTo
                    _v._w = y + _vert._w

            else:

                _v._w = y + _vert._w
                _v.pathsTo = max(_vert.pathsTo, _v.pathsTo + 1)
                _v._visited = True

i added a pathsTo variable to my Vertice class, and that will hold the number of unique paths of MAX length


Solution

  • Your numLongestPaths is slow because you're recursively trying every possible path, and there can be exponentially many of those. Find a way to avoid computing numLongestPaths for any node more than once.

    Also, your original _w computation is broken, because when it computes a node's _w value, it does nothing to ensure the other _w values it's relying on have themselves been computed. You will need to avoid using uninitialized values; a topological sort may be useful, although it sounds like the vertex labels may have already been assigned in topological sort order.