pythonalgorithmdynamicgreedy

Reduce the sum of differences between adjacent array elements


I came across a coding challenge on the internet the question is listed below:

Have the function FoodDistribution(arr) read the array of numbers stored in arr which will represent the hunger level of different people ranging from 0 to 5 (0 meaning not hungry at all, 5 meaning very hungry). You will also have N sandwiches to give out which will range from 1 to 20. The format of the array will be [N, h1, h2, h3, ...] where N represents the number of sandwiches you have and the rest of the array will represent the hunger levels of different people. Your goal is to minimize the hunger difference between each pair of people in the array using the sandwiches you have available.

For example: if arr is [5, 3, 1, 2, 1], this means you have 5 sandwiches to give out. You can distribute them in the following order to the people: 2, 0, 1, 0. Giving these sandwiches to the people their hunger levels now become: [1, 1, 1, 1]. The difference between each pair of people is now 0, the total is also 0, so your program should return 0. Note: You may not have to give out all, or even any, of your sandwiches to produce a minimized difference.

Another example: if arr is [4, 5, 2, 3, 1, 0] then you can distribute the sandwiches in the following order: [3, 0, 1, 0, 0] which makes all the hunger levels the following: [2, 2, 2, 1, 0]. The differences between each pair of people is now: 0, 0, 1, 1 and so your program should return the final minimized difference of 2.

My first approach was to try to solve it greedily as the following:

  1. Loop until the sandwiches are zero
  2. For each element in the array copy the array and remove one hunger at location i
  3. Get the best combination that will give you the smallest hunger difference
  4. Reduce the sandwiches by one and consider the local min as the new hunger array
  5. Repeat until sandwiches are zero or the hunger difference is zero

I thought when taking the local minimum it led to the global minimum which was wrong based on the following use case [7, 5, 4, 3, 4, 5, 2, 3, 1, 4, 5]

def FoodDistribution(arr):
    sandwiches = arr[0]
    hunger_levels = arr[1:]

    # Function to calculate the total difference
    def total_difference(hunger_levels):
        return sum(abs(hunger_levels[i] - hunger_levels[i + 1]) for i in range(len(hunger_levels) - 1))

    def reduce_combs(combs):
        local_min = float('inf')
        local_min_comb = None
        for comb in combs:
            current_difference = total_difference(comb)
            if current_difference < local_min:
                local_min = current_difference
                local_min_comb = comb

        return local_min_comb
    # Function to distribute sandwiches
    def distribute_sandwiches(sandwiches, hunger_levels):
        global_min = total_difference(hunger_levels)
        print(global_min)
        while sandwiches > 0 and global_min > 0:
            combs = []
            for i in range(len(hunger_levels)):
                comb = hunger_levels[:]
                comb[i] -= 1
                combs.append(comb)

            local_min_comb = reduce_combs(combs)
            x = total_difference(local_min_comb)
            print( sandwiches, x, local_min_comb)
            global_min = min(global_min, x)
            hunger_levels = local_min_comb
            sandwiches -= 1
        return global_min

    # Distribute sandwiches and calculate the minimized difference
    global_min = distribute_sandwiches(sandwiches, hunger_levels)
    return global_min

if __name__ == "__main__":
    print(FoodDistribution([7, 5, 4, 3, 4, 5, 2, 3, 1, 4, 5]))

I changed my approach to try to brute force and then use memorization to optimize the time complexity

  1. Recurse until out of bounds or sandwiches are zero
  2. For each location there are two options either to use a sandwich or ignore
  3. When the option is to use a sandwich decrement sandwiches by one and stay at the same index.
  4. When the option is to ignore increment the index by one.
  5. Take the minimum between the two options and return it.

The issue here is that I didn't know what to store in the memo and storing the index and sandwiches is not enough. I am not sure if this problem has a better complexity than 2^(n+s). Is there a way to know if dynamic programming or memorization is not the way to solve the problem and in this case can I improve the complexity by memorization or does this problem need to be solved with a different approach?

def FoodDistribution(arr):
    sandwiches = arr[0]
    hunger_levels = arr[1:]

    # Distribute sandwiches and calculate the minimized difference
    global_min = solve(0, sandwiches, hunger_levels)
    return global_min


def solve(index, sandwiches, hunger_levels):
    if index >= len(hunger_levels) or sandwiches == 0:
        return total_difference(hunger_levels)

    # take a sandwich
    hunger_levels[index] += -1
    sandwiches += -1
    minTake = solve(index, sandwiches, hunger_levels)
    hunger_levels[index] += 1
    sandwiches += 1

    # dont take sandwich
    dontTake = solve(index + 1, sandwiches, hunger_levels)

    return min(minTake, dontTake)


def total_difference(hunger_levels):
    return sum(abs(hunger_levels[i] - hunger_levels[i + 1]) for i in range(len(hunger_levels) - 1))

if __name__ == "__main__":
    print(FoodDistribution([7, 5, 4, 3, 4, 5, 2, 3, 1, 4, 5]))

Edit: Multiple states will give you the optimal answer for the use case above

sandwiches = 7 
hunger = [5, 4, 3, 4, 5, 2, 3, 1, 4, 5]
optimal is 6
states as follow
[3, 3, 3, 3, 3, 2, 2, 1, 4, 5]
[4, 3, 3, 3, 3, 2, 2, 1, 4, 4]
[4, 4, 3, 3, 2, 2, 2, 1, 4, 4]
[4, 4, 3, 3, 3, 2, 1, 1, 4, 4]
[4, 4, 3, 3, 3, 2, 2, 1, 3, 4]
[4, 4, 3, 3, 3, 2, 2, 1, 4, 4]
[5, 4, 3, 3, 3, 2, 2, 1, 3, 3]

Note: I accepted @Matt Timmermans answer as it provides the best time complexity n and nlogn. But the two other answer are amazing and good to understand and be able to implement the solution using dynamic programming or memorization. Personally I prefer the memorization version expected time complexity is snh where h is the max hunger level in the array.


Solution

  • The sum of the absolute differences only goes down when you reduce a local maximum.

    If you reduce a maximum on either end, the sum of differences goes down by one, like [3,2,1] -> [2,2,1].

    If you reduce a maximum in the middle, the sum of differences goes down by two, like [1,3,2] -> [1,2,2].

    If a maximum gets reduced, it may merge into another maximum that you can reduce, but the new maximum will never be cheaper or more cost effective. It can only get wider, like [1,3,2] -> [1,2,2].

    The optimal strategy is, therefore, just to repeatedly reduce the most cost-effective maximum, in terms of benefit/width, that you have enough sandwiches to reduce. benefit is 1 for maximums on the ends or 2 for maximums in the middle.

    Stop when you no longer have enough sandwiches to reduce the narrowest maximum.

    You can do this in O(n) time by finding all the maximums and keeping them in a priority queue to process them in the proper order as they are reduced.

    O(n log n) is easy. In order to make that O(n) bound, you'll need to use a counting-sort-type priority queue instead of a heap. You also need to be a little clever about keeping track of the regions of the array that are known to be at the same height so you can merge them in constant time.

    Here's an O(n) implementation in python

    def distribute(arr):
    
        foodLeft = arr[0]
        hungers = arr[1:]
    
        # For each element in hungers, calculate number of adjacent elements at same height
        spans = [1] * len(hungers)
        for i in range(1, len(hungers)):
            if hungers[i-1]==hungers[i]:
                spans[i] = spans[i-1]+1
        for i in range(len(hungers)-2, -1, -1):
            if hungers[i+1]==hungers[i]:
                spans[i] = spans[i+1]
    
        # spans are identified by their first element.  Only the counts and hungers on the edges need to be correct
    
        # if a span is a maximum, it's height.  Otherwise 0
        def maxHeight(left):
            ret = len(spans)
            if left > 0:
                ret = min(ret, hungers[left] - hungers[left-1])
            right = left + spans[left]-1
            if right < len(spans)-1:
                ret = min(ret, hungers[right] - hungers[right+1])
            return max(ret,0)
        
        # change the height of a span and return the maybe new span that it is a part of
        def reduce(left, h):
            right = left + spans[left] - 1
            hungers[left] -= h
            hungers[right] = hungers[left]
            if right < len(spans)-1 and hungers[right+1] == hungers[right]:
                # merge on the right
                w = spans[right+1]
                spans[right] = spans[right+1] = 0 # for debuggability
                right += w
            if left > 0 and hungers[left-1] == hungers[left]:
                # merge on left
                w = spans[left-1]
                spans[left] = spans[left-1] = 0 # for debuggability
                left -= w
            spans[left] = spans[right] = right - left + 1
            return left
        
        def isEdge(left):
            return left < 1 or left + spans[left] >= len(spans)
        
        # constant-time priority queue for non-edge spans
        # it's just a list of spans per width
        pq = [[] for _i in range(len(spans)+1)]
    
        # populate priority queue
        curspan = 0
        while curspan < len(spans):
            width = spans[curspan]
            if maxHeight(curspan) > 0 and not isEdge(curspan):
                pq[width].append(curspan)
            curspan += width
    
        # this will be True at the end if we can sacrifice one edge max selection to get one
        # mid max selection, which would produce one more point
        canBacktrack = False
        # process mid spans in order
        curpri = 1
        # while not all hungers are the same
        while spans[0] < len(spans):
    
            # find the best middle maximum
            bestmid = None
            midwidth = None
            if curpri < len(pq) and curpri <= foodLeft:
                if len(pq[curpri]) == 0:
                    curpri += 1
                    continue
                bestmid = pq[curpri][-1]
                midwidth = spans[bestmid]
    
            # find the best edge maximum
            bestedge = None
            edgewidth = None
            if maxHeight(0) > 0 and foodLeft >= spans[0]:
                bestedge = 0
                edgewidth = spans[0]
            r = len(spans)-spans[-1]
            if maxHeight(r) > 0 and foodLeft >= spans[r] and (bestedge == None or spans[r] < edgewidth):
                bestedge = r
                edgewidth = spans[r]
    
            # choose
            bestspan = None
            h = 0
            if bestedge == None:
                if bestmid == None:
                    break
                bestspan = bestmid
                bestwidth = midwidth
                canBacktrack = False
            elif bestmid == None:
                bestspan = bestedge
                bestwidth = edgewidth
                canBacktrack = False
            elif midwidth <= edgewidth*2:
                # mid maximum is more cost effective
                # OR choo
                bestspan = bestmid
                bestwidth = midwidth
                canBacktrack = False
            else:
                bestspan = bestedge
                bestwidth = edgewidth
                # tentative
                canBacktrack = True
            
            if bestspan == bestmid:
                # chose the middle span -- remove from pq
                pq[curpri].pop()
    
            # how much we can reduce this maxium by
            h = min(foodLeft//bestwidth, maxHeight(bestspan))
            foodLeft -= bestwidth*h
            canBacktrack = canBacktrack and foodLeft < midwidth and foodLeft + edgewidth >= midwidth
            bestspan = reduce(bestspan, h)
            if maxHeight(bestspan) > 0 and not isEdge(bestspan):
                pq[spans[bestspan]].append(bestspan)
        
        # finally, calculate the new total diffs
        totaldiff = 0
        curspan = spans[0]
        while curspan < len(spans):
            totaldiff += abs(hungers[curspan] - hungers[curspan-1])
            curspan += spans[curspan]
        if canBacktrack:
            totaldiff -= 1
        return totaldiff
    
    # test
    cases = [
        [8, 11, 14, 15, 16, 13, 2, 3],
        [7, 5, 4, 3, 4, 5, 2, 3, 1, 4, 5],
        [2, 4, 4, 3, 4, 5],
        [3, 3, 4, 4, 4, 3, 4],
        [4, 3, 4, 4, 4, 3, 5],
        [5, 3, 4, 4, 4, 3, 6],
        [3, 3, 4, 4, 3, 4, 5]
    ]
    for case in cases:
        print("{0}: {1}".format(case, distribute(case)))