pythonalgorithmdata-structurestime-complexityfactory

Find the minimum number of steps to half the sum of elements in a list where each step halves an item in the list in O(N)


I came across an interview question that went like this:

There are factories in an area which produce a pollutive gas and filters are to be installed at each factory to reduce the pollution. Each filter installed would half the pollution in that factory. Each factory can have multiple filters. There is a list of N integers representing the level of pollution in each of the N factories in the area. Find the minimum number of filters needed to half the overall pollution.

E.g. - Let [3, 5, 6, 1, 18] be the list of pollution levels in 5 factories

  • Overall pollution = 3+5+6+1+18 = 33 (target is 33/2 = 16.5)

  • Install a filter in factory given by index=4 -- > pollution levels will be [3, 5, 6, 1, 9]

  • Install a filter in factory given by index=4 -- > pollution levels will be [3, 5, 6, 1, 4.5]

  • Install a filter in factory given by index=2 -- > pollution levels will be [3, 5, 3, 1, 4.5]

  • Need 3 filters minimum to half the overall pollution.

N is an integer within the range [1....30,000]. Each element in the list is an integer within the range [0....70,000]

The solution I came up with for this was simple: Find the max in the list and half in every time until the sum is <=target

def solution(A):
    total = sum(A)
    target = total/2
    count = 0
    while total>target:
        count+=1
        max_p = max(A)
        total-= max_p/2
        A.remove(max_p)
        A.append(max_p/2)
    return count

This works well, except that the time complexity seems to be O(N^2). Can someone please suggest an approach to solve this with less time complexity (preferably O(N))?


Solution

  • Maybe you could utilize a max heap to retrieve the worst factory more efficiently than you are right now, i.e., using a heap would allow for an O(N log N) solution:

    import heapq
    
    
    def filters_required(factories: list[int]) -> int:
        """Returns minimum filters required to halve pollution."""
        current_pollution = sum(factories)
        goal_pollution = current_pollution / 2
        filters = 0
        factory_pollution_max_heap = [-p for p in factories]
        heapq.heapify(factory_pollution_max_heap)
        while current_pollution > goal_pollution:
            worst_factory = heapq.heappop(factory_pollution_max_heap)
            pollution = worst_factory / 2
            current_pollution += pollution  # Use += since pollution will be a negative number.
            heapq.heappush(factory_pollution_max_heap, pollution)
            print('DEBUG:', [-p for p in factory_pollution_max_heap], current_pollution)
            filters += 1
        return filters
    
    
    def main() -> None:
        print(f'{filters_required(factories=[3, 5, 6, 1, 18]) = }')
    
    
    if __name__ == '__main__':
        main()
    

    Output:

    DEBUG: [9.0, 6, 3, 1, 5] 24.0
    DEBUG: [6, 5, 3, 1, 4.5] 19.5
    DEBUG: [5, 4.5, 3, 1, 3.0] 16.5
    filters_required(factories=[3, 5, 6, 1, 18]) = 3