javaalgorithmdata-structuresheapclrs

Deleting a random element from a heap


I read a few articles which said that, in a heap, only the root element can be deleted. However, why can't we delete elements, using the below approach?

  1. Find the index of the key element you want to delete
  2. Swap this element with the last element, so the key becomes the last element
  3. Re-heapify starting at the original index of the key(which you have now swapped with the last element)
  4. Reduce size of heap by 1

So, the code would look like

static void delete(int[] a, int key)
    {
        int i = 0;
        for(i = 0;i<a.length;i++)
        {
            if(a[i] == key)
                break;
        }
        swap(a, i, a.length-1);

        max_heapify_forheapsort(a, i, a.length-2);
    }

static void max_heapify_forheapsort(int[] a, int i, int limit)
    {
        int left = (2*i)+1;
        int right = (2*i) + 2;
        int next = i;

        if(left <= limit && a[left] > a[i])
            next = left;
        if(right <= limit && a[right] > a[next] )
            next = right;

        if(next!=i)
        {
            swap(a, next, i);
            max_heapify_forheapsort(a, next, limit);
        }
    }

Solution

  • It's certainly possible to delete elements from a heap with sift-up or sift-down operations as you propose. (It's similarly possible to change the key of any item in the heap and use sift-up or -down operations to restore the heap property.)

    The tricky bit is what you stated quickly at the outset: "Find the index." A naive implementation requires O(n) to do this, and that kills heap efficiency. What people mean when they say "you can't delete" is that with a naive implementation you can't delete efficiently.

    Fortunately it's not hard to make finding the index O(1). Just maintain a "reverse map" in parallel with the heap array, e.g. a hash map from objects to heap array indices. It's easy to update this map without adding asymptotic complexity to any of the heap operations, and with it - as you point out - delete has complexity O(log n). The sift up/down dominates the map lookup to find the index.

    If you're interested in a tested implementation, here's one where the heap contains indices into another array of objects. Therefore the reverse map is implemented as an array q->locs[k], which is the index of the heap element that contains k (which is an index into the object table).

    Many problems in computer science can be solved by adding a level of indirection!

    Edit

    Since OP asked about why a sift up might be required to delete, consider the min heap

              1
        20          2 
     30    40    3     4
    

    We want to delete 30, so move 4 (the last array element) into its position:

              1
        20          2 
     4     40    3 
    

    Clearly a sift up is required.