c++algorithmheap

How to create an auxiliary data structure to keep track of heap indices in a minheap for the decrease_key operation in c++


I think this is probably a trivial problem to solve but I have been struggling with this for past few days.

I have the following vector: v = [7,3,16,4,2,1]. I was able to implement with some help from google simple minheap algorithm to get the smallest element in each iteration. After extraction of the minimum element, I need to decrease the values of some of the elements and then bubble them up.

The issue I am having is that I want find the elements whose value has to be reduced in the heap in constant time, then reduce that value and then bubble it up.

After the heapify operation, the heap_vector v_h looks like this: v_h = [1,2,7,4,3,16]. When I remove the min element 1, then the heap vector becomes, [2,3,7,4,16]. But before we do the swap and bubble up, say I want to change the values of 7 to 4, 16 to 4 and 4 to 3.5 . But I am not sure where they will be in the heap. The indices of values of the elements that have to be decreased will be given with respect to the original vector v. I figured out that I need to have an auxiliary data structure that can keep track of the heap indices in relation to the original order of the elements (the heap index vector should look like h_iv = [2,4,5,3,1,0] after all the elements have been inserted into the minheap. And whenever an element is deleted from the minheap, the heap_index should be -1. I created a vector to try to update the heap indices whenever there is a change but I am unable to do it.

I am pasting my work here and also at https://onlinegdb.com/SJR4LqQO4 Some of the work I had tried is commented out. I am unable to map the heap indices when there is a swap in the bubble up or bubble down operations. I will be very grateful to anyone who can lead me in a direction to solve my problem. Please also let me know if I have to rethink some of my logic.

The .hpp file

#ifndef minheap_hpp
#define minheap_hpp

#include <stdio.h>
// #include "helper.h"
#include <vector>

class minheap
{
public:
    std::vector<int> vect;
    std::vector<int> heap_index;
    void bubble_down(int index);
    void bubble_up(int index);
    void Heapify();

public:
    minheap(const std::vector<int>& input_vector);
    minheap();

    void insert(int value);
    int  get_min();
    void delete_min();
    void print_heap_vector();
};


#endif /* minheap_hpp */

The .cpp file

#include "minheap.hpp"

minheap::minheap(const std::vector<int>& input_vector) : vect(input_vector)
{
    Heapify();
}

void minheap::Heapify()
{
    int length = static_cast<int>(vect.size());
//    auto start = 0;
//    for (auto i = 0; i < vect.size(); i++){
//        heap_index.push_back(start);
//        start++;
//    }
    for(int i=length/2-1; i>=0; --i)
    {
        bubble_down(i);
    }
}


void minheap::bubble_down(int index)
{
    int length = static_cast<int>(vect.size());
    int leftChildIndex = 2*index + 1;
    int rightChildIndex = 2*index + 2;

    if(leftChildIndex >= length){
        return;
    }

    int minIndex = index;

    if(vect[index] > vect[leftChildIndex])
    {
        minIndex = leftChildIndex;
    }

    if((rightChildIndex < length) && (vect[minIndex] > vect[rightChildIndex]))
    {
        minIndex = rightChildIndex;
    }

    if(minIndex != index)
    {
        std::swap(vect[index], vect[minIndex]);
//        std::cout << "swap " << index << " - " << minIndex << "\n";
//        auto a = heap_index[heap_index[index]];
//        auto b = heap_index[heap_index[minIndex]];
//        heap_index[a] = b;
//        heap_index[b] = a;
//        print_vector(heap_index);
        bubble_down(minIndex);
    }
}


void minheap::bubble_up(int index)
{
    if(index == 0)
        return;

    int par_index = (index-1)/2;

    if(vect[par_index] > vect[index])
    {
        std::swap(vect[index], vect[par_index]);
        bubble_up(par_index);
    }
}

void minheap::insert(int value)
{
    int length = static_cast<int>(vect.size());
    vect.push_back(value);
    bubble_up(length);
}

int minheap::get_min()
{
    return vect[0];
}

void minheap::delete_min()
{
    int length = static_cast<int>(vect.size());

    if(length == 0)
    {
        return;
    }
    vect[0] = vect[length-1];
    vect.pop_back();
    bubble_down(0);
}


void minheap::print_heap_vector(){
    // print_vector(vect);
}

and the main file

#include <iostream>
#include <iostream>
#include "minheap.hpp"


int main(int argc, const char * argv[]) {


    std::vector<int> vec {7, 3, 16, 4, 2, 1};

    minheap mh(vec);

    // mh.print_heap_vector();

    for(int i=0; i<3; ++i)
    {
        auto a = mh.get_min();
        mh.delete_min();
        // mh.print_heap_vector();
        std::cout << a << "\n";

    }
//    std::cout << "\n";


    return 0;
}

Solution

  • "I want to change the values of 7 to 4, 16 to 4 and 4 to 3.5 . But I am not sure where they will be in the heap. The indices of values of the elements that have to be decreased will be given with respect to the original vector v. ... Please also let me know if I have to rethink some of my logic."

    Rather than manipulate the values inside the heap, I would suggest keeping the values that need changing inside a vector (possibly v itself). The heap could be based on elements that are a struct (or class) that holds an index into the corresponding position in the vector with the values, rather than hold the (changing) value itself.

    The struct (or class) would implement an operator< function that compares the values retrieved from the two vector locations for the respective index values. So, instead of storing the comparison value in the heap elements and comparing a < b, you would store index positions i and j and so on and compare v[i] < v[j] for the purpose of heap ordering.

    In this way, the positions of the numerical values you need to update will never change from their original positions. The position information will never go stale (as I understand it from your description).

    Of course, when you make changes to those stored values in the vector, that could easily invalidate any ordering that might have existed in the heap itself. As I understand your description, that much was necessarily true in any case. Therefore, depending on how you change the values, you might need to do a fresh make_heap to restore proper heap ordering. (That isn't clear, since it depends on whether your intended changes violate heap assumptions, but it would be a safe thing to assume unless there are strong assurances otherwise.)

    I think the rest is pretty straight forward. You can still operate the heap as you intended before. For ease you might even give the struct (or class) a lookup function to return the current value at it's corresponding position in the vector, if you need that (rather than the index) as you pop out minimum values.

    p.s. Here is a variation on the same idea.

    In the original version above, one would likely need to also store a pointer to the location of the vector that held the vector of values, possibly as a shared static pointer of that struct (or class) so that all the members could dereference the pointer to that vector in combination with the index values to look up the particular member associated with that element.

    If you prefer, instead of storing that shared vector pointer and an index in each member, each struct (or class) instance could more simply store a pointer (or iterator) directly to the corresponding value's location. If the values are integers, the heap element struct's member value could be int pointer. While each pointer might be larger than an index value, this does have the advantage that it eliminates any assumption about the data structure that holds the compared values and it is even simpler/faster to dereference vs. lookup with an index into the vector. (Both are constant time.)

    One caution: In this alternate approach, the pointer values would be invalidated if you were to cause the vector's storage positions to change, e.g. by pushing in new values and expanding it in a way that forces it to reallocate it's space. I'm assuming you only need to change values, not expand the number of values after you've begun to use the heap. But if you did need to do that, that would be one reason to prefer index values, since they remain valid after expanding the vector (unlike pointers).

    p.p.s. This technique is also valuable when the objects that you want to compare in the heap are large. Rather than have the heap perform many copy operations on large objects as it reorders the positions of the heap elements, by storing only pointers (or index values) the copying is much more efficient. In fact, this makes it possible to use heaps on objects that you might not want to copy at all.

    Here is a quick idea of one version of the comparison function (with some class context now added).

    class YourHeapElementClassName
    {
    public:
    
    // constructor
    explicit YourHeapElementClassName(theTypeOfYourComparableValueOrObject & val)
    : m_valPointer(&val)
    {
    }
    
    bool operator<(const YourHeapElementClassName & other) const
    {
        return *m_valPointer < *(other.m_valPointer);
    }
    
    ...
    
    private:
    
    theTypeOfYourComparableValueOrObject * m_valPointer;
    
    }; // YourHeapElementClassName
    
    // and later instead of making a heap of int or double,
    // you make a heap of YourHeapElementClassName objects
    // that you initialize so each points to a value in v
    // by using the constructor above with each v member.
    // If you (probably) don't need to change the v values
    // through these heap objects, the member value could be
    // a pointer to a const value and the constructor could
    // have a const reference argument for the original value.
    

    If you had need to do this with different types of values or objects, the pointer approach could be implemented with a template that generalizes on the type of value or object and holds a pointer to that general type.