c++performancealgorithmdata-structuressimulation

How to repeatedly insert elements into a sorted list fast


I do not have formal CS training, so bear with me.

I need to do a simulation, which can abstracted away to the following (omitting the details):

We have a list of real numbers representing the times of events. In each step, we

  1. remove the first event, and
  2. as a result of "processing" it, a few other events may get inserted into the list at a strictly later time

and repeat this many times.

Questions

What data structure / algorithm can I use to implement this as efficiently as possible? I need to increase the number of events/numbers in the list significantly. The priority is to make this as fast as possible for a long list.

Since I'm doing this in C++, what data structures are already available in the STL or boost that will make it simple to implement this?


More details:

The number of events in the list is variable, but it's guaranteed to be between n and 2*n where n is some simulation parameter. While the event times are increasing, the time-difference of the latest and earliest events is also guaranteed to be less than a constant T. Finally, I suspect that the density of events in time, while not constant, also has an upper and lower bound (i.e. all the events will never be strongly clustered around a single point in time)

Efforts so far:

As the title of the question says, I was thinking of using a sorted list of numbers. If I use a linked list for constant time insertion, then I have trouble finding the position where to insert new events in a fast (sublinear) way.

Right now I am using an approximation where I divide time into buckets, and keep track of how many event are there in each bucket. Then process the buckets one-by-one as time "passes", always adding a new bucket at the end when removing one from the front, thus keeping the number of buckets constant. This is fast, but only an approximation.


Solution

  • A min-heap might suit your needs. There's an explanation here and I think STL provides the priority_queue for you.

    Insertion time is O(log N), removal is O(log N)