I am writing an algorithm that involves taking a set of numbers and putting them into buckets. Can you help me implement these two simple methods? Let me know if I need to explain more.
// return a vector where each element represents the
// size of the range of numbers in the corresponding bucket
// buckets should be equal in size +/- 1
// doesn't matter where the bigger/smaller buckets are
vector<int> makeBuckets(int max, int numberOfBuckets);
// return which bucket n belongs in
int whichBucket(int max, int numberOfBuckets, int n);
Example output
makeBuckets(10, 3) == { 3, 3, 4 }; // bucket ranges: (0, 2), (3, 5), (6, 9)
whichBucket(10, 3, 0) == 0;
whichBucket(10, 3, 1) == 0;
whichBucket(10, 3, 2) == 0;
whichBucket(10, 3, 3) == 1;
whichBucket(10, 3, 4) == 1;
whichBucket(10, 3, 5) == 1;
whichBucket(10, 3, 6) == 2;
whichBucket(10, 3, 7) == 2;
whichBucket(10, 3, 8) == 2;
whichBucket(10, 3, 9) == 2;
Let's say you need to divide a range [0,n] into k buckets.
How to best create buckets depends on the size of n. For example, if you have a small n, you could just have an array of size n that stores the bucket index for each number. In the loop above, you would fill up that array. Then the query whichBucket() would run in O(1).
If n is large, however, this is impractical. In this case, you would do your bucket sorting completely implicitely. That means, for each incoming query, you can directly compute the corresponding bucket index using e and o.