I am trying to solve a problem that is defined as follows:
A candy is represented by a number from 1 to 1e6. A person is said to have a set of k candies if he has candy 1 to k.
E.g. If a person bought candies 1, 2 and 6, then he has a set of 2 candies
Given 2 types of operations: Eat x and Buy x where x represents the candy number. Buy x will increase the quantity of x by 1 only. Eat x will decrease the quantity of x by 1 only.
For each operation, answer the question, what is the size of the set of candies that I have now?
I am trying to find the most efficient way to do this. The solution that I have thought of is described below:
Let count[i] define the array of size 1 - N where N is the greatest possible candy number. count[i] stores the number of candies with number i that I have so far.
Let Fenwick[i] array of size 1 - N where N is the greatest possible candy number. This array is for constructing a fenwick tree to store the cumulative sums of candies in my collection. This cumulative sum does not use the count array. The cumulative sum counts quantities of 1s (each 1 indicates that a candy x is present in my collection). e.g. if I have a set of 5 candies, then the cumulative sum from 1 to 5 is 5. if there is a set of 10 candies, then the cumulative sum from 1 to 10 is 10...
For a buy operation, if candy x was not already in my collection, add 1 to cumulative sum starting from index x (this is handled by the fenwick tree). Else, I will just execute count[x]++
For a eat operation, execute count[x]--. If count[x] is 0 now, then I decrement 1 from cumulative sum starting from index x (this is handled by the fenwick tree).
Now that settles the portion of insertion and deletion. The difficult part is how to get the size of the set of candies that is in the collection currently.
I tried to query the Fenwick tree for the largest index, i, for which the cumulative sum from 1 to i is equal to i while incrementing my query index in powers of 2 every time.
I take the largest index that is a valid set of candies, j, and the smallest index that is an invalid collection of candies, k. Then loop from from j to k, querying the fenwick tree on each iteration. Once the loop hits an invalid collection, break and output the answer.
It seems to me that this would work. However, this is certainly not an efficient method. Is anyone able to enlighten me of a better solution? Thanks in advance.
edit (solution):
My approach for insertion and deletion was correct. Just that I was searching for the collection of candies in an incorrect way. In this case, we want the largest number x, where query(x) = x (query(x) gives the cumulative sum from 1 to x). So we can use binary search to find the largest valid value of x (query(x) = x). To achieve this, we just need to keep an additional variable to track the last value of x where query(x) gives a valid collection.
Complexity of solution: O(log^2(N))
This is typically a binary tree structure.
For simplicity, let us say that the indices of candies range from 0
to 2^k - 1
for some integer k
. Then at every moment the status is representated by 16
numbers, c[0]
to c[2^k - 1]
, where c[i]
is the number of candy i
.
We construct a binary tree as follows: the root node P(0, 2^k)
represents the whole interval [0, 2^k)
; For every node P(a, b)
such that b - a > 1
, construct two subnodes P(a, (a + b)/2)
and P((a + b)/2, b)
.
Let p(a, b)
be the minimum of c[i]
for i
in the interval [a, b)
. Apparently we have:
p(a, a + 1) = c[a]
;
p(a, b) = min{p(a, (a + b)/2), p((a + b)/2, b)}
if b - a > 1
.
Having constructed this data structure, for every operation (plus one or minus one) we can update the data in O(k)
steps, from bottom to top. Moreover, finding the size of the set of candies can also be done in O(k)
steps.
Example of the data structure:
Let us look at the case k = 3
, so that there are c[0]
to c[7]
. For exampe:
c[0 .. 7] = {1, 3, 0, 4, 3, 2, 8, 1}
The tree structure then looks like this:
p(0, 8) = 0
|- p(0, 4) = 0
| |- p(0, 2) = 1
| | |- p(0, 1) = 1
| | |_ p(1, 2) = 3
| |_ p(2, 4) = 0
| |- p(2, 3) = 0
| |_ p(3, 4) = 4
|_ p(4, 8) = 1
|- p(4, 6) = 2
| |- p(4, 5) = 3
| |_ p(5, 6) = 2
|_ p(6, 8) = 1
|- p(6, 7) = 8
|_ p(7, 8) = 1
Now let's say that we add 1
to the number c[2]
, which beomes 1
, then we only need to update the numbers p(2, 3)
, p(2, 4)
, p(0, 4)
, p(0, 8)
.