javaalgorithmmath

group date ranges into buckets


I have a problem where I have a date range and I want to split the date range into chunks (quantity provided by user). Each chunk must be a continuous range of whole months. The longest chunk must be equal to or one month longer than the shortest.

Date ranges are whole months, as well:

You can assume that The input range will be large enough that each chunk can have at least one whole month.

For example, consider the trivial case of a date range 1/1/2000 through 8/31/2000, 8 chunks requested. Then each chunk would get a full month.

An easier way to think of this problem is as follows Consider a list of numbers from 1-15 and we want to split them into 8 chunks possible combinations are

(1),(2),(3),(4),(5),(6),(7),(8,9,10,11,12,13,14,15) -> satisfies only one constraints of using up all the chunks
(1,9),(2,10), (3,11), (4,12), (5,13), (6,14), (7,15), (8) ---> satisfies only 1 constraint of minimizing the difference between maximum number and minimum numbers in a chunk.

(1,2), (3,4), (5,6), (7,8) (9,10), (11,12) (13,14), 15  ---> correct

I have considered joda time as the date library.

This is not a homework problem. I am trying to parallelize a query which takes date ranges as an input. The chunks are meant to be cores, and I want to run the query for subsequent date ranges across a core.


Solution

  • This isn't a hard problem. There's a simple construction to do the job. First of all, note that you don't care about days, merely full months.

    With these parameters, the allocation is simple: the first extra chunks get an extra month each, and will be of size min_size+1. The remaining chunk-extra chunks get min_size months each.

    For instance, let's take a range 1/1/2017 - 5/31/2018, and 4 chunks.

    m_range = 12*(2018-2017) + (5-1) + 1 = 12 + 4 + 1 = 17
    min_size = floor(m_range/chunk) = floor(17/4) = 4
    extra = m_range mod chunk = 1
    

    So, you give the first chunk 5 months; the other 3 chunks get 4 months each.

    The individual date manipulations are left as an exercise for the student. :-)