I have a problem where I have a date range and I want to split the date range into chunks (quantity provided by user). Each chunk must be a continuous range of whole months. The longest chunk must be equal to or one month longer than the shortest.
Date ranges are whole months, as well:
You can assume that The input range will be large enough that each chunk can have at least one whole month.
For example, consider the trivial case of a date range 1/1/2000 through 8/31/2000, 8 chunks requested. Then each chunk would get a full month.
An easier way to think of this problem is as follows Consider a list of numbers from 1-15 and we want to split them into 8 chunks possible combinations are
(1),(2),(3),(4),(5),(6),(7),(8,9,10,11,12,13,14,15) -> satisfies only one constraints of using up all the chunks
(1,9),(2,10), (3,11), (4,12), (5,13), (6,14), (7,15), (8) ---> satisfies only 1 constraint of minimizing the difference between maximum number and minimum numbers in a chunk.
(1,2), (3,4), (5,6), (7,8) (9,10), (11,12) (13,14), 15 ---> correct
I have considered joda time as the date library.
This is not a homework problem. I am trying to parallelize a query which takes date ranges as an input. The chunks are meant to be cores, and I want to run the query for subsequent date ranges across a core.
This isn't a hard problem. There's a simple construction to do the job. First of all, note that you don't care about days, merely full months.
m1/d1/y1
(start) and m2/d2/y2
(end). m_range = 12*(y2-y1) + m2-m1 + 1
.chunk
. If m_range < chunk
, you have invalid input.min_size = floor(m_range/chunk)
(truncated division). The max size is one more.extra = m_range mod chunk
.With these parameters, the allocation is simple: the first extra
chunks get an extra month each, and will be of size min_size+1
. The remaining chunk-extra
chunks get min_size
months each.
For instance, let's take a range 1/1/2017 - 5/31/2018, and 4 chunks.
m_range = 12*(2018-2017) + (5-1) + 1 = 12 + 4 + 1 = 17
min_size = floor(m_range/chunk) = floor(17/4) = 4
extra = m_range mod chunk = 1
So, you give the first chunk 5 months; the other 3 chunks get 4 months each.
The individual date manipulations are left as an exercise for the student. :-)