calgorithmdayofweekcorrectness

Correctness of Sakamoto's algorithm to find the day of week


I am using Sakamoto's algorithm to find out the day of week from a given date. Can anybody tell me the correctness of this algorithm? I just want this from 2000 to 2099.

The algorithm from Wikipedia is given for reference.

int dow(int y, int m, int d)
{
   static int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
   y -= m < 3;
   return (y + y/4 - y/100 + y/400 + t[m-1] + d) % 7;
}

Solution

  • Well, you can tell just by looking at it that it is correct... Assuming that the t[] array is correct, which you can verify with just 12 spot checks (one for each month using any day/year).

    The y -= m < 3 is a nice trick. It creates a "virtual year" that starts on March 1 and ends on February 28 (or 29), putting the extra day (if any) at the end of the year; or rather, at the end of the previous year. So for example, virtual year 2011 began on Mar 1 and will end on February 29, while virtual year 2012 will begin on March 1 and end on the following February 28.

    By putting the added day for leap years at the end of the virtual year, the rest of the expression is massively simplified.

    Let's look at the sum:

    (y + y/4 - y/100 + y/400 + t[m-1] + d) % 7
    

    There are 365 days in a normal year. That is 52 weeks plus 1 day. So the day of the week shifts by one day per year, in general. That is what the y term is contributing; it adds one to the day for each year.

    But every four years is a leap year. Those contribute an extra day every four years. Thanks to the use of virtual years, we can just add y/4 to the sum to count how many leap days happen in y years. (Note that this formula assumes integer division rounds down.)

    But that is not quite right, because every 100 years is not a leap year. So we have to subtract off y/100.

    Except that every 400 years is a leap year again. So we have to add y/400.

    Finally we just add the day of the month d and an offset from a table that depends on the month (because the month boundaries within the year are fairly arbitrary).

    Then take the whole thing mod 7 since that is how long a week is.

    (If weeks were eight days, for example, what would change in this formula? Well, it would be mod 8, obviously. Also the y would need to be 5*y, because 365 % 8 == 5. Also the month table t[] would need adjusting. That's it.)

    Incidentally, Wikipedia's statement that the calendar is "good until 9999" is totally arbitrary. This formula is good for however long we stick with the Gregorian calendar, whether that is 10 years, 100 years, 1000 years, or 1 million years.

    [edit]

    The above argument is essentially a proof by induction. That is, assuming that the formula works for a particular (y,m,d), you prove that it works for (y+1,m,d) and (y,m,d+1). (Where y is a "virtual year" starting March 1.) So the key question is, does the sum change by the correct amount as you move from one year to the next? With knowledge of the leap year rules, and with the "virtual year" having the extra day at year end, it trivially does.