databasepostgresqltimezonetime-and-attendance

Should I store UTC timestamps or localtime for shifts


I am working on a function that will work out how many workers is on duty and clocked in (or not clocked in) for their shift.

The workers "clock in" and "out" and the time stamps of these events will be stored in UTC format since that seems to be the recommendation that comes up the most.

But the shifts start at fixed local time, so for example one shift will always start at 07:00, regardless of daylight savings, and end at for example 14:00.

The shift workers are "allocated" to shifts.

The first requirement is to be able to know how many of the workers are on duty (clocked in) for their shift "at this point in time" - A kind of status check.

The second requirement is to be able to get a report of a day in the past for example for an individual worker (was the worker on duty for the whole shift, was he late, did they leave late and require overtime to be paid, etc)

So not assuming that everything will always be in only one timezone, the idea is to store the timezone in the database along with the definition of the shifts (The shift record may contain a local start time, end time, and the local time zone "name".)

Question 1: What format is recommended for storing the local timezone? Is there a way in which I can store the start/end times of shifts that allows me to write an SQL query that can compare those times with a supplied UTC time.

Question 2: Is there another suggestion? Should I just store the clock in/out times as Local to be the same as the shift start/end times? Or should I do some magic and store everything as UTC times ... But how to work out the current Local time start time of a shift if it is stored as "2:00 UTC" .... converting that to Local Time needs to always be the same, eg 7:00 local time, regardless of DST....


Solution

  • First, understand that date-time is barely touched on by the SQL standard. Data types, definitions, and behavior vary widely across various database products.

    🐘 📅 🕙

    Fortunately for you, Postgres has exceptionally rich handling of date-time data types and date-time functions. But you must study the documentation carefully and experiment so you understand behavior.

    07:00, regardless of daylight savings

    Actually, that would not be regardless but respecting Daylight Saving Time (DST). DST is changing the meaning of 7 AM by sliding it later or earlier by an hour. Respecting DST correctly is the heart of your problem.

    For the sake of other readers, a word about terminology: The “localtime” word in the context of the Question means a date and/or a time-of-day without regard for time zone or offset-from-UTC.

    As such, a local-date-time does not represent a point on the timeline but rather a rough idea of possible points. In Auckland NZ, 7 AM arrives much earlier than in Paris FR. And in turn, 7 AM in Paris FR happens much sooner than 7 AM in Montréal CA. So without a time zone, a “local” value has no meaning. In Postgres, these “local” types are:

    An actual moment on the timeline requires a zone or offset. Let’s call these “zoned” types, for lack of a better word. In Postgres that would be:

    Very important to understand that Postgres never saves any time zone information. Despite the “with time zone” name, the zone is not saved as part of your data. Instead, “with time zone” means “with respect for time zone” in that Postgres adjusts the input value to UTC.

    Question 1: What format is recommended for storing the local timezone?

    So, depending on your business needs and rules, you may want to separately record the zone/offset separately, in addition to the date-time value. Neither Postgres nor the SQL standard specify a data type for zone or offset. So I suggest storing as text.

    More terminology: An offset-from-UTC is a number of hours and minutes and seconds ahead of, or behind, UTC. A time zone is an offset plus a set of rules for handling anomalies such as Daylight Saving Time (DST). So always better to use a time zone when you know it.

    To answer the Question:
    You need both “local” and “zoned” types for your solution.

    To record the definition of a shift, you want a “local” time-of-day, the time without time zone. When you record that shift should normally start at 7 AM, you do not want Postgres to alter or adjust that value.

    To record the concept of a particular shift, you would record (a) the start time as time without time zone, (b) the date as date. By applying an offset or zone, you can determine a UTC value. You may want to also/instead record the UTC value itself. But beware of doing so far into the future, as politicians everywhere are quite fond of redefining time zones with little advance notice.

    To record the moment a work actually clocked-in, you want the UTC moment, the timestamp with time zone type. You can input a zoned value if need be, and Postgres will adjust into UTC. As mentioned in the Question, when working with actual moments on the timeline, it is almost always best to work and store data in UTC.

    For both both of the types, you may or may not wish to additionally record the intended time zone as well.

    A Postgres session has a default time zone. I suggest you never rely on that. Better to always specify the intended time zone in your SQL code and/or your input data. When manually perusing data, you may find it handy to set the session default to UTC or to a zone, but I would not do that in code.

    Is there a way in which I can store the start/end times of shifts that allows me to write an SQL query that can compare those times with a supplied UTC time.

    As mentioned above, you would record the general concept of a shift as a “local”. As in, “In 2015 the Dusseldorf and Detroit factories started at 6 AM while the Delhi factor started at 7 AM, but in 2016 all three factories start at 7 AM”. To record any one actual shift, record in UTC though you may also want to record the “local” values for readability by humans.

    Should I just store the clock in/out times as Local to be the same as the shift start/end times?

    No, no, certainly not. Any actual points on the time line, real moments, should be recorded in UTC. Use timestamp with time zone and let Postgres adjust inputs to UTC as needed. Though generally I suggest your app’s programming be in UTC beforehand.

    Or should I do some magic and store everything as UTC times

    No magic needed, just consistent handling of your data to always include time zone (or offset) data and adjust into UTC. Get this straight in your app programming as well as the database. For example, in Java, hand off objects rather than mere strings for date-time values. With JDBC 4.2 the database can exchange OffsetDateTime (and, optionally, Instant or ZonedDateTime) objects. (Avoid the troublesome java.util.Date & .Calendar classes, now legacy.)

    Table of date-time types in Java (both legacy and modern) and in standard SQL

    how to work out the current Local time start time of a shift if it is stored as "2:00 UTC"

    If you have a date and a time-of-day in UTC, you can always apply a time zone (or offset) to see the “local” value.

    Date-time handling is tricky slippery stuff. So think it through, give yourself time to learn, and practice, practice, practice. Tips: (a) Learn 24-hour time, (b) When at work programming, think in UTC, keep a UTC clock on your desk, and forget about your own personal local time zone. Thinking primarily in your own local time zone, and constantly translating back-and-forth with UTC, will drive you crazy and lead to errors.