This question is hinted at in this one, but the answer there doesn't answer this question at all, and I've found conflicting suggestions and hints scattered around.
My problem is relatively simple, but in digging into it, I'm getting a bit tripped up.
Suppose I have a string in a format like this: 2023-06-07 03:04:56 -0700
The goal is to normalize this into an epoch timestamp (time_t in C). I assumed this would be simple enough, but it seems not. The gotcha here seems to be the -0700 at the end.
It seems that strptime(3) ignores the %z modifier, possibly, maybe (again, I've seen conflicting reports as to how this is handled in different implementations). FWIW, I'm using Linux/glibc, so I care more about whether it works there than about it not being in the C standard.
Playing around with it a little, it seemed to me that strptime does ignore the timezone offset. The hour in the struct tm is simply the hour in the string; it isn't adjusted by the timezone offset at all. Supposedly that's what the non-standard tm_gmtoff member is for, but I just get a gigantic value when reading it, definitely much larger than any UTC offset in seconds, so I'm not sure what to make of that either.
As an example:
#define _XOPEN_SOURCE
#include <stdio.h>
#include <string.h>
#include <time.h>

int main()
{
    struct tm tm;
    time_t epoch;
    char buf[40];

    strcpy(buf, "2023-06-07 03:04:56 -0700");
    memset(&tm, 0, sizeof(tm));
    strptime(buf, "%Y-%m-%d %H:%M:%S %z", &tm);
    printf("Parsed datetime %s (hour %d, offset %lu)\n", buf, tm.tm_hour, tm.__tm_gmtoff);
    tm.tm_isdst = -1;
    setenv("TZ", "US/Eastern");
    epoch = mktime(&tm);
    printf("Parsed datetime -> epoch %lu\n", epoch); // 7:04AM UTC
    epoch = timegm(&tm);
    printf("Parsed datetime -> epoch %lu\n", epoch); // 3:04AM UTC
    return 0;
}
when run on https://www.onlinegdb.com/online_c_compiler, gives:
Parsed datetime 2023-06-07 03:04:56 -0700 (hour 3, offset 18446744073709526416)
Parsed datetime -> epoch 1686121496
Parsed datetime -> epoch 1686107096
Note that the -0700 offset in the string is arbitrary, and the local time zone on the system is also arbitrary. For example, -0700 is Pacific Time, but the system could be in Eastern Time, which is completely irrelevant to the problem (i.e. the local time zone should not be used in the conversion, since it's irrelevant - the offset in the string should be used instead - and importantly, the local time zone should not mess up the answer).
Above, the correct answer is 10:04AM UTC (what the string obviously should convert to). Blindly using mktime gives the wrong answer, and timegm is even more off. The problem seems to be that the offset is not taken into account here. The second answer, using timegm, would be correct if the struct tm had 7 hours added to it for the offset, or if timegm added 7 hours to the answer based on something in the struct tm, such as tm_gmtoff. But neither of those things seems to happen.
Short of writing a manual function to parse the %z in the time string and manually add this offset to the time_t, is there a better "builtin" way of doing this with standard functions? (Portability isn't super important here, as long as it works in glibc.) Given this would seem to be a very common type of conversion, I'm thinking there must be a way to do this properly without manually doing calculations, using gmtime. I thought this was what tm_gmtoff was for but it seems otherwise - am I missing something here?
A few issues ...
- __tm_gmtoff is signed, so it printed incorrectly with %lu.
- __tm_gmtoff is set correctly (e.g. -7 * 3600).
- setenv("TZ",...) does not work. It uses the local timezone set by the system (e.g. -0700 is US/Pacific(?) DST, but I got -0400, US/Eastern DST).
- timegm will ignore __tm_gmtoff (named tm_gmtoff when _GNU_SOURCE is defined) [AFAICT].
- Use timegm and apply tm_gmtoff manually to get the correct timezone.
Here is the somewhat corrected code (in stages). It may still be broken. It is important to read the comments:
//#define _XOPEN_SOURCE
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
void
sepline(const char *tag)
{
    printf("\n");
    for (int col = 1; col <= 80; ++col)
        putchar('-');
    printf("\n");
    printf("%s:\n",tag);
    printf("\n");
}
void
tmshow(const struct tm *tm,const char *tag)
{
    printf("TMX: %4.4d/%2.2d/%2.2d-%2.2d:%2.2d:%2.2d (%ld/%ld) (from %s)\n",
        tm->tm_year + 1900,tm->tm_mon + 1,tm->tm_mday,
        tm->tm_hour,tm->tm_min,tm->tm_sec,tm->tm_gmtoff,tm->tm_gmtoff / 3600,
        tag);
}
void
todshow(time_t tod,int gmtflg,const char *tag)
{
    struct tm tm;

    if (gmtflg)
        gmtime_r(&tod,&tm);
    else
        localtime_r(&tod,&tm);
    printf("\n");
    printf("TOD: %ld (from %s)\n",tod,tag);
    tmshow(&tm,tag);
}
void
orig(const char *buf)
{
    struct tm tm;

    memset(&tm, 0, sizeof(tm));
    sepline("ORIG");
    printf("BUF: %s\n",buf);
    strptime(buf, "%Y-%m-%d %H:%M:%S %z", &tm);
    printf("Parsed datetime %s (hour %d, offset %lu)\n",
        buf, tm.tm_hour, tm.tm_gmtoff);
    tm.tm_isdst = -1;
    setenv("TZ", "US/Eastern", 1);
    tmshow(&tm,"strptime");
    time_t epoch_mktime = mktime(&tm);
    printf("Parsed mktime -> epoch %lu\n", epoch_mktime); // 7:04AM UTC
    time_t epoch_timegm = timegm(&tm);
    printf("Parsed timegm -> epoch %lu\n", epoch_timegm); // 3:04AM UTC
    time_t diff = epoch_mktime - epoch_timegm;
    printf("diff = %ld (%.3f)\n",diff,diff / 3600.0);
}
void
fix1(const char *buf)
{
    struct tm tm;

    memset(&tm, 0, sizeof(tm));
    sepline("FIX1");
    printf("BUF: %s\n",buf);
    strptime(buf, "%Y-%m-%d %H:%M:%S %z", &tm);
#if 0
    printf("Parsed datetime %s (hour %d, offset %ld/%ld)\n",
        buf, tm.tm_hour, tm.tm_gmtoff, tm.tm_gmtoff / 3600);
#endif
    tm.tm_isdst = -1;
    //setenv("TZ", "US/Eastern", 1);
    unsetenv("TZ");
    tmshow(&tm,"strptime");
    time_t epoch_mktime = mktime(&tm);
    //printf("Parsed mktime -> epoch %lu\n", epoch_mktime); // 7:04AM UTC
    todshow(epoch_mktime,0,"mktime");
    time_t epoch_timegm = timegm(&tm);
    //printf("Parsed timegm -> epoch %lu\n", epoch_timegm); // 3:04AM UTC
    todshow(epoch_timegm,1,"timegm");
    time_t diff = epoch_mktime - epoch_timegm;
    printf("diff = %ld (%.3f)\n",diff,diff / 3600.0);
}
void
fix2(const char *buf)
{
    struct tm tm;

    memset(&tm, 0, sizeof(tm));
    sepline("FIX2");
    printf("BUF: %s\n",buf);
    strptime(buf, "%Y-%m-%d %H:%M:%S %z", &tm);
    tmshow(&tm,"strptime");
    // NOTE: timegm ignores this -- so remember it
    time_t offset = tm.tm_gmtoff;
    //tm.tm_gmtoff = 0;
    time_t epoch_timegm = timegm(&tm);
    todshow(epoch_timegm,1,"timegm");
    // adjust for timezone -- this produces correct GMT
    todshow(epoch_timegm - offset,1,"timegm+offset");
    // NOTE/BUG: setting TZ does _not_ work
#if 0
    time_t epoch_mktime = epoch_timegm;
    epoch_mktime -= offset;
    setenv("TZ", "US/Pacific", 1);
    localtime_r(&epoch_mktime,&tm);
#endif
#if 1
    time_t epoch_mktime = epoch_timegm;
    //epoch_mktime += offset;
    //epoch_mktime += offset;
    gmtime_r(&epoch_mktime,&tm);
    tm.tm_gmtoff += offset;
    //tm.tm_gmtoff += offset;
#endif
    //printf("Parsed mktime -> epoch %lu\n", epoch_mktime); // 7:04AM UTC
    tmshow(&tm,"localtime_r");
    time_t diff = epoch_mktime - epoch_timegm;
    printf("diff = %ld (%.3f)\n",diff,diff / 3600.0);
}
int
main()
{
    char buf[40];

    // Pacific time???
    strcpy(buf, "2023-06-07 03:04:56 -0700");
    orig(buf);
    fix1(buf);
    fix2(buf);
    return 0;
}
Here is the program output:
--------------------------------------------------------------------------------
ORIG:
BUF: 2023-06-07 03:04:56 -0700
Parsed datetime 2023-06-07 03:04:56 -0700 (hour 3, offset 18446744073709526416)
TMX: 2023/06/07-03:04:56 (-25200/-7) (from strptime)
Parsed mktime -> epoch 1686121496
Parsed timegm -> epoch 1686107096
diff = 14400 (4.000)
--------------------------------------------------------------------------------
FIX1:
BUF: 2023-06-07 03:04:56 -0700
TMX: 2023/06/07-03:04:56 (-25200/-7) (from strptime)
TOD: 1686121496 (from mktime)
TMX: 2023/06/07-03:04:56 (-14400/-4) (from mktime)
TOD: 1686107096 (from timegm)
TMX: 2023/06/07-03:04:56 (0/0) (from timegm)
diff = 14400 (4.000)
--------------------------------------------------------------------------------
FIX2:
BUF: 2023-06-07 03:04:56 -0700
TMX: 2023/06/07-03:04:56 (-25200/-7) (from strptime)
TOD: 1686107096 (from timegm)
TMX: 2023/06/07-03:04:56 (0/0) (from timegm)
TOD: 1686132296 (from timegm+offset)
TMX: 2023/06/07-10:04:56 (0/0) (from timegm+offset)
TMX: 2023/06/07-03:04:56 (-25200/-7) (from localtime_r)
diff = 0 (0.000)