r

Reading C style date and time from a binary file using R


I am trying to use R to read "C style" date and time data from a binary file.

Apparently the date is stored in a 2-byte WORD ("directory style"), while the time is stored in a 4-byte time_t variable.

I have code in C# that can perform this conversion as follows:

...

    internal struct Systime
    {
        internal UInt16 Year;
        internal UInt16 Month;
        internal UInt16 DayOfWeek;
        internal UInt16 Day;
        internal UInt16 Hour;
        internal UInt16 Minute;
        internal UInt16 Second;
        internal UInt16 Milliseconds;
    }

    public struct FileTime
    {
        internal UInt32 dwLowDateTime;
        internal UInt32 dwHighDateTime;
    }

    [DllImport("Kernel32.dll", CharSet = CharSet.Ansi)]
    private static extern bool DosDateTimeToFileTime(UInt16 wFatDate, UInt16 wFatTime, ref FileTime lpFileTime);

    [DllImport("Kernel32.dll", CharSet = CharSet.Ansi)]
    private static extern bool FileTimeToSystemTime(ref FileTime lpFileTime, ref Systime lpSystemTime);



    public static DateTime FromMSDosDate(UInt16 toConvert)
    {
        FileTime fileTime = new FileTime();
        Systime systemTime = new Systime();

        DosDateTimeToFileTime(toConvert, 0, ref fileTime);
        FileTimeToSystemTime(ref fileTime, ref systemTime);

        return new DateTime(systemTime.Year, systemTime.Month, systemTime.Day, 0, 0, 0);
    }

...

When I read a 2 byte integer date of 22775, the FromMSDosDate() function above returns 2024-07-23.

How would I achieve this in R?

Should I be using the Rcpp package to handle it in C/C++? I would need help with that. There is also dyn.load(), but that seems confusing when passing pointers and values by reference, as the API calls I need require. I have briefly looked into the R.oo package to enable passing by reference, but all of this seems like overkill for what should be a simple task.

Extracting the time_t value is much simpler: it is just a matter of adding the four-byte integer, as seconds, to midnight on 1970-01-01, so that part is not hard to do in R.

Many thanks in advance for any guidance on the best approach to take here.


Solution

  • I find it overkill but since you mention C++, something like this?

    Write a file DOSDateTime.cpp with this code:

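    // filename: DOSDateTime.cpp
    #include <Rcpp.h>
    #include <algorithm>  // std::transform
    
    // DOS date word: bits 9-15 = years since 1980, bits 5-8 = month, bits 0-4 = day
    #define dos_year(x)   ((((x) >> 9) & 0x7F) + 1980)
    #define dos_month(x)  (((x) >> 5) & 0x0F)
    #define dos_day(x)    ((x) & 0x1F)
    
    // DOS time word: bits 11-15 = hours, bits 5-10 = minutes, bits 0-4 = seconds / 2.
    // Caution: dos_secs() halves the 5-bit field; since the format stores seconds / 2,
    // the actual seconds are the field times 2.
    #define dos_secs(x)   (((x) & 0x1F) >> 1)
    #define dos_mins(x)   (((x) >> 5) & 0x3F)
    #define dos_hours(x)  ((x) >> 11)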
    
    Rcpp::IntegerVector dos_date_time(const R_len_t x) {
        Rcpp::IntegerVector dt(6);
        dt(0) = dos_year(x);
        dt(1) = dos_month(x);
        dt(2) = dos_day(x);
        dt(3) = dos_hours(x);
        dt(4) = dos_mins(x);
        dt(5) = dos_secs(x);
        return dt;
    }
    
    Rcpp::Date dos_date(const R_len_t x) {
        Rcpp::IntegerVector d(3);
        d(0) = dos_year(x);
        d(1) = dos_month(x);
        d(2) = dos_day(x);
        // Rcpp::Date format is month, day, year
        Rcpp::Date date = Rcpp::Date(d(1), d(2), d(0));
        return date;
    }
    
    // [[Rcpp::export]]
    Rcpp::DateVector DOSDate(const Rcpp::IntegerVector x) {
        Rcpp::DateVector dates(x.size());
        std::transform(x.begin(), x.end(), dates.begin(), dos_date);
        return dates;
    }
    
    // [[Rcpp::export]]
    Rcpp::IntegerMatrix DOSDateTime(const Rcpp::IntegerVector x) {
        Rcpp::IntegerMatrix y(x.size(), 6);
        for(R_xlen_t i = 0; i < x.size(); ++i)
            y(i, Rcpp::_) = dos_date_time(x(i));
        return y;
    }
    

    Now the R code.
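    The R function calls the base function ISOdatetime() to convert the year/month/day/hours/mins/secs columns to R's "POSIXct" class. This is because, according to this post, it is probably better to work with NumericVector than with Datetime class objects. Note that in the DOS format the date and the time are separate 16-bit words (wFatDate and wFatTime in the API call from the question); the example applies both decodings to the same values only to show the mechanics.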

    library(Rcpp)
    
    sourceCpp("DOSDateTime.cpp")
    
    DOSdatetime <- function(x) {
      y <- DOSDateTime(x)
      ISOdatetime(y[, 1L], y[, 2L], y[, 3L], y[, 4L], y[, 5L], y[, 6L])
    }
    
    # 22775 gives 2024-07-23
    # 22727 gives 2024-06-07
    # 22841 gives 2024-09-25 
    # 22886 gives 2024-11-06 
    # 19307 gives 2017-11-11
    x <- c(22775, 22727, 22841, 22886, 19307)
    
    DOSDate(x)
    #> [1] "2024-07-23" "2024-06-07" "2024-09-25" "2024-11-06" "2017-11-11"
    DOSdatetime(x)
    #> [1] "2024-07-23 11:07:11 WEST" "2024-06-07 11:06:03 WEST"
    #> [3] "2024-09-25 11:09:12 WEST" "2024-11-06 11:11:03 WET" 
    #> [5] "2017-11-11 09:27:05 WET"
    

    Created on 2025-01-14 with reprex v2.1.1
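
To make the bit layout concrete, here is the decoding of 22775 written out with plain integer arithmetic instead of shifts and masks. This is only an illustrative sketch, and the variable names are made up for the example. A DOS date word packs the year offset from 1980 into bits 9-15, the month into bits 5-8, and the day into bits 0-4.

```r
x <- 22775L                    # binary: 0101100 0111 10111

year  <- x %/% 512L + 1980L    # top 7 bits: 44 + 1980
month <- (x %/% 32L) %% 16L    # middle 4 bits: 7
day   <- x %% 32L              # low 5 bits: 23

c(year = year, month = month, day = day)
```

This gives 2024, 7 and 23, which agrees with the 2024-07-23 reported in the question.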


    Edit

    Following Konrad Rudolph's comment, here are base R solutions.

    DOS_date <- function(x) {
      yr <- bitwAnd(bitwShiftR(x, 9), 0x7f) + 1980
      mth <- bitwAnd(bitwShiftR(x, 5), 0x0f)
      day <- bitwAnd(x, 0x1f)
      ISOdate(yr, mth, day) |> as.Date()
    }
    DOS_datetime <- function(x) {
      yr <- bitwAnd(bitwShiftR(x, 9), 0x7f) + 1980
      mth <- bitwAnd(bitwShiftR(x, 5), 0x0f)
      day <- bitwAnd(x, 0x1f)
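      # Time fields: these are only meaningful when x is a DOS time word
      hr <- bitwShiftR(x, 11)
      mn <- bitwAnd(bitwShiftR(x, 5), 0x3f)
      # Caution: the format stores seconds / 2 in these 5 bits, so the actual
      # seconds are the field times 2; this line halves the field instead
      sc <- bitwShiftR(bitwAnd(x, 0x1f), 1)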
      ISOdatetime(yr, mth, day, hr, mn, sc)
    }
    
    x <- c(22775, 22727, 22841, 22886, 19307)
    DOS_date(x)
    #> [1] "2024-07-23" "2024-06-07" "2024-09-25" "2024-11-06" "2017-11-11"
    DOS_datetime(x)
    #> [1] "2024-07-23 11:07:11 WEST" "2024-06-07 11:06:03 WEST"
    #> [3] "2024-09-25 11:09:12 WEST" "2024-11-06 11:11:03 WET" 
    #> [5] "2017-11-11 09:27:05 WET"
    

    Created on 2025-01-13 with reprex v2.1.1
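
For the original problem of reading these values from a binary file, the same decoding can be combined with readBin(). The sketch below is an illustration under stated assumptions, not a drop-in solution: it assumes little-endian data and a made-up record layout (2-byte date word, 2-byte time word, 4-byte time_t counted in seconds from 1970-01-01 UTC). It writes one such record to a temporary file and reads it back. The helper names decode_dos_date() and decode_dos_time() are hypothetical. Because the DOS time word stores seconds divided by 2 in its low 5 bits, that field is doubled here.

```r
# Hypothetical helpers; names and record layout are assumptions for this sketch.
decode_dos_date <- function(w) {
  as.Date(sprintf("%04d-%02d-%02d",
                  bitwAnd(bitwShiftR(w, 9L), 0x7fL) + 1980L,  # bits 9-15: years since 1980
                  bitwAnd(bitwShiftR(w, 5L), 0x0fL),          # bits 5-8: month
                  bitwAnd(w, 0x1fL)),                         # bits 0-4: day
          format = "%Y-%m-%d")                                # invalid dates become NA
}
decode_dos_time <- function(w) {
  sprintf("%02d:%02d:%02d",
          bitwShiftR(w, 11L),                  # bits 11-15: hours
          bitwAnd(bitwShiftR(w, 5L), 0x3fL),   # bits 5-10: minutes
          bitwAnd(w, 0x1fL) * 2L)              # bits 0-4: seconds / 2, so double it
}

# Build a demo record in a temporary file: date word, time word, then time_t.
path <- tempfile(fileext = ".bin")
con <- file(path, "wb")
writeBin(22775L, con, size = 2L, endian = "little")  # date word from the question
writeBin(24998L, con, size = 2L, endian = "little")  # time word encoding 12:13:12
writeBin(86400L, con, size = 4L, endian = "little")  # time_t: one day after the epoch
close(con)

# Read the record back.
con <- file(path, "rb")
# signed = FALSE keeps 2-byte words above 32767 (years from 2044 on) non-negative
words <- readBin(con, what = "integer", n = 2L, size = 2L,
                 signed = FALSE, endian = "little")
secs <- readBin(con, what = "integer", n = 1L, size = 4L, endian = "little")
close(con)
unlink(path)

file_date <- decode_dos_date(words[1])
file_time <- decode_dos_time(words[2])
stamp <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
```

With this demo record, file_date is 2024-07-23, file_time is "12:13:12", and stamp is 1970-01-02 00:00:00 UTC. For a real file, the offsets, byte order, and time zone of the time_t value would need to be checked against the format's documentation.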