sas

Format dates as sas dates


I'm in a really frustrating situation. I received a dataset to be analyzed. It contains dates as follows:

data DB;
  input date;
cards;
2014-11- 24 00:00:00
2018-04-17 00:00:00
2019/01/02 00:00:00.0
2014 -10-13 00:00:00
2019/10/25 00:00:00.0
;run;

Dates are in character mode. Moreover sometimes there are blanks, sometimes not. Is there a way to get dates in sas format while solving all these issues? Desired output:

data DB1;
  input date :date9.;
  format date date9.;
cards;
24NOV2014
17APR2018
02JAN2019
13OCT2014
25OCT2019
;run;

Thank you in advance.


Solution

  • You have two issues. One you have a character variable, but dates require a numeric variable. You will need to make a NEW variable that is numeric. You can always use RENAME to swap the names if you need to use the old name for the new variable.

    The other is your character strings might not all be date values. In fact most of them look like they are DATETIME values (which SAS stores as number of seconds) and not DATE values (which SAS stores as number of days).

    You could see if the ANYDT... series of informats can handle those strings.

    data DB;
      input date $30.;
    cards;
    2014-11- 24 00:00:00
    2018-04-17 00:00:00
    2019/01/02 00:00:00.0
    2014 -10-13 00:00:00
    2019/10/25 00:00:00.0
    ;
    
    data want;
      set db;
      dt = input(date,anydtdte30.);
      format dt date9.;
      rename dt=date date=date_char ;
    run;
    

    Works for most of them. But a couple are confusing to it.

    NOTE: Invalid argument to function INPUT at line 19 column 8.
    date=2014-11- 24 00:00:00 dt=. _ERROR_=1 _N_=1
    NOTE: Invalid argument to function INPUT at line 19 column 8.
    date=2014 -10-13 00:00:00 dt=. _ERROR_=1 _N_=4
    

    If those time of day strings always include '00:00' then you could try only using the part up to that point. And also remove the embedded spaces.

    data want;
      set db;
      dt = input(compress(substr(date,1,index(date,'00:00')-1)),anydtdte30.);
      format dt date9.;
      rename dt=date date=date_char ;
    run;
    

    That works for your examples.

    Since the date part does seem to be consistently in YMD order you might want to use the YYMMDD10. informat instead now that the time of day part has been removed. That will not mistakenly convert strings like '1-12-24' into a date like the ANYDT... series of informats would.