inputsasinvalid-characters

Reading in data starting with a single quotation mark SAS


I'm having an issue with reading in data from a pipe delimited txt file. There are 3 records with these values for address_1 "'S example,". All of these records are followed by a few blank fields. address_1 ends up being "S example |||" for some reason SAS eats the "'" and then ends up ignoring the pipe delimiter. Does anyone know how to properly read in values which begin with a single quotation mark?

This is what I use for reading in the file

 data test;
attrib
        ID     length= $16
        ADDRESS_1        length= $50
        ADDRESS_2        length= $50
        ADDRESS_3        length= $50
        EMAIL_ADDRESS    length= $60;
      INFILE "&directory./Data/Example_Data.txt"
        dlm="|"   RECFM=v lrecl=32767 dsd firstobs=2 missover end=eof;


    input 
        ID     $
        ADDRESS_1        : $CHAR50.
        ADDRESS_2        : $CHAR50.
        ADDRESS_3        : $CHAR50.
        EMAIL_ADDRESS    $
;
run;

Here's some example data just copy and paste into a notepad

ID|Address_1|Address_2|Address_3|Email
1234|'S Road,|||email@GMAIL.COM
2534|'S Road,|||email2@GMAIL.COM

Annoyingly when i try this with example data it works but when I use my actual data the ' disappears and the fields get concatenated into 1.

Here's the code generated by SAS EG which reads my original data file in just fine

data example;
Length 
     Address_1 $40;
Format 
    ADDRESS_1        $CHAR40.;
Informat 
        ADDRESS_1        $CHAR40.;

infile "test_data.txt"
        LRECL=226
        ENCODING="LATIN1"
        TERMSTR=CRLF
        DLM='7F'x
        MISSOVER
        DSD ;

Input
        ADDRESS_1        : $CHAR40.;
run;

Solution

  • Show your data lines.

    Values containing delimiters or quotes should be quoted. So if the value of address_1 is 'S example, then in the delimited file it should be represented as

    "'S example,"
    

    or

    '''S example,'
    

    The DSD option on the INFILE statement that recognizes adjacent delimiters as indicating a missing value will also remove the quotes from around quoted values.

    You might be hitting a similar bug as in this report at SAS Communities. https://communities.sas.com/t5/Base-SAS-Programming/Importing-delimited-file-with-text-qualifiers/m-p/449215#M113034

    In that case the values could be parsed from the data line using the scan() function with the m modifier without the q modifier.

    You could also try pre-processing the _INFILE_ variable to convert bare single quotes to "'". Here is a simple example.

    data test2 ;
      length var1-var5 $50 ;
      infile cards dsd dlm='|' truncover firstobs=2;
      input @;
      _infile_=tranwrd(cats('|',_infile_,'|'),"|'|","|""'""|");
      _infile_=substrn(_infile_,2,length(_infile_)-2);
      input var1-var5 ;
    cards4;
    ID|Address_1|Address_2|Address_3|Email
    2534|'S Road,||'|email2@GMAIL.COM
    ;;;;