I'm having an issue with reading in data from a pipe delimited txt file. There are 3 records with these values for address_1 "'S example,". All of these records are followed by a few blank fields. address_1 ends up being "S example |||" for some reason SAS eats the "'" and then ends up ignoring the pipe delimiter. Does anyone know how to properly read in values which begin with a single quotation mark?
This is what I use for reading in the file
data test;
attrib
ID length= $16
ADDRESS_1 length= $50
ADDRESS_2 length= $50
ADDRESS_3 length= $50
EMAIL_ADDRESS length= $60;
INFILE "&directory./Data/Example_Data.txt"
dlm="|" RECFM=v lrecl=32767 dsd firstobs=2 missover end=eof;
input
ID $
ADDRESS_1 : $CHAR50.
ADDRESS_2 : $CHAR50.
ADDRESS_3 : $CHAR50.
EMAIL_ADDRESS $
;
run;
Here's some example data just copy and paste into a notepad
ID|Address_1|Address_2|Address_3|Email
1234|'S Road,|||email@GMAIL.COM
2534|'S Road,|||email2@GMAIL.COM
Annoyingly when i try this with example data it works but when I use my actual data the ' disappears and the fields get concatenated into 1.
Here's the code generated by SAS EG which reads my original data file in just fine
data example;
Length
Address_1 $40;
Format
ADDRESS_1 $CHAR40.;
Informat
ADDRESS_1 $CHAR40.;
infile "test_data.txt"
LRECL=226
ENCODING="LATIN1"
TERMSTR=CRLF
DLM='7F'x
MISSOVER
DSD ;
Input
ADDRESS_1 : $CHAR40.;
run;
Show your data lines.
Values containing delimiters or quotes should be quoted. So if the value of address_1
is 'S example,
then in the delimited file it should be represented as
"'S example,"
or
'''S example,'
The DSD
option on the INFILE
statement that recognizes adjacent delimiters as indicating a missing value will also remove the quotes from around quoted values.
You might be hitting a similar bug as in this report at SAS Communities. https://communities.sas.com/t5/Base-SAS-Programming/Importing-delimited-file-with-text-qualifiers/m-p/449215#M113034
In that case the values could be parsed from the data line using the scan()
function with the m
modifier without the q
modifier.
You could also try pre-processing the _INFILE_
variable to convert bare single quotes to "'"
. Here is a simple example.
data test2 ;
length var1-var5 $50 ;
infile cards dsd dlm='|' truncover firstobs=2;
input @;
_infile_=tranwrd(cats('|',_infile_,'|'),"|'|","|""'""|");
_infile_=substrn(_infile_,2,length(_infile_)-2);
input var1-var5 ;
cards4;
ID|Address_1|Address_2|Address_3|Email
2534|'S Road,||'|email2@GMAIL.COM
;;;;