I have a text file downloaded from the BLS website that has a lot of spaces in between columns.
Code:
data unemployment;
infile 'P:\Projects\la.data.2.AllStatesU.txt' dsd firstobs=2;
input @1 series_id : $20.
@32 year
@36 period : $3.
@51 value
@57 footnote_codes : $1.;
run;
But I get a mess of errors
NOTE: Invalid data for year in line 2 32-53.
NOTE: Invalid data for value in line 2 51-53.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9--
3 CHAR LAUST010000000000003 .1976.M02. 7.7. 53
ZONE 44555333333333333333222222222203333043302222222223230
NUMR C15340100000000000030000000000919769D0290000000007E79
The period column has the first two characters right, but the year and everything else is wrong. How do I fix this?
Snapshot of text file:
code output:
The file from that website
filename bls url
"https://download.bls.gov/pub/time.series/la/la.data.2.AllStatesU"
;
has tab characters in it. That is shown in the example you posted of line 3 from the SAS LOG.
You can either tell the INFILE statement to expand the tabs into spaces and read it as fixed column format.
data unemployment;
infile bls expandtabs firstobs=2 truncover;
input
series_id $ 1-20
year 33-36
period $ 41-43
value ?? 50-60
footnote_codes $ 65
;
run;
Or tell it that the tab character is the delimiter.
data unemployment;
infile bls dlm='09'x dsd firstobs=2 truncover;
input
series_id :$20.
year
period :$3.
value ??
footnote_codes :$1.
;
run;
Note: The ??
modifier for VALUE is because the file has a hyphen to represent missing values in that field. The ??
input modifier will tell the data step to not flag those as data errors.