I have a large SAS dataset that I am trying to narrow down to specific counties in the US. The variables I have are 'state_FIPS_code' and 'county_FIPS_code'. Each state has a unique FIPS code (e.g. California FIPS = 006, New York FIPS = 036). However, counties in different states can share the same county FIPS code (e.g. Autauga County, Alabama FIPS = 001, and Kent County, Delaware FIPS = 001).
So, in order to narrow down the dataset to the counties I need, I first thought of narrowing down the states and then narrowing down the counties. The problem, however, is that unwanted counties would be included as well. For example, if I narrow it down to Florida, Alabama, and Texas, and then narrow down the county FIPS codes to 1, 3, and 5, the result might include county #5 in Florida even though I only want the one in Texas.
I have been struggling with this issue for weeks and can't seem to figure it out. I'm trying to find the most efficient way to do this without having to write 500+ lines of code. Thank you in advance for your help.
The full FIPS codes for counties already include the FIPS code for the state. See https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt
Examples:
county-level    place
FIPS code       name
------------    --------------
01000           Alabama
01001           Autauga County
01003           Baldwin County
...
02000           Alaska
02013           Aleutians East Borough
02016           Aleutians West Census Area
...
If for some reason your dataset has stored only the last 3 digits of the county codes, you can combine them with the state codes to get the full 5-digit FIPS county code.
So if your variables are already CHARACTER strings then use
where cats(state_FIPS_code,county_FIPS_code) in
('01001'
,'01003'
,'02013'
,'02016'
)
If your variables are numeric instead then first use the PUT() function to convert them to strings.
where put(state_FIPS_code,Z2.)||put(county_FIPS_code,Z3.) in ...
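As a rough sketch, the numeric case could look like this inside a complete DATA step. The dataset names have and want, the new variable fips, and the four codes in the list are only illustrative; this assumes both FIPS variables are numeric. Because fips is created inside the step, a subsetting IF is used rather than WHERE.

```sas
/* Sketch only: assumes HAVE holds numeric state_FIPS_code and county_FIPS_code */
data want;
  set have;
  length fips $5;   /* hypothetical helper variable for the combined code */
  /* Z2. and Z3. pad with leading zeros, so state 1 and county 3 give '01003' */
  fips = put(state_FIPS_code, z2.) || put(county_FIPS_code, z3.);
  /* keep only the state/county combinations that are wanted */
  if fips in ('01001', '01003', '02013', '02016');
run;
```

Since the comparison is made on the combined 5-digit code, county 003 in one state can never be confused with county 003 in another.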
Normally to avoid typing a long list of codes like that you would put the list into a dataset and then just combine the two datasets.
If both datasets are already sorted by state_FIPS_code and county_FIPS_code, then a simple MERGE is the easiest. For example, here is what such a merge would look like when the county codes are split into two variables:
data want ;
merge have(in=in1) county_list(in=in2);
by state_FIPS_code county_FIPS_code;
if in1 and in2;
run;
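If the data is not sorted, one possible alternative is a PROC SQL inner join, which does not require sorted input. The sketch below first builds a small lookup dataset and then joins on both variables. It assumes the two FIPS variables are numeric in both datasets, and the rows in county_list are made up for illustration.

```sas
/* Hypothetical lookup table: one row per wanted state/county pair */
data county_list;
  input state_FIPS_code county_FIPS_code;
datalines;
1 1
1 3
2 13
2 16
;
run;

/* Inner join keeps only rows of HAVE whose state/county pair is in the list */
proc sql;
  create table want as
  select h.*
  from have as h
       inner join county_list as c
       on  h.state_FIPS_code  = c.state_FIPS_code
       and h.county_FIPS_code = c.county_FIPS_code;
quit;
```

Each state/county pair should appear only once in county_list; a duplicated pair would duplicate the matching rows in the result.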