"file (...).csv not Stata format" error when using a .csv file in the merge command


I use Stata 12.

I want to add some country code identifiers from the file df_all_cities.csv to my working data.

However, this line of code:

merge 1:1 city country using "df_all_cities.csv", nogen keep(1 3) 

gives me this error:

. run "/var/folders/jg/k6r503pd64bf15kcf394w5mr0000gn/T//SD44694.000000"
file df_all_cities.csv not Stata format
r(610);

This is an attempted workaround for an earlier problem: the file was originally a .dta file that this version of Stata could not open, so I used R to convert it to .csv, but that does not work either. I assume this is because "using" in the merge command does not work with .csv files. How should I write it instead?


Solution

  • Your intuition is right. The merge command cannot read a .csv file directly; it only accepts Stata-format (.dta) files. (Technically, using is not a command here. It is a common syntax keyword indicating that a file path follows.)

    You need to read the .csv file with the insheet command first, save it in .dta format, and then merge. You can do it like this:

    * Preserve saves a snapshot of your data which is brought back at "restore"
    preserve 
        
        * Read the csv file. clear can safely be used as data is preserved
        insheet using "df_all_cities.csv", clear
        
        * Create a tempfile where the data can be saved in .dta format
        tempfile country_codes
        save `country_codes'
    
    * Bring back into working memory the snapshot saved at "preserve"
    restore
    
    * Merge your country codes from the tempfile to the data now back in working memory
    merge 1:1 city country using `country_codes', nogen keep(1 3) 
    

    Note that insheet also takes using, and that command does accept .csv files.
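
If the same lookup file is merged repeatedly, a one-time conversion to a permanent .dta file may be simpler than rebuilding the tempfile each time. The following is a minimal sketch, assuming the .csv is comma-separated and its first row holds the variable names city and country. The file names df_all_cities_v12.dta and my_working_data.dta are hypothetical placeholders, not names from the question.

```stata
* One-time conversion (output file name is a hypothetical placeholder)
* comma: treat the file as comma-separated; names: first row holds variable names
insheet using "df_all_cities.csv", comma names clear

* Stop with an error here if city and country do not uniquely identify rows,
* which merge 1:1 requires of both datasets
isid city country

* Save in this Stata version's own .dta format
save "df_all_cities_v12.dta", replace

* Later, in the analysis do-file: load the working data (hypothetical name) and merge
use "my_working_data.dta", clear
merge 1:1 city country using "df_all_cities_v12.dta", nogen keep(1 3)
```

The isid check is optional, but it surfaces duplicate city/country pairs in the lookup file before merge 1:1 fails on them.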