import rstudio r-haven

What are those red lines with dots in the RStudio editor?


When importing data from SPSS with haven's read_sav, I get some values like "1.     Tout à fait d?accord" or "2.     Plutôtd?accord". These values appear to have whitespace between the period and the text, but it is not "normal" whitespace: matching on ordinary spaces does not work (e.g. %>% filter(values == "1. Tout à fait d?accord")).

A friend told me that opening the data in Stata and then copying the lines into the RStudio editor produced these weird red characters (shown in the image), and that the code now works. How can I read these data into R properly without going through Stata?


Solution

  • Are you sure that these are 4 spaces? It is quite likely that the data contains tab characters rather than spaces, and R treats the two as different characters. For example, type the following into a text editor, pressing Tab between the words:

    "I'm    tab    separated"
    

    Then copy it to the console to assign it in R:

    TestString <- "I'm    tab    separated"
    

    To check whether it is the same as 4 spaces, we can run the following comparisons:

    # Compare against a string typed with 4 literal spaces
    TestString == "I'm    tab    separated"
    #> [1] FALSE
    
    # Compare against a string using tab escapes (\t)
    TestString == "I'm\ttab\tseparated"
    #> [1] TRUE
    

    Knowing this, you can either replace the tabs in the data or match them explicitly when subsetting (which is probably better).
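
Both options can be sketched in base R. This is only an illustration under stated assumptions: the data frame `df` and its column `values` are hypothetical stand-ins for the imported SPSS data, and the separator is assumed to be a tab as suggested above. Inspecting the code points first helps confirm what the separator actually is, since a tab is code point 9 and a regular space is 32.

```r
# Hypothetical stand-in for the imported data; the separator is ASSUMED to be a tab.
df <- data.frame(
  values = c("I'm\ttab\tseparated", "I'm space separated"),
  stringsAsFactors = FALSE
)

# Inspect the code points of the first value: 9 is a tab, 32 is a regular space.
code_points <- utf8ToInt(df$values[1])
tab_positions <- which(code_points == 9L)

# Option 1: match the tab explicitly when subsetting.
matched <- df[df$values == "I'm\ttab\tseparated", , drop = FALSE]

# Option 2: collapse runs of tabs and spaces into one space, then compare normally.
clean <- gsub("[ \t]+", " ", df$values)
```

If the inspection shows a code point other than 9, the character class in `gsub` would need to include that character instead.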