r, utf-8, gbk

Why can't R and RStudio on Mac load RData files with Chinese content (created on Windows 10)?


I have been chasing this issue all day.

I downloaded the exercise materials for a textbook from: http://www.crup.com.cn/UploadFiles/jxkj/gsgl/243184/统计学基于R第二版例题和习题数据.rar

These RData files work fine in R and RStudio on Windows 10, but the Chinese characters are not displayed correctly on Mac.

RStudio on Windows 10:

link: screenshot 1

link: screenshot 2

RStudio on Mac:

The Chinese characters are garbled:

link: screenshot 3

R console on Mac:

The Chinese characters are garbled:

link: screenshot 4

I have searched many websites for solutions, but most of them are about fixing this problem when importing ".csv" files. My question is how to load RData into R without the Chinese characters being garbled.

Some answers suggest switching the "Default text encoding" under "Global Options" in RStudio to "UTF-8", but I have checked RStudio on both Mac and Windows 10, and both are already set to "UTF-8".

So I really don't know what the real problem is.
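One way to narrow this down from the console is sketched below. It assumes the strings were saved in a Chinese Windows locale (GBK/CP936); the sample bytes are a hypothetical stand-in for a value from the file, not data read from it. `Encoding()`, `validUTF8()` and `iconv()` are base R functions. As far as I can tell, RStudio's "Default text encoding" option applies to source files, so it would not be expected to change how strings stored inside an RData file are interpreted.

```r
# Hypothetical sample: four bytes that spell a two-character Chinese word in GBK,
# standing in for a string loaded from a Windows-saved RData file.
x <- rawToChar(as.raw(c(0xd6, 0xb8, 0xb1, 0xea)))

# What the current session assumes about text
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info())

# The string carries no encoding mark, and its bytes are not valid UTF-8
print(Encoding(x))
print(validUTF8(x))

# Re-encoding from GBK (an assumption about the source) should make it readable
fixed <- iconv(x, from = "GBK", to = "UTF-8")
print(fixed)
```

If `validUTF8()` returns `FALSE` for the loaded strings while the session locale is UTF-8, that would point to the data itself being in another encoding rather than to an RStudio setting.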


Solution

  • Perhaps there's a better solution that works globally, but one way is to convert the encodings for each object separately:

    load("~/Downloads/exercise1_1.RData")
    exercise1_1[, 1:3]
    #                    ָ\xb1\xea X2008\xc4\xea X2009\xc4\xea
    # 1    \xb5\xcd\xca\xd5\xc8뻧          1500          1549
    # 2      \xd6е\xc8ƫ\xcf»\xa7          2935          3110
    # 3  \xd6е\xc8\xca\xd5\xc8뻧          4203          4502
    # 4 \xd6е\xc8ƫ\xc9\u03fb\xa7          5929          6468
    # 5    \xb8\xdf\xca\xd5\xc8뻧         11290         12319
    
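    # Column names are stored as GB2312 bytes; re-encode them as UTF-8
    names(exercise1_1) <- iconv(names(exercise1_1), from = "GB2312", to = "UTF-8")
    # Turn factors into character vectors so their values can be re-encoded
    exercise1_1 <- lapply(exercise1_1, function(x) if (is.factor(x)) as.character(x) else x)
    # Re-encode every character column and rebuild the data frame
    exercise1_1 <- data.frame(lapply(exercise1_1, function(x) {
      if (is.character(x)) {
        iconv(x, from = "GB2312", to = "UTF-8")
      } else {
        x
      }
    }))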
    
    exercise1_1[, 1:3]
    #         指标 X2008年 X2009年
    # 1   低收入户    1500    1549
    # 2 中等偏下户    2935    3110
    # 3 中等收入户    4203    4502
    # 4 中等偏上户    5929    6468
    # 5   高收入户   11290   12319
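The per-object conversion above could be wrapped into a helper that handles every data frame in a file, so it does not have to be repeated for each exercise. This is only a sketch: `gbk_df_to_utf8` and `load_gbk_rdata` are hypothetical names, and the default `from = "GBK"` is an assumption (GBK is a superset of GB2312, so it should cover the same data). Unlike the code above, it keeps factors as factors and converts their levels instead.

```r
# Convert the column names, character columns, and factor levels of one
# data frame to UTF-8. `from` is an assumed source encoding.
gbk_df_to_utf8 <- function(df, from = "GBK") {
  conv <- function(x) iconv(x, from = from, to = "UTF-8")
  names(df) <- conv(names(df))
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) {
      levels(col) <- conv(levels(col))   # keep the factor, fix its labels
    } else if (is.character(col)) {
      col <- conv(col)
    }
    col
  })
  df
}

# Load an RData file into a private environment and return its objects as a
# named list, converting every data frame found in it.
load_gbk_rdata <- function(path, from = "GBK") {
  env <- new.env()
  load(path, envir = env)
  objs <- mget(ls(env), envir = env)
  lapply(objs, function(obj) {
    if (is.data.frame(obj)) gbk_df_to_utf8(obj, from = from) else obj
  })
}
```

With the file from the question, usage would look like `objs <- load_gbk_rdata("~/Downloads/exercise1_1.RData")` followed by `objs$exercise1_1[, 1:3]`. Objects that are not data frames are returned unchanged, so character vectors or lists in a file would still need separate handling.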