I have the following data:
structure(list(QB5B_2 = structure("Car les GAFA sont des sociétés Américaines et de plus les gouvernements qui composent l'Union Européenne ne sont pas d'accord entre elles sur la stratégie à adopter en ce qui les concerne . Exemple les Gafa payent des impots en Irlande car leurs si<ef>", label = "test", format.spss = "A255", display_width = 0L)), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame"))
When I look at this data in RStudios View pane, it looks like proper French text:
View(problem)
shows:
However, when looking at the data in the console it gives me:
# A tibble: 1 x 1
QB5B_2
<chr>
1 "Car les GAFA sont des soci\xc3\xa9t\xc3\xa9s Am\xc3\xa9ricaines et de plus les gouvernements qui composent l'Union Eu~
So it's clear there is some character encoding problem.
Now, when I try to export the file to Excel with:
library(writexl)
write_xlsx(problem, "test.xlsx")
it does the exporting but I can't open the file in Excel and instead get an error message that a problem has been encountered. Side note: I can import the Excel file perfectly fine with e.g. readxl::read_xlsx("test.xlsx")
So two questions:
\xc3\
things in the data.Something is quite strange since your input shows a double quote before the text, which usually does not happen when displaying the content of a character-column in a tibble. Look right after the "1":
# A tibble: 1 x 1
QB5B_2
<chr>
1 "Car les GAFA sont des soci\xc3\xa9t\xc3\xa9s Am\xc3\xa9ricaines et de plus les gouvernements qui composent l'Union Eu~
Perhaps a solution is to reencode the variable using iconv()
:
problem$QB5B_2 <- iconv(problem$QB5B_2, sub = "byte")
problem
# A tibble: 1 x 1
QB5B_2
<chr>
1 Car les GAFA sont des sociétés Américaines et de plus les gouvernements qui composent l'Union Européenne ne sont pas …
Another would be to remove the first character:
problem$QB5B_2 <- str_remove(problem$QB5B_2, pattern = "$.")
problem
# A tibble: 1 x 1
QB5B_2
<chr>
1 Car les GAFA sont des sociétés Américaines et de plus les gouvernements qui composent l'Union Européenne ne sont pas …
This does not show how to avoid the issue in the first place, but it should sort you out.
One difficulty in doing the debugging here is that dput()
, which you probably used to replicate the content does not keep the problem...