Using readr to read a TSV where some fields contain quotes: after writing it back out, there is an extra set of quotes


I have a TSV where one of the fields is a string representation of an array. It looks something like this:

A    B
1    ["hello", "to", "you"]
2    ["some"]
3    ["stuff", "blah"]

I'm trying to read it in using readr:

library(readr)
df = read_tsv('file.tsv', quote = '\"')

Then I'm writing it out using write_tsv(df, 'out.tsv')

The problem is that when I open out.tsv, the result is the following:

A    B
1    "[""hello"", ""to"", ""you""]"
2    "[""some""]"
3    "[""stuff"", ""blah""]"

When reading the file, I have tried quote = '', quote = '\"' and quote = '\\"' in read_tsv. write_tsv doesn't have a quote parameter, so I can't set anything there. How do I read this file and write it back out so that the extra set of quotes isn't written?


Solution

  • write_csv now has the escape argument (quote_escape has been deprecated). escape can take one of three options:

    • "double" - quotes are escaped by doubling them.

    • "backslash" - quotes are escaped by a preceding backslash.

    • "none" - quotes are not escaped.

    For more information, read the documentation.
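
Applied to the file from the question, the round trip might look like the sketch below. It assumes readr 2.0 or later, where write_tsv accepts the same escape argument as write_csv. The quote = "none" argument is not mentioned in the answer above; it is added here on the assumption that escape only controls how embedded quotes are escaped, so a field containing quotes may still be wrapped in an outer pair unless quoting is switched off as well. The temporary file paths are placeholders for file.tsv and out.tsv.

```r
library(readr)

# Recreate the sample input from the question in a temporary file
# (stands in for 'file.tsv').
in_path <- tempfile(fileext = ".tsv")
writeLines(
  c(
    "A\tB",
    "1\t[\"hello\", \"to\", \"you\"]",
    "2\t[\"some\"]",
    "3\t[\"stuff\", \"blah\"]"
  ),
  in_path
)

# quote = "" disables quote handling on read, so column B keeps its
# literal double quotes.
df <- read_tsv(in_path, quote = "", show_col_types = FALSE)

# Write back out without wrapping fields in quotes (quote = "none")
# and without doubling the embedded quotes (escape = "none").
# Assumption: readr >= 2.0, where both arguments exist on write_tsv.
out_path <- tempfile(fileext = ".tsv")
write_tsv(df, out_path, quote = "none", escape = "none")

# Inspect the result.
cat(readLines(out_path), sep = "\n")
```

With quoting and escaping both turned off, nothing protects a field that happens to contain a tab or a newline, so this is only suitable when the data is known not to contain the delimiter.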