r delimiter read.table psse

Parsing a text file containing multiple sections in R


I have a text file containing 130 tables, separated by the delimiter ' DLM'. I tried using the package reader in R and defined the default delimiter as follows, but it still reads the whole file.

reader::reader("Path_to_file.txt", def= "\\' DLM'", one.byte = FALSE)

Is it possible to parse the file and read only the lines that are specific to the table name? For example, if I specify 'B2', can I read only the rows of table B2? I can't seem to get around the delimiter issue in reader. Any help is appreciated!

Sample dataset:

'A1',2018,10,'655033655206 1',,,81,
'A1',2019,4,'655033655206 1',,,63,
'A1',2011,1,'655034655045 1',.03486,.05829,52,


' DLM','B2',2011,1,'5BON AQUA TP',361239,161,,,0,
'B2',2001,1,'5BON AQUA TN',361240,161,22.7,4.97,0,
'B2',2002,1,'5CON FIRE TN',363240,161,22.7,4.97,0,


' DLM','C1','CGDF09',
'C1','W XYZ',
'C1','A BCD',

Solution

  • Maybe try removing the ' DLM' delimiter and then checking which lines start with 'B2'? The stringi package has a function for the prefix check:

    library(stringi)
    stri_startswith_fixed(c("A1,1,2,3","B2,3,4,5","C2,3,,5"), "B2")
    # [1] FALSE  TRUE FALSE
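
Applied to the sample data from the question, that idea could look roughly like the sketch below, which uses only base R. It assumes the ' DLM', marker always sits at the start of a line, directly in front of the first row of the next table, as in the sample. read_table_section is a hypothetical helper name, not a function from reader or stringi.

```r
# Hypothetical helper (not from any package): return the rows of one table
# as a data frame, or NULL if no row belongs to that table.
read_table_section <- function(lines, table_name) {
  # Drop a leading ' DLM', marker so every row starts with its table name.
  cleaned <- sub("^' DLM',", "", lines)
  # Keep only rows whose first field is the quoted table name, e.g. 'B2',
  prefix <- paste0("'", table_name, "',")
  wanted <- cleaned[startsWith(cleaned, prefix)]
  if (length(wanted) == 0) return(NULL)
  # Parse the remaining rows as CSV, treating single quotes as field quotes.
  read.csv(text = wanted, header = FALSE, quote = "'",
           stringsAsFactors = FALSE)
}

# Sample rows copied from the question; for the real file one would use
# lines <- readLines("Path_to_file.txt") instead.
lines <- c(
  "'A1',2018,10,'655033655206 1',,,81,",
  "'A1',2019,4,'655033655206 1',,,63,",
  "'A1',2011,1,'655034655045 1',.03486,.05829,52,",
  "",
  "' DLM','B2',2011,1,'5BON AQUA TP',361239,161,,,0,",
  "'B2',2001,1,'5BON AQUA TN',361240,161,22.7,4.97,0,",
  "'B2',2002,1,'5CON FIRE TN',363240,161,22.7,4.97,0,",
  "",
  "' DLM','C1','CGDF09',",
  "'C1','W XYZ',",
  "'C1','A BCD',"
)

b2 <- read_table_section(lines, "B2")
print(b2)
```

Because every row in the sample ends with a comma, read.csv adds one extra, empty last column; it can be dropped afterwards if it is not needed.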