I want to build a parser for fixed-position text files.
I want it to be dynamic, so that I can pass in an external configuration file describing the format of the file to be parsed.
Example of a configuration file the application would load:
Field; Position
Name;0-20
Surname;21-40
Age;40-42
Sex;42-43
...
Example of a file to parse:
John William Hoover23M
Deborah Foobar33F
...
While googling I found a lot of libraries for parsing fixed-length files.
The problem is that all of them rely on creating classes with annotated fields that specify the fixed positions in the text file.
I want to make a generic parser, so these classes would have to be generated and annotated automatically from an external configuration file.
Do you know of any library, or a different approach, that I could follow?
I'm talking about parsing relatively big files, around 500 MB, so efficiency and speed are also important factors.
Thank you all!
You don't need to "parse" the big file. You only need to extract substrings at the given positions.
1. Parse the "format" file with a classic regex and store the field names and positions in an array. Time doesn't matter there, since that file is tiny.
2. Open your big file, read it line by line, and extract the substrings at the positions you want. That is about the fastest you can do.
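A minimal sketch of these two steps, in Python because the question does not name a language. The function names load_layout and parse_records are made up for illustration. It assumes the positions in the configuration file are 0-based with an exclusive end, which the question does not actually specify, and the sample row is padded by hand to fit that layout.

```python
import re

# Matches lines such as "Surname;21-40". The header line "Field; Position"
# and filler lines like "..." do not match and are skipped.
FIELD_RE = re.compile(r"^\s*(\w+)\s*;\s*(\d+)\s*-\s*(\d+)\s*$")


def load_layout(config_lines):
    """Step 1: turn 'Name;0-20' style lines into (field name, slice) pairs."""
    layout = []
    for line in config_lines:
        match = FIELD_RE.match(line)
        if match:
            name, start, end = match.groups()
            # Assumption: start is 0-based and end is exclusive.
            layout.append((name, slice(int(start), int(end))))
    return layout


def parse_records(data_lines, layout):
    """Step 2: lazily yield one dict per line, so the file is never fully in memory."""
    for line in data_lines:
        line = line.rstrip("\r\n")
        yield {name: line[pos].strip() for name, pos in layout}


if __name__ == "__main__":
    config = ["Field; Position", "Name;0-20", "Surname;21-40", "Age;40-42", "Sex;42-43"]
    layout = load_layout(config)
    # Hypothetical sample row, padded so each value sits at its configured position.
    row = "John William".ljust(21) + "Hoover".ljust(19) + "23" + "M"
    for record in parse_records([row], layout):
        print(record)
```

For a real file you would pass the open file object instead of a list, for example `with open(path) as f: for record in parse_records(f, layout): ...`, so that lines are read one at a time. If the end positions in your format turn out to be inclusive, use `int(end) + 1` when building the slice.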