I need to parse GEDCOM 5.5 files for an analysis project. The first Ruby parser I found causes a "stack level too deep" error, so I tried to find alternatives. I found this project: https://github.com/jslade/gedcom-ruby
There are some samples included, but I can't get them to work either.
Here is the parser itself: https://github.com/jslade/gedcom-ruby/blob/master/lib/gedcom.rb
If I try the sample like this:
ruby ./samples/count.rb ./samples/royal.ged
I get the following error:
D:/rails_projects/gedom_test/lib/gedcom.rb:185:in `readchar': end of file reached (EOFError)
I added a "puts" to every method for better understanding; this is the output up to the point where the exception is raised:
Parsing './samples/royal.ged'...
INIT
BEFORE
CHECK_PROC_OR_BLOCK
BEFORE
CHECK_PROC_OR_BLOCK
PARSE
PARSE_FILE
PARSE_IO
DETECT_RS
The exact line that causes the trouble is
while ch = io.readchar
in the detect_rs method:
# valid gedcom may use either of \r or \r\n as the record separator.
# just in case, also detects simple \n as the separator as well
# detects the rs for this string by scanning ahead to the first occurrence
# of either \r or \n, and checking the character after it
def detect_rs io
  puts "DETECT_RS"
  rs = "\x0d"
  mark = io.pos
  begin
    while ch = io.readchar
      case ch
      when 0x0d
        ch2 = io.readchar
        if ch2 == 0x0a
          rs = "\x0d\x0a"
        end
        break
      when 0x0a
        rs = "\x0a"
        break
      end
    end
  ensure
    io.pos = mark
  end
  rs
end
I hope someone can help me with this.
The readchar method of Ruby's IO class will raise an EOFError when it encounters the end of the file. http://www.ruby-doc.org/core-2.1.1/IO.html#method-i-readchar
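A small illustration of that behavior, assuming Ruby 1.9 or later and using StringIO as a stand-in for the opened file. The helper names drain_with_readchar and drain_with_getc are made up for this example. getc is shown for comparison, since it returns nil at end of file instead of raising:

```ruby
require 'stringio'

# Reads characters with readchar until it raises. Returns the characters
# read and true, because EOFError is what ended the loop.
def drain_with_readchar(io)
  chars = []
  begin
    loop { chars << io.readchar }   # Kernel#loop does not swallow EOFError
  rescue EOFError
    return [chars, true]
  end
end

# Same idea with getc, which returns nil at end of file instead of raising,
# so the while condition ends the loop on its own.
def drain_with_getc(io)
  chars = []
  while (ch = io.getc)
    chars << ch
  end
  chars
end

p drain_with_readchar(StringIO.new("ab"))
p drain_with_getc(StringIO.new("ab"))
```

So a bare "while ch = io.readchar" can never finish by reaching the end of the input; it either breaks out early or raises.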
The gedcom-ruby gem hasn't been touched in years, but a fork of it was made a couple of years ago to fix this very problem.
Basically it changes:
while ch = io.readchar
to
while !io.eof && ch = io.readchar
You can get the fork of the gem here: https://github.com/trentlarson/gedcom-ruby
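If you would rather patch your local copy, here is a rough sketch of detect_rs with that guard applied. It is not the code from either repository. It also assumes Ruby 1.9 or later, where readchar returns a one-character String rather than an Integer, so the sketch compares against "\x0d" and "\x0a" instead of 0x0d and 0x0a (on 1.9+ the integer comparisons in the original never match, which would explain why the loop runs all the way to the end of the file):

```ruby
require 'stringio'

# Hypothetical variant of detect_rs, written for illustration only.
def detect_rs(io)
  rs = "\x0d"                       # default separator, as in the original
  mark = io.pos
  begin
    while !io.eof? && (ch = io.readchar)
      case ch
      when "\x0d"
        # Look at the next character, if there is one, to tell \r from \r\n.
        rs = "\x0d\x0a" if !io.eof? && io.readchar == "\x0a"
        break
      when "\x0a"
        rs = "\x0a"
        break
      end
    end
  ensure
    io.pos = mark                   # restore the position for the real parse
  end
  rs
end

p detect_rs(StringIO.new("0 HEAD\r\n1 SOUR TEST\r\n"))
```

The second eof? check covers a file that ends right after a lone \r, which would otherwise raise the same EOFError from the inner readchar.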