I have loaded linkedmdb-18-05-2009-dump.nt successfully with Apache Jena in Java, but loading the same file with dotNetRDF throws the following exception:
VDS.RDF.Parsing.RdfParseException
HResult=0x80131500
Message=Invalid URI encountered, see inner exception for details
Source=dotNetRDF
StackTrace:
at VDS.RDF.Parsing.NTriplesParser.TryParseUri(TokenisingParserContext context, String uri)
at VDS.RDF.Parsing.NTriplesParser.TryParseTriple(TokenisingParserContext context)
at VDS.RDF.Parsing.NTriplesParser.Parse(TokenisingParserContext context)
at VDS.RDF.Parsing.NTriplesParser.Load(IRdfHandler handler, TextReader input)
at ConsoleApp2_RDFWALKTHROUGH.Program.Main(String[] args) in
This exception was originally thrown at this call stack:
[External Code]
Inner Exception 1:
UriFormatException: Invalid URI: The hostname could not be parsed.
My C# code is as follows:
String inputFile = "D:/linkedmdb-18-05-2009-dump.nt";
IGraph g = new Graph();
NTriplesParser parser = new NTriplesParser(NTriplesSyntax.Original);
Console.WriteLine("RDF DS-1 Loading Started:");
parser.Load(g, new StreamReader(inputFile));
Console.WriteLine("RDF DS-1 Loading Finished:");
Console.WriteLine(new DateTime(loadingTime).ToShortTimeString());
Console.ReadLine();
Can anyone point out what I am doing wrong? It is confusing that the same file loads fine in Java but fails to parse in dotNetRDF.
The problem is that the dump contains an invalid IRI. At line 3104575 in the dump I downloaded from https://www.cs.toronto.edu/~oktie/linkedmdb/ there is the following:
<http://data.linkedmdb.org/film/9995> <http://xmlns.com/foaf/0.1/page> <http://?> .
The last IRI on that line is the one that is causing the parser to choke, as ? is not a valid character at that position in an IRI.
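One possible workaround is to pre-filter the dump before handing it to the parser. The sketch below is not part of dotNetRDF: NTriplesIriFilter, HasOnlyValidIris and WriteCleanCopy are hypothetical names, and it assumes that an IRI rejected by System.Uri.TryCreate is also one the dotNetRDF parser would reject (the inner UriFormatException in the question suggests this, but I have not verified it for every case). It strips quoted literals from each line, checks every remaining <...> token, and copies only the lines that pass.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper (not a dotNetRDF API): copies an N-Triples file and
// drops every line containing an IRI that System.Uri cannot parse.
public static class NTriplesIriFilter
{
    // Quoted literals, including escaped quotes, so that text such as
    // "a <b> c" inside a literal is not mistaken for an IRI.
    private static readonly Regex LiteralPattern =
        new Regex(@"""(?:[^""\\]|\\.)*""");

    // IRI references of the form <...>.
    private static readonly Regex IriPattern = new Regex(@"<([^<>\s]*)>");

    // True when every IRI on the line is accepted as an absolute URI.
    public static bool HasOnlyValidIris(string line)
    {
        string withoutLiterals = LiteralPattern.Replace(line, "\"\"");
        foreach (Match m in IriPattern.Matches(withoutLiterals))
        {
            if (!Uri.TryCreate(m.Groups[1].Value, UriKind.Absolute, out _))
                return false;
        }
        return true;
    }

    // Writes the accepted lines to outputPath, reports the rejected ones
    // on stderr, and returns the number of lines skipped.
    public static int WriteCleanCopy(string inputPath, string outputPath)
    {
        int skipped = 0;
        int lineNumber = 0;
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;
                if (HasOnlyValidIris(line))
                {
                    writer.WriteLine(line);
                }
                else
                {
                    skipped++;
                    Console.Error.WriteLine($"Skipping line {lineNumber}: {line}");
                }
            }
        }
        return skipped;
    }
}
```

To use it, call NTriplesIriFilter.WriteCleanCopy with the path of the original dump and a path of your choosing for the cleaned copy, then pass the cleaned copy to parser.Load exactly as in the question. The skipped triples are lost, so if they matter it may be better to correct the offending line in the dump by hand instead.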