I have a large RDF file (in the gigabytes) that I'd like to import into a remote graph database.
The database exposes a Graph Store Protocol endpoint over the RDF4J API. Of the many ingest routes the database supports, the only one acceptable in my scenario is this endpoint (posting to /statements).
The database is hosted on a cloud provider, and various infrastructure layers (web server, application container) impose upload limits, so I can't just post the file.
Using dotNetRDF, how can I load a lot of RDF into a remote database over Graph Store in chunks?
WriteToStoreHandler writes RDF directly to a storage provider in batches of a configurable size.
SesameHttpProtocolConnector is a storage provider that supports the RDF4J API, which includes the Graph Store Protocol.
Combining the two streams the parsed triples to the remote repository one batch at a time:
using System.IO;
using VDS.RDF.Parsing;
using VDS.RDF.Parsing.Handlers;
using VDS.RDF.Storage;

var path = "large-rdf-file-path.nt";
var server = "http://example.com/sparql-protocol-base-url";
var repository = "repository-name";
// Triples per batch; 10 is deliberately small for illustration, tune it against your upload limit.
var batchSize = 10;

using (var connector = new SesameHttpProtocolConnector(server, repository))
{
    // Collects parsed triples and writes each full batch to the connector.
    var handler = new WriteToStoreHandler(connector, batchSize);
    using (var reader = File.OpenText(path))
    {
        // Stream the file through the handler so the whole graph never sits in memory.
        var parser = new NTriplesParser();
        parser.Load(handler, reader);
    }
}
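Each batch is written to the store as it fills, which for this connector means a separate HTTP request per batch, so in practice you would raise batchSize well above 10 while keeping each batch's payload under the infrastructure's upload limit. If the file isn't N-Triples, the same handler works with any other dotNetRDF parser; here's a minimal sketch assuming a Turtle file (the .ttl path is illustrative, not from your setup):

using System.IO;
using VDS.RDF.Parsing;
using VDS.RDF.Parsing.Handlers;
using VDS.RDF.Storage;

// Hypothetical Turtle input; pick the parser that matches your serialization.
var path = "large-rdf-file-path.ttl";

using (var connector = new SesameHttpProtocolConnector("http://example.com/sparql-protocol-base-url", "repository-name"))
using (var reader = File.OpenText(path))
{
    // Same batching handler as above, just fed by a different parser.
    var handler = new WriteToStoreHandler(connector, 10);
    new TurtleParser().Load(handler, reader);
}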