jena tdb

Loading large, compressed RDF file into Jena TDB2 without decompression


I am trying to load a large RDF file (several hundred million triples) into a Jena TDB2 database. Fuseki with the Graph Store Protocol does not work because the file is too large for clients. The file is compressed N-Triples, roughly 20 times smaller than the uncompressed N-Triples file. Is it possible to load the data while decompressing it on the fly? I tried a named pipe (via shell process substitution), but this does not work:

$ tdb2.tdbloader --loc $DB <(zcat rdf.nt.gz)
Can't read file : /dev/fd/63

Solution

  • As mentioned by AndyS, compressed files can be passed directly to the Jena TDB2 command line tools, so this example works:

    $ tdb2.tdbloader --loc $DB rdf.nt.gz
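
To try this on a small scale before committing to a long load, something like the following sketch may help. It builds a tiny gzip-compressed N-Triples file, checks that the archive is intact, and passes it to the loader only if the Jena tools are on the PATH. The names sample.nt.gz and ./sample-db and the example.org triples are made up for illustration. As far as I understand, Jena recognises the compression from the .gz file extension, so the file should keep its .nt.gz name.

```shell
# Build a tiny gzip-compressed N-Triples file to experiment with.
# sample.nt.gz and ./sample-db are hypothetical names, not from the question.
printf '%s\n' \
  '<http://example.org/s> <http://example.org/p> "one" .' \
  '<http://example.org/s> <http://example.org/p> "two" .' \
  | gzip -c > sample.nt.gz

# Check that the archive is intact before starting a load.
gzip -t sample.nt.gz

# Load the compressed file directly, but only if the Jena
# command line tools are installed and on the PATH.
if command -v tdb2.tdbloader >/dev/null 2>&1; then
  tdb2.tdbloader --loc ./sample-db sample.nt.gz
fi
```

For the real file, running gzip -t rdf.nt.gz first is a cheap way to catch a truncated download before the loader has spent hours on it.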