docker jena fuseki triplestore

How can I add a new dataset to Apache Fuseki using the command line?


I'm following the instructions for this Docker image, which describe how to set up a containerized RDF triplestore using Apache Fuseki. I think I can automate every step for my dataset with a Dockerfile, except one: under "recognizing the dataset in Fuseki," you have to open the GUI and add a new dataset there. Since I'd eventually like to automate the whole process, I'm looking for a command-line way to add a dataset. It doesn't need to be anything fancy, just add a new dataset with a given name, like "db". Is there a way to do that? (And also, I guess, a way to run that command in the Docker container?)


Solution

  • Here is what you need to do:

    (1) Start your container with docker run -p 3030:3030 -it stain/jena-fuseki.

    (2) Find your container's ID (written as $$$ in the commands below) with docker ps.

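    (3) Copy a config.ttl file into your container with docker container cp config.ttl $$$:config.ttl. The value of fuseki:name is the dataset name that appears in the URL, so change "ds" to whatever name you want (for example "db"). An example config.ttl might look like this: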

    @prefix fuseki:  <http://jena.apache.org/fuseki#> .
    @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
    @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
    @prefix :        <#> .
    
    <#service1> rdf:type fuseki:Service ;
        fuseki:name                       "ds" ;       # http://host:port/ds
        fuseki:serviceQuery               "sparql" ;   # SPARQL query service
        fuseki:serviceQuery               "query" ;    # SPARQL query service (alt name)
        fuseki:serviceUpdate              "update" ;   # SPARQL update service
        fuseki:serviceUpload              "upload" ;   # Non-SPARQL upload service
        fuseki:serviceReadWriteGraphStore "data" ;     # SPARQL Graph store protocol (read and write)
        # A separate read-only graph store endpoint:
        fuseki:serviceReadGraphStore      "get" ;      # SPARQL Graph store protocol (read only)
        fuseki:dataset                   <#dataset> ;
        .
    
    <#dataset> rdf:type      tdb:DatasetTDB ;
        tdb:location "DB" ;
        # Query timeout on this dataset (1s, 1000 milliseconds)
        ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "1000" ] ;
        # Make the default graph be the union of all named graphs.
        ## tdb:unionDefaultGraph true ;
         .
    

    (4) Commit the container's changes to a new image with docker container commit $$$ stackoverflow/jena-fuseki:latest.

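    (5) Stop the first container (docker stop $$$) so that port 3030 is free, then start a new container from the committed image and point Fuseki at the config file: docker run -p 3030:3030 -it stackoverflow/jena-fuseki ./fuseki-server --config=/config.ttl.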

    (6) If you now go to http://localhost:3030 you should see your dataset.
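
Since the goal is automation, the steps above can be scripted. What follows is only a minimal sketch, not a tested recipe. The function names write_config and bake_image and the variables DATASET_NAME and RUN_DOCKER are made up for illustration, and the generated config is a trimmed version of the example above. The Docker part only runs when RUN_DOCKER=1 is set. It starts the first container detached and without publishing the port, which is an adaptation of step (1) so that nothing has to be typed interactively.

```shell
#!/bin/sh
# Sketch: generate a Fuseki config for a named dataset and bake it into an image.
# All names here (write_config, bake_image, DATASET_NAME, RUN_DOCKER) are
# illustrative assumptions, not part of Fuseki or of the Docker image.
set -e

DATASET_NAME="${DATASET_NAME:-db}"

# Write a minimal config file ($2) exposing a TDB dataset under the name $1.
write_config() {
    name="$1"
    out="$2"
    cat > "$out" <<EOF
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .

<#service1> rdf:type fuseki:Service ;
    fuseki:name "$name" ;
    fuseki:serviceQuery "sparql" ;
    fuseki:serviceUpdate "update" ;
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:dataset <#dataset> .

<#dataset> rdf:type tdb:DatasetTDB ;
    tdb:location "DB" .
EOF
}

# Steps (1) to (5): start a container, copy the config in, commit, restart.
bake_image() {
    cid=$(docker run -d stain/jena-fuseki)        # -d prints the container ID
    docker container cp config.ttl "$cid":/config.ttl
    docker container commit "$cid" stackoverflow/jena-fuseki:latest
    docker stop "$cid"
    docker run -p 3030:3030 -it stackoverflow/jena-fuseki \
        ./fuseki-server --config=/config.ttl
}

write_config "$DATASET_NAME" config.ttl

# Only touch Docker when explicitly asked to.
if [ "${RUN_DOCKER:-0}" = "1" ]; then
    bake_image
fi
```

Saved under a name of your choice, say make-dataset.sh, it could be run as DATASET_NAME=db RUN_DOCKER=1 sh make-dataset.sh. Without RUN_DOCKER=1 it only writes config.ttl, which is handy for checking the generated file before building anything.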