python validation rdf sax redland

Validating RDF files using Raptor or SAX


Given an RDF file, I want to write a Python script that validates the file and reports if it is in the wrong format. How do I do this with Raptor? Or with SAX, or is there any other library? I had no luck with w3.


Solution

  • You have two options with raptor:

    Option 1: Use the rapper command-line tool, which is very fast. The Python function below wraps the command. The -c option tells rapper to only count the triples. The lang parameter specifies the RDF format (ntriples, rdfxml, turtle, ...). The function checks the return code and raises an exception if anything went wrong.

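    import subprocess

    def rapper_count(f, lang):
        # -i selects the input syntax, -c only counts the triples
        p = subprocess.Popen(["rapper", "-i", lang, "-c", f],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                             universal_newlines=True)
        output, err = p.communicate()
        if p.returncode != 0:
            raise Exception("Error parsing with rapper\n%s" % err)
        # rapper writes its "... returned N triples" summary to stderr
        return int(err.split()[-2])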
    

    Option 2: Use the Redland Python language bindings. Something like the following would work:

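    import RDF

    test_file = "/some/file"

    uri = RDF.Uri(string="file:" + test_file)

    # Pick the parser that matches the file's syntax, e.g. "turtle" or "rdfxml"
    parser = RDF.Parser(name="turtle")
    if parser is None:
        raise Exception("Failed to create RDF.Parser raptor")

    # Iterating over the stream forces the whole file to be parsed
    count = 0
    for s in parser.parse_as_stream(uri, uri):
        count = count + 1

    print("Parsing added %d statements" % count)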
    

    This code is taken from example.py; check it out for more examples.
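Returning to Option 1, here is a minimal sketch of how the rapper wrapper could be turned into a validator that reports a message instead of raising. The names validate_rdf and parse_triple_count, and the run parameter (a hook for swapping in a fake runner), are hypothetical and not part of Raptor. The sketch assumes that rapper exits with a non-zero status on a parse error and writes a "returned N triples" summary to stderr, as the function above also relies on.

```python
import re
import subprocess


def parse_triple_count(stderr_text):
    """Return N from a 'returned N triple(s)' summary line, or None if absent.

    Assumption: rapper prints such a summary on stderr when run with -c.
    """
    match = re.search(r"returned (\d+) triples?", stderr_text)
    return int(match.group(1)) if match else None


def validate_rdf(path, lang="rdfxml", run=subprocess.run):
    """Return (is_valid, message) for the RDF file at `path`.

    `run` is a hypothetical hook so the subprocess call can be replaced.
    """
    try:
        proc = run(["rapper", "-i", lang, "-c", path],
                   stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                   universal_newlines=True)
    except FileNotFoundError:
        # The rapper executable is not on the PATH
        return False, "rapper executable not found"
    if proc.returncode != 0:
        # Assumption: rapper's diagnostics are on stderr
        return False, proc.stderr.strip()
    return True, "parsed %s triples" % parse_triple_count(proc.stderr)
```

Calling validate_rdf("/some/file", "turtle") then gives a (True, ...) or (False, ...) pair that a script can print as its comment on the file.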