
How to add Apache Any23 RDF Statements to Apache Jena?


Basically, I use the Any23 distiller to extract RDF statements from files with embedded RDFa (the files were created by DBpedia Spotlight using the xhtml+xml output option). With the Any23 RDFa distiller I can extract the RDF statements (I also tried Java-RDFa, but I could only extract the prefixes!). However, when I try to pass the statements to a Jena model and print the result to the console, nothing is printed!

This is the code I am using:

File myFile = new File("T1");
Any23 runner= new Any23();

DocumentSource source = new FileDocumentSource(myFile); 
ByteArrayOutputStream outA = new ByteArrayOutputStream();
InputStream decodedInput=new ByteArrayInputStream(outA.toByteArray()); //convert the output stream to input so i can pass it to jena model
TripleHandler writer = new NTriplesWriter(outA);

try {
    runner.extract(source, writer);
} finally {
    writer.close();
}

String ttl = outA.toString("UTF-8");
System.out.println(ttl);
System.out.println();
System.out.println();

Model model = ModelFactory.createDefaultModel();
model.read(decodedInput, null, "N-TRIPLE");

model.write(System.out, "TURTLE"); // prints nothing!  

Can anyone tell me what I have done wrong? Probably multiple things!
Is there an easy way to extract the subjects of the RDF statements directly from Any23 (bypassing Jena)? As I am quite inexperienced in programming, any help would be really appreciated!


Solution

  • You are calling

    InputStream decodedInput=new ByteArrayInputStream(outA.toByteArray()) ;
    

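    before calling Any23 to extract the triples. outA.toByteArray() returns a copy of the bytes written so far, so at that point the array is empty, and decodedInput never sees the triples that are written to outA later.

    Move this line after the try/finally block, once the writer has been closed.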
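
    To see why the order matters, here is a minimal sketch that uses only java.io and leaves Any23 and Jena out entirely. The class and method names (SnapshotOrderDemo, snapshotBeforeWrite, snapshotAfterWrite, drain) and the example triple are made up for illustration. A plain write to the ByteArrayOutputStream stands in for the extraction step, and draining the input stream stands in for model.read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that toByteArray() copies the buffer
// as it is at the moment of the call.
class SnapshotOrderDemo {

    // Reads everything left in the stream and returns it as a UTF-8 string.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            buf.write(b);
        }
        return buf.toString("UTF-8");
    }

    // Same order as in the question: the input stream is created first,
    // the data is written afterwards.
    static String snapshotBeforeWrite(String data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        InputStream in = new ByteArrayInputStream(out.toByteArray()); // copy of an empty buffer
        out.write(data.getBytes(StandardCharsets.UTF_8));             // stands in for the extraction
        return drain(in);
    }

    // Corrected order: write first, then take the copy.
    static String snapshotAfterWrite(String data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(data.getBytes(StandardCharsets.UTF_8));             // stands in for the extraction
        InputStream in = new ByteArrayInputStream(out.toByteArray()); // copy now holds the data
        return drain(in);
    }

    public static void main(String[] args) throws IOException {
        // Made-up N-Triples line, used only as sample data.
        String triple = "<http://example.org/s> <http://example.org/p> \"o\" .\n";
        System.out.println("before: [" + snapshotBeforeWrite(triple) + "]");
        System.out.println("after:  [" + snapshotAfterWrite(triple) + "]");
    }
}
```

    In your code the same reordering should apply: create decodedInput from outA.toByteArray() only after runner.extract(...) has run and writer.close() has been called, and then pass it to model.read.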