
Loading an ontology from string in Python


We have a PySpark pair RDD that stores the path of each .owl file as the key and the file contents as the value.

I want to carry out reasoning using Owlready2. To load an ontology from an OWL file, the get_ontology() function is used. However, that function expects an IRI (a sort of URL) pointing to the file, whereas I have the file contents as a str in Python.

Is there a way to load the ontology directly from the string?

I have tried the following:


Solution

  • Answering my own question.

    The load() function in Owlready2 accepts a few more arguments that I could not find mentioned in the documentation. They can be seen in the function definitions in the package source.

    Quoting from there, the function signature is def load(self, only_local = False, fileobj = None, reload = False, reload_if_newer = False, **args).

    We can see that a fileobj can also be passed, which is None by default. Further, the line fileobj = open(f, "rb") tells us that the file needs to be read in binary mode.

    Taking all this into consideration, the following code worked for our situation:

    from io import BytesIO  # to create a file-like object
    from owlready2 import get_ontology

    my_str = RDDList[1][1]  # the pair RDD cell with the string data
    my_str_as_bytes = my_str.encode("utf-8")  # load() expects a binary file object
    fobj = BytesIO(my_str_as_bytes)

    # The path is insignificant; the contents are read from fileobj.
    abox = get_ontology("some-random-path").load(fileobj=fobj)
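
For reuse across many (path, contents) pairs, the steps above could be wrapped in a small helper. This is only a sketch: the names to_binary_fileobj and load_ontology_from_string are hypothetical, UTF-8 is assumed as the encoding, and the only Owlready2 calls used are get_ontology() and load(fileobj=...) as quoted above.

```python
from io import BytesIO


def to_binary_fileobj(owl_text, encoding="utf-8"):
    """Wrap ontology text in an in-memory binary file-like object.

    The encoding is an assumption; adjust it if the files are not UTF-8.
    """
    return BytesIO(owl_text.encode(encoding))


def load_ontology_from_string(iri, owl_text):
    """Load an ontology from a str through the fileobj argument of load().

    `iri` only names the ontology; nothing is read from that location.
    """
    # Imported lazily so to_binary_fileobj() stays usable without Owlready2.
    from owlready2 import get_ontology

    return get_ontology(iri).load(fileobj=to_binary_fileobj(owl_text))
```

With the pair RDD from the question, each element could then be loaded as load_ontology_from_string(path, contents). Passing the file path as the IRI is a choice on my part rather than a requirement; as far as I can tell, Owlready2 looks ontologies up by IRI, so a distinct name per file seems safer than reusing one placeholder.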