avro avro-tools

Converting an AVDL file into something Apache's avro python package can parse


I would like to take an .avdl file, parse it in Python, and make use of the information it contains from within Python.

According to the documentation, Apache's python package does not handle .avdl files. I need to use their avro-tools to convert the .avdl file into something it does know how to parse.

According to the documentation at https://avro.apache.org/docs/current/idl.html, I can convert a .avdl file into a .avpr file with the following command:

java -jar avro-tools.jar idl src/test/idl/input/namespaces.avdl /tmp/namespaces.avpr

I ran my .avdl file through avro-tools, and it produced an .avpr file.

What is unclear is how I can use the python package to interpret this data. I tried something simple...

import avro.schema
schema = avro.schema.parse(open("my.avpr", "rb").read())

but that generates the error:

SchemaParseException: No "type" property:

I believe that avro.schema.parse is designed to parse .avsc files (?). However, it is unclear how I can use avro-tools to convert my .avdl into .avsc. Is that possible?

I am guessing there are many pieces I am missing, and I do not (yet) quite understand what the purpose of each of these files is.

It does appear that an .avpr is a JSON file (?) so I can just read and interpret it myself, but I was hoping that there would be a python package that would assist me in navigating the data.
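For reference, reading the file by hand could look roughly like the sketch below. It assumes the .avpr follows the Avro protocol JSON layout, with top-level "protocol", "types" and "messages" keys; the SAMPLE_AVPR text and the named_types helper are hypothetical stand-ins, not output from a real run. The top-level object in that layout has no "type" key, which would be consistent with the error above.

```python
import json

# Hypothetical protocol text standing in for the contents of a real .avpr file.
SAMPLE_AVPR = """
{
  "protocol": "Example",
  "namespace": "org.example",
  "types": [
    {"type": "record", "name": "User",
     "fields": [{"name": "id", "type": "long"},
                {"name": "email", "type": "string"}]},
    {"type": "enum", "name": "Status", "symbols": ["ACTIVE", "INACTIVE"]}
  ],
  "messages": {}
}
"""


def named_types(avpr_text):
    """Return {type name: definition} for every named type in a protocol."""
    protocol = json.loads(avpr_text)
    # "types" is optional in a protocol, so fall back to an empty list.
    return {t["name"]: t for t in protocol.get("types", [])}


types = named_types(SAMPLE_AVPR)
for name, definition in types.items():
    # Each entry under "types" is an ordinary schema with its own "type" key.
    print(name, definition["type"])
```

Each entry under "types" is itself a schema definition, so it can be inspected as a plain dict this way.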

Can anyone provide some insights into this? Thank you.


Solution

  • The answer is to use the idl2schemata command of avro-tools.jar, giving it an output directory to which it can write the .avsc files. The .avsc files can then be read by the Avro Python package.

    For example:

    java -jar avro-tools.jar idl2schemata src/test/idl/input/namespaces.avdl /tmp/
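Once the .avsc files exist, picking them up from Python might look like the sketch below. It uses only the standard library and assumes idl2schemata wrote its .avsc files directly into the output directory; load_avsc_dir is a hypothetical helper name, not part of the avro package.

```python
import json
from pathlib import Path


def load_avsc_dir(directory):
    """Load every .avsc file in `directory` into a dict keyed by file stem."""
    schemas = {}
    for path in sorted(Path(directory).glob("*.avsc")):
        with open(path, "r", encoding="utf-8") as handle:
            # An .avsc file is plain JSON, so json.load is enough to inspect it.
            schemas[path.stem] = json.load(handle)
    return schemas


# With the avro package installed, the text of each file can instead be
# handed to the parser used in the question, for example:
#     import avro.schema
#     schema = avro.schema.parse(Path("/tmp/SomeRecord.avsc").read_text())
# (SomeRecord.avsc is a placeholder file name.)
```

Calling load_avsc_dir("/tmp/") after the command above would return one dict per generated schema file, which can then be navigated like any other JSON data.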