What I would like to be able to do is take an .avdl file and parse it into python. I would like to make use of the information from within python.
According to the documentation, Apache's python package does not handle .avdl files. I need to use their avro-tools to convert the .avdl file into something it does know how to parse.
According to the documentation at https://avro.apache.org/docs/current/idl.html, I can convert a .avdl file into a .avpr file with the following command:
java -jar avro-tools.jar idl src/test/idl/input/namespaces.avdl /tmp/namespaces.avpr
I ran my .avdl file through avro-tools, and it produced an .avpr file.
What is unclear is how I can use the python package to interpret this data. I tried something simple...
schema = avro.schema.parse(open("my.avpr", "rb").read())
but that generates the error:
SchemaParseException: No "type" property:
I believe that avro.schema.parse is designed to parse .avsc files (?). However, it is unclear how I can use avro-tools to convert my .avdl into .avsc. Is that possible?
I am guessing there are many pieces I am missing, and I do not quite understand (yet) what the purpose of all of these files is.
It does appear that an .avpr is a JSON file (?) so I can just read and interpret it myself, but I was hoping that there would be a python package that would assist me in navigating the data.
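For reference, reading the .avpr by hand could look roughly like the sketch below. It uses only the standard json module and assumes the file is a protocol document with a top-level "types" list and no top-level "type" key, which would also explain the SchemaParseException above. The sample protocol and the protocol_types helper are made up for illustration and are not part of the avro package.

```python
import json

# Hypothetical protocol document, standing in for the real my.avpr.
SAMPLE_AVPR = """
{
  "protocol": "Demo",
  "namespace": "com.example",
  "types": [
    {"type": "enum", "name": "Status", "symbols": ["OK", "FAILED"]},
    {"type": "record", "name": "Greeting",
     "fields": [{"name": "message", "type": "string"},
                {"name": "status", "type": "Status"}]}
  ],
  "messages": {}
}
"""


def protocol_types(avpr_text):
    """Map the full name of each named type in a protocol to its schema dict."""
    protocol = json.loads(avpr_text)
    default_namespace = protocol.get("namespace")
    types = {}
    for schema in protocol.get("types", []):
        name = schema["name"]
        namespace = schema.get("namespace", default_namespace)
        # A name that already contains a dot is treated as a full name.
        if "." not in name and namespace:
            name = f"{namespace}.{name}"
        types[name] = schema
    return types


for full_name, schema in protocol_types(SAMPLE_AVPR).items():
    print(full_name, schema["type"])
```

This only gives plain dicts and lists, so anything like resolving the reference from Greeting to Status would still have to be done by hand.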
Can anyone provide some insights into this? Thank you.
The answer is to use the idl2schemata command with avro-tools.jar, providing it with an output directory to which it can write the .avsc files. The .avsc files can then be read by the Avro Python package.
For example:
java -jar avro-tools.jar idl2schemata src/test/idl/input/namespaces.avdl /tmp/
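Once the .avsc files exist, loading them from Python might look like the sketch below. It assumes the output directory holds one .avsc file per named type; the load_schemata helper is a made-up name, not part of the avro package. By default it parses with json.loads so that it runs without avro installed.

```python
import json
from pathlib import Path


def load_schemata(directory, parse=json.loads):
    """Return {file stem: parsed schema} for every .avsc file in `directory`.

    `parse` is any callable that takes the file's JSON text; the default
    returns plain dicts.
    """
    schemata = {}
    for path in sorted(Path(directory).glob("*.avsc")):
        schemata[path.stem] = parse(path.read_text(encoding="utf-8"))
    return schemata
```

To get Avro schema objects instead of dicts, pass the parser from the question, e.g. load_schemata("/tmp/", parse=avro.schema.parse). In the older avro-python3 distribution the function is spelled avro.schema.Parse. If one schema refers to a type defined in another file, parsing the files independently like this may not resolve that reference.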