java avro maven-plugin idl avro-tools

Avro Maven Plugin idl-protocol Execution Failed After Upgrading to Avro 1.12.0


After upgrading to Apache Avro 1.12.0, I encountered the following error while running the Avro Maven plugin:

line 65:8 token recognition error at: '_'
line 65:10 token recognition error at: '_'

[ERROR] Failed to execute goal org.apache.avro:avro-maven-plugin:1.12.0:idl-protocol (idl) on project ....: 
Execution idl of goal org.apache.avro:avro-maven-plugin:1.12.0:idl-protocol failed: 
line 65:9 extraneous input '6' expecting {DocComment, 'protocol', 'namespace', 'import', 'idl', 'schema', 'enum', 'fixed', 'error', 'record', 'array', 'map', 'union', 'boolean', 'int', 'long', 'float', 'double', 'string', 'bytes', 'null', 'true', 'false', 'decimal', 'date', 'time_ms', 'timestamp_ms', 'local_timestamp_ms', 'uuid', 'void', 'oneway', 'throws', '}', '@', IdentifierToken}

Project Setup:

Avro Version: 1.12.0
Avro Maven Plugin Version: 1.12.0

Downgrading Avro to 1.11.x made the error disappear, which suggests a change in Avro 1.12.0’s IDL parsing.

More specific logs after running mvn clean install -X:

Caused by: org.apache.maven.plugin.PluginExecutionException: Execution idl of goal org.apache.avro:avro-maven-plugin:1.12.0:idl-protocol failed: line 65:9 extraneous input '6' expecting {DocComment, 'protocol', 'namespace', 'import', 'idl', 'schema', 'enum', 'fixed', 'error', 'record', 'array', 'map', 'union', 'boolean', 'int', 'long', 'float', 'double', 'string', 'bytes', 'null', 'true', 'false', 'decimal', 'date', 'time_ms', 'timestamp_ms', 'local_timestamp_ms', 'uuid', 'void', 'oneway', 'throws', '}', '@', IdentifierToken}

Has Avro 1.12.0 introduced stricter rules for IDL parsing that could cause this error? What could be causing the token recognition error at: '_'? How can I identify the specific part of my .avdl files that is incompatible with Avro 1.12.0?

Here is an example enum from my schema:

enum Gear {
        NEUTRAL,
        REVERSE,
        PARK,
        GEAR_1,
        GEAR_2,
        GEAR_3,
        GEAR_4,
        GEAR_5,
        GEAR_6,
        GEAR_7,
        GEAR_8,
        GEAR_9,
        INVALID
    }

Will this be a problem? Any ideas on how to solve it?


Solution

  • I found the issue:

    After upgrading to Apache Avro 1.12.0, I started encountering multiple syntax errors while processing my .avdl files using avro-tools-1.12.0.jar. Initially, I couldn't locate the issue because the error logs didn't specify the file names.

    Cause of the Issue: Stricter Parsing in Avro 1.12.0

    How I Found the Problematic Files

    I was using this command to convert .avdl files to .avsc:

    java -jar avro-tools-1.12.0.jar idl2schemata ./src/main/avro/dynamic/usecases/...
    

    The error log showed issues like:

    line 65:8 token recognition error at: '_'
    line 65:10 token recognition error at: '_'
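
Because the message gives a line and column but no file name, one quick way to narrow things down is to print that line from every .avdl file and look for an underscore near the reported column. The following is a minimal sketch; the function name `print_line_in_avdl_files` is made up for illustration, and it assumes the reported line number refers to the file that actually contains the error rather than to a file that imports it.

```shell
#!/bin/bash

# Hypothetical helper: print line <line-number> of every .avdl file under
# <root-dir> as "file:line: content".
# Usage: print_line_in_avdl_files <root-dir> <line-number>
print_line_in_avdl_files() {
    # FNR is awk's per-file line counter, so it restarts at 1 for each file.
    find "$1" -name '*.avdl' \
        -exec awk -v n="$2" 'FNR == n { print FILENAME ":" FNR ": " $0 }' {} +
}
```

For the log above, this would be called as `print_line_in_avdl_files ./src/main/avro 65`. Files shorter than 65 lines simply produce no output.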
    

    To locate the faulty files, I ran this script, which checks every .avdl file recursively and prints each file name before its errors:

    #!/bin/bash
    
    find ./src/main/avro -name "*.avdl" | while IFS= read -r file; do
        echo "Checking: $file"
        java -jar ~/.m2/repository/org/apache/avro/avro-tools/1.12.0/avro-tools-1.12.0.jar idl2schemata "$file"
    done
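
With many files, the output of that loop can get long. A possible variant prints only the files that fail. This is a sketch under two assumptions: the helper name `check_avdl_files` is invented here, and the checker command is assumed to exit with a non-zero status when it cannot parse a file (worth confirming for your avro-tools version).

```shell
#!/bin/bash

# Hypothetical helper (not part of avro-tools): run a checker command on each
# .avdl file under a directory and print only the files it rejects.
# Usage: check_avdl_files <root-dir> <command> [args...]
check_avdl_files() {
    avdl_root="$1"
    shift
    find "$avdl_root" -name '*.avdl' | sort | while IFS= read -r file; do
        # Assumption: the checker exits non-zero on a parse error.
        # stdin is redirected so the checker cannot consume the file list.
        if ! "$@" "$file" </dev/null >/dev/null 2>&1; then
            echo "FAILED: $file"
        fi
    done
}
```

It could be invoked with the same tool as above, for example `check_avdl_files ./src/main/avro java -jar ~/.m2/repository/org/apache/avro/avro-tools/1.12.0/avro-tools-1.12.0.jar idl2schemata`. Rerunning the tool on just the reported files then shows the detailed error messages.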
    

    What Was Wrong in the .avdl File?

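    The issue in my case was invalid enum symbol naming. The Avro 1.12.0 IDL parser rejects enum symbols that start with an underscore ('_'), which is why the log reports token recognition errors at '_'. Symbols that contain an underscore elsewhere, such as GEAR_1, are not affected.

    Example of a problematic .avdl enum: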

    enum Status {
        _6_PENDING,        // ❌ Invalid (cannot start with '_')
        PENDING_6          // ✅ Valid
    }
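
To find other names with the same problem without running the parser, a plain text search may be enough. The sketch below is a heuristic, not a parser: the function name `find_leading_underscores` is made up, and the pattern can also match underscores inside comments or string literals, so treat its output as a list of candidates to review.

```shell
#!/bin/bash

# Hypothetical helper: list lines in .avdl files that contain a name starting
# with an underscore, printed as "file:line:content".
# Usage: find_leading_underscores <root-dir>
find_leading_underscores() {
    # Match '_' at the start of a line or right after a character that cannot
    # be part of a name, so GEAR_1 and PENDING_6 are not reported.
    find "$1" -name '*.avdl' \
        -exec grep -nHE '(^|[^[:alnum:]_])_[[:alnum:]_]*' {} +
}
```

Running `find_leading_underscores ./src/main/avro` on the enum above would report the `_6_PENDING` line but not the `PENDING_6` line.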
    

    Takeaways