org.apache.avro.AvroTypeException: Unknown union branch


I'm using this Avro schema:

prices-state.avsc

{
    "namespace": "com.company.model",
    "name": "Product",
    "type": "record",
    "fields": [
        {
            "name": "product_id",
            "type": "string"
        },
        {
            "name": "sale_prices",
            "type": {
                "name": "sale_prices",
                "type": "record",
                "fields": [
                    {
                        "name": "default",
                        "type": {
                            "name": "default",
                            "type": "record",
                            "fields": [
                                {
                                    "name": "order_by_item_price_by_item",
                                    "type": [
                                        "null",
                                        {
                                            "name": "markup_strategy",
                                            "type": "record",
                                            "fields": [
                                                {
                                                    "name": "type",
                                                    "type": {
                                                        "name": "type",
                                                        "type": "enum",
                                                        "symbols": ["margin", "sale_price"]
                                                    }
                                                }
                                            ]
                                        }
                                    ]
                                },
                                {"name": "order_by_item_price_by_weight", "type": ["null", "string"]},
                                {"name": "order_by_weight_price_by_weight", "type": ["null", "string"]}
                            ]
                        }
                    }
                ]
            }
        }
    ]
}

It validates properly in an online schema validator, so I'm assuming the schema is valid.

I'm having issues building a JSON file that should then be encoded using the above schema.

I'm using this JSON for some testing:

test.json

{
    "product_id": "123",
    "sale_prices": {
        "default": {
            "order_by_item_price_by_item": {
                "markup_strategy": {
                    "type": {"enum": "margin"}
                }
            },
            "order_by_item_price_by_weight": null,
            "order_by_weight_price_by_weight": null
        }
    }
}

When running java -jar avro-tools-1.8.2.jar fromjson --schema-file prices-state.avsc test.json I get:

Exception in thread "main" org.apache.avro.AvroTypeException: Unknown union branch markup_strategy

I read that, because of Avro's JSON encoding, values of union types have to be wrapped in an object that names the branch, so I tried different combinations, but none of them worked.


Solution

  • It was a namespace resolution issue. Take this simplified schema as an example:

    test.avsc

    {
        "name": "Product",
        "type": "record",
        "fields": [
            {
                "name": "order_by_item_price_by_item",
                "type": [
                    "null",
                    {
                        "type": "record",
                        "name": "markup_strategy",
                        "fields": [{
                            "name": "type",
                            "type": {
                                "name": "type",
                                "type": "enum",
                                "symbols": ["margin", "sale_price"]
                            }
                        }]
                    }
                ]
            }
        ]
    }
    

    With the following JSON, it validates just fine:

    test.json

    {
        "order_by_item_price_by_item": {
            "markup_strategy": {
                "type": "margin"
            }
        }
    }
    

    Now, if you were to add a namespace at the top of your schema, like this:

    test.avsc

    {
        "namespace": "test",
        "name": "Product",
        "type": "record",
        "fields": [
        ...
    

    then you would need to change your test.json as well, or you'll get:

    Exception in thread "main" org.apache.avro.AvroTypeException: Unknown union branch markup_strategy

    final_test.json

    {
        "order_by_item_price_by_item": {
            "test.markup_strategy": {
                "type": "margin"
            }
        }
    }
    

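    So, when you JSON-encode a value inside a union and the branch is a named Avro type (record, fixed or enum), the key that identifies the branch must be the type's full name, that is, its user-specified name prefixed with the namespace. A nested named type that declares no namespace of its own inherits the namespace of the enclosing type, which is why markup_strategy becomes test.markup_strategy here.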

    See the Avro specification for more on namespaces and JSON encoding.
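
To apply the same rule to the schema in the question (namespace com.company.model), here is a minimal Python sketch of how the union branch key can be derived. The helper `union_branch_key` is a hypothetical name made up for illustration and is not part of any Avro library; it only mirrors the naming rule described above and uses nothing but the standard library.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED_TYPES = {"record", "enum", "fixed"}


def union_branch_key(branch, enclosing_namespace=None):
    """Hypothetical helper: key used to wrap a non-null union value in Avro's JSON encoding.

    Note that a null value is written as a bare JSON null, without a wrapper.
    """
    if isinstance(branch, str):
        # Primitive type name, or a reference to an already defined named type.
        if branch in PRIMITIVES or "." in branch:
            return branch
        return f"{enclosing_namespace}.{branch}" if enclosing_namespace else branch
    if branch["type"] in NAMED_TYPES:
        name = branch["name"]
        if "." in name:
            return name  # the name is already a full name
        # No namespace on the type itself: inherit the enclosing one.
        namespace = branch.get("namespace", enclosing_namespace)
        return f"{namespace}.{name}" if namespace else name
    return branch["type"]  # unnamed complex types such as "array" or "map"


# The record branch from the question's union (fields omitted for brevity).
markup_strategy = {"type": "record", "name": "markup_strategy", "fields": []}
key = union_branch_key(markup_strategy, "com.company.model")

# The test document from the question, with the branch key corrected.
doc = {
    "product_id": "123",
    "sale_prices": {
        "default": {
            # The enum field is not inside a union, so it is a plain string.
            "order_by_item_price_by_item": {key: {"type": "margin"}},
            "order_by_item_price_by_weight": None,
            "order_by_weight_price_by_weight": None,
        }
    },
}

print(json.dumps(doc, indent=4))
```

The key comes out as com.company.model.markup_strategy, so that is what test.json should use in place of markup_strategy. The original attempt also wrapped the enum as {"enum": "margin"}; since that field is not a union, the plain string "margin" is what the schema expects.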