I need to write an Avro schema for the following data. The exposure field is an array of arrays, each containing 3 numbers.
{
"Response": {
"status": "",
"responseDetail": {
"request_id": "Z618978.R",
"exposure": [
[
372,
20000000.0,
31567227140.238808
],
[
373,
480000000.0,
96567227140.238808
],
[
374,
23300000.0,
251567627149.238808
]
],
"product": "ABC"
}
}
}
So I came up with a schema like the following:
{
"name": "Response",
"type":{
"name": "algoResponseType",
"type": "record",
"fields":
[
{"name": "status", "type": ["null","string"]},
{
"name": "responseDetail",
"type": {
"name": "responseDetailType",
"type": "record",
"fields":
[
{"name": "request_id", "type": "string"},
{
"name": "exposure",
"type": {
"type": "array",
"items":
{
"name": "single_exposure",
"type": {
"type": "array",
"items": "string"
}
}
}
},
{"name": "product", "type": ["null","string"]}
]
}
}
]
}
}
When I tried to register the schema, I got the following error: TypeError: unhashable type: 'dict', which means a dict was used somewhere a hashable value is required (for example, as a dictionary key).
Traceback (most recent call last):
File "sa_publisher_main4test.py", line 28, in <module>
schema_registry_client)
File "/usr/local/lib64/python3.6/site-packages/confluent_kafka/schema_registry/avro.py", line 175, in __init__
parsed_schema = parse_schema(schema_dict)
File "fastavro/_schema.pyx", line 71, in fastavro._schema.parse_schema
File "fastavro/_schema.pyx", line 204, in fastavro._schema._parse_schema
TypeError: unhashable type: 'dict'
Can anyone help point out what is causing the error?
There are a few issues.
First, at the very top level of your schema, you have the following:
{
"name": "Response",
"type": {...}
}
But this isn't right. The top level should be a record type with a field called Response. So it should look like this:
{
"name": "Response",
"type": "record",
"fields": [
{
"name": "Response",
"type": {...}
}
]
}
The second problem is that for the array of arrays, you currently have the following:
{
"name":"exposure",
"type":{
"type":"array",
"items":{
"name":"single_exposure",
"type":{
"type":"array",
"items":"string"
}
}
}
}
But instead it should look like this:
{
"name":"exposure",
"type":{
"type":"array",
"items":{
"type":"array",
"items":"string"
}
}
}
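To see why the extra "name"/"type" wrapper produces that particular error, here is a rough sketch. It is a simplified stand-in, not fastavro's actual code: KNOWN_TYPES and is_known_type are hypothetical names, and the assumption is only that the parser looks up the value of "type" in a set or dict of type names, which requires that value to be hashable.

```python
# Simplified stand-in for a schema parser's type lookup (NOT fastavro's real code).
KNOWN_TYPES = {"null", "string", "int", "long", "float", "double", "array", "record"}


def is_known_type(schema):
    # A membership test on a set hashes the value being looked up,
    # so the value of "type" must be hashable (a string, not a dict).
    return schema["type"] in KNOWN_TYPES


# Correct form: "type" is the string "array".
good_items = {"type": "array", "items": "string"}

# Form from the question: "type" is itself a dict wrapped in a named object.
bad_items = {
    "name": "single_exposure",
    "type": {"type": "array", "items": "string"},
}

print(is_known_type(good_items))

error_message = None
try:
    is_known_type(bad_items)
except TypeError as exc:
    # The lookup fails because a dict cannot be hashed.
    error_message = str(exc)
    print(error_message)
```

The top-level problem has the same shape: there, too, the value of "type" is a dict where a type name is expected.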
After fixing those, the schema can be parsed, but your data contains an array of arrays of numbers while your schema says it should be an array of arrays of strings. Therefore either the inner "items" type in the schema needs to be changed to a numeric type (double is the safer choice for values of this size, since Avro's float is only 32-bit), or the data needs to be converted to strings.
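If you keep "items": "string" in the schema, the numbers have to be converted before validating or serializing. A minimal sketch of that conversion follows; exposure_to_strings is a hypothetical helper name, and it assumes every row contains only ints and floats.

```python
def exposure_to_strings(exposure):
    """Convert every number in an array of arrays to its string form."""
    return [[str(value) for value in row] for row in exposure]


# Numeric exposure rows as they appear in the question's data.
exposure = [
    [372, 20000000.0, 31567227140.238808],
    [373, 480000000.0, 96567227140.238808],
    [374, 23300000.0, 251567627149.238808],
]

as_strings = exposure_to_strings(exposure)
print(as_strings[0])
```

The other option is to leave the data as numbers and change the inner "items" from "string" to "double" in the schema; which one is right depends on what the consumers of the topic expect.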
For reference, here's an example script that works after fixing those issues:
import fastavro
s = {
"name":"Response",
"type":"record",
"fields":[
{
"name":"Response",
"type": {
"name":"algoResponseType",
"type":"record",
"fields":[
{
"name":"status",
"type":[
"null",
"string"
]
},
{
"name":"responseDetail",
"type":{
"name":"responseDetailType",
"type":"record",
"fields":[
{
"name":"request_id",
"type":"string"
},
{
"name":"exposure",
"type":{
"type":"array",
"items":{
"type":"array",
"items":"string"
}
}
},
{
"name":"product",
"type":[
"null",
"string"
]
}
]
}
}
]
}
}
]
}
data = {
"Response":{
"status":"",
"responseDetail":{
"request_id":"Z618978.R",
"exposure":[
[
"372",
"20000000.0",
"31567227140.238808"
],
[
"373",
"480000000.0",
"96567227140.238808"
],
[
"374",
"23300000.0",
"251567627149.238808"
]
],
"product":"ABC"
}
}
}
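# parse_schema raises an error if the schema itself is malformed
parsed_schema = fastavro.parse_schema(s)
# validate raises a ValidationError if the data does not match the schema
fastavro.validate(data, parsed_schema)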