I've set up a fairly simple MongoDB Kafka source connector to stream MongoDB's oplog to Kafka. However, in the messages published by the connector, the serialized oplog events do not follow the Extended JSON spec. For instance, a datetime field is represented as:
{"$date": 1597841586927}
whereas the spec says it should be formatted as:
{"$date": {"$numberLong": "1597841586927"}}
Why am I not getting clean extended JSON?
Note: my connector config file looks like this:
{
"name": "mongosource",
"config": {
"connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
"tasks.max": 1,
"connection.uri": "...",
"topic.prefix":"mongosource",
"database": "mydb",
"copy.existing": true,
"change.stream.full.document": "updateLookup"
}
}
The source connector's default JSON formatter is a legacy one (see this issue on the connector's JIRA project).
Starting with version 1.3.0 of the connector, there is a new config option you can add to make it output proper Extended JSON:
"output.json.formatter": "com.mongodb.kafka.connect.source.json.formatter.ExtendedJson"
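For reference, here is a sketch of how the config from the question might look with that option added. It assumes the connector is version 1.3.0 or later and that the option sits alongside the other keys in the "config" object; every other value is copied unchanged from the question, including the elided connection URI.

```json
{
  "name": "mongosource",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "tasks.max": 1,
    "connection.uri": "...",
    "topic.prefix": "mongosource",
    "database": "mydb",
    "copy.existing": true,
    "change.stream.full.document": "updateLookup",
    "output.json.formatter": "com.mongodb.kafka.connect.source.json.formatter.ExtendedJson"
  }
}
```

With this formatter, a datetime field should come out in the {"$date": {"$numberLong": "..."}} form shown in the question instead of the legacy {"$date": 1597841586927} form.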