Tags: mysql, jdbc, elasticsearch, elasticsearch-jdbc-river

JDBC river stops on MapperParsingException


I am using Elasticsearch 1.2.0 with JDBC river 1.2.0.1.

This is the command I use to create the JDBC river:

curl -XPUT 'localhost:9200/_river/tbl_messages/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
    "strategy" : "simple",
    "url" : "jdbc:mysql://localhost:3306/messageDB",
    "user" : "username",
    "password" : "password",
    "sql" : "select messageAlias.id as _id,messageAlias.subject as subject from tbl_messages messageAlias",
    "index" : "MessageDb",
    "type" : "tbl_messages", 
    "maxbulkactions":1000,
    "maxconcurrentbulkactions" : 4,
    "autocommit" : true,
    "schedule" : "0 0-59 0-23 ? * *"    
    }
}'

Index mapping for the subject column:

subject: {
    type: string
}

This table has 2 million records, and the subject field contains arbitrary strings. Sample values include "You're invited ", "{New York:45} We rock!!", and "{Invitation:27}".

My problem is that when the JDBC river encounters a record whose subject contains {anything inside braces}, it throws a parsing exception and the river stalls. It never moves on to index the remaining records.

org.elasticsearch.index.mapper.MapperParsingException: failed to parse [subject]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:418)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:537)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:479)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:515)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:462)
at org.elasticsearch.index.shard.service.InternalIndexShard.prepareIndex(InternalIndexShard.java:394)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:413)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:155)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:534)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:433)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Caused by: org.elasticsearch.ElasticsearchIllegalArgumentException: unknown property [Inivitation]
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateFieldForString(StringFieldMapper.java:332)
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:278)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:408)
... 12 more

The only way I have found to proceed is to delete the offending record from the database, clear the data under ES_HOME/data, and recreate the river, and that only works until the river encounters another record in this format.

How can I make the river continue indexing when a few records fail to parse?


Solution

  • This is an Elasticsearch issue, not a river issue. See:

    https://github.com/jprante/elasticsearch-river-jdbc/issues/258

    https://github.com/elasticsearch/elasticsearch/issues/2898
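
    If the braces themselves do not matter for search, one possible stopgap is to find the affected subjects and neutralise the braces before the data reaches Elasticsearch. The Python sketch below is only an illustration and has not been tested against the river. It assumes the failure is triggered by the {...} pattern described in the question, and the helper names has_brace_group and neutralise_braces are hypothetical, not part of the river or of Elasticsearch.

```python
import re

# Matches a brace-delimited group such as "{Invitation:27}" or "{New York:45}".
# Assumption: this is the pattern that triggers the MapperParsingException.
BRACE_GROUP = re.compile(r"\{[^{}]*\}")


def has_brace_group(subject: str) -> bool:
    """Return True if the subject contains a {...} group."""
    return BRACE_GROUP.search(subject) is not None


def neutralise_braces(subject: str) -> str:
    """Replace curly braces with parentheses so the value no longer looks like an object."""
    return subject.replace("{", "(").replace("}", ")")


if __name__ == "__main__":
    # Sample values taken from the question.
    samples = ["You're invited ", "{New York:45} We rock!!", "{Invitation:27}"]
    for s in samples:
        print(repr(s), has_brace_group(s), repr(neutralise_braces(s)))
```

    The same replacement could in principle be done in the river's SQL statement instead, so that the source table stays unchanged; whether that avoids the exception would need to be verified on a test index first.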