Tags: python, json, elasticsearch, streamsets

geo_point mapping python and StreamSets fails with Elasticsearch


I have this mapping in Elasticsearch:

"mappings": {
    "properties": {
        "fromCoordinates": {"type": "geo_point"},
        "toCoordinates": {"type": "geo_point"},
        "seenCoordinates": {"type": "geo_point"}
    }
}

In Kibana's console, all of the geo_point formats supported by Elasticsearch work without problems, for example:

(lat, lon)

PUT /anindex/_doc/1
{
  "fromCoordinates": {
    "lat": 36.857200622558594,
    "lon": 117.21600341796875
  },
  "toCoordinates": {
    "lat": 22.639299392700195,
    "lon": 113.81099700927734
  },
  "seenCoordinates": {
    "lat": 36.91663,
    "lon": 117.216
  }
}

(lon, lat)

PUT /anindex/_doc/2
{
  "fromCoordinates": [117.21600341796875, 36.857200622558594],
  "toCoordinates": [113.81099700927734, 22.639299392700195],
  "seenCoordinates": [117.216, 36.91663]
}
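Elasticsearch reads the array form of a geo_point as [lon, lat], while the object form uses named lat and lon keys, so the two are easy to mix up when building documents by hand. A minimal sketch of helpers that produce both forms from the same inputs follows; the function names are hypothetical and not part of any Elasticsearch client.

```python
def _check_range(lat, lon):
    """Raise ValueError if the coordinates are outside valid geographic bounds."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")


def geo_point_object(lat, lon):
    """Return a geo_point in object form: {"lat": ..., "lon": ...}."""
    _check_range(lat, lon)
    return {"lat": lat, "lon": lon}


def geo_point_array(lat, lon):
    """Return a geo_point in array form, which is ordered [lon, lat]."""
    _check_range(lat, lon)
    return [lon, lat]
```

Taking (lat, lon) as arguments in both helpers keeps the call sites identical, and the range check catches a swapped pair whenever the longitude value is not a valid latitude.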

But when I try to insert the same data into Elasticsearch through Python, I always get this error:

RequestError(400, 'illegal_argument_exception', 'mapper [fromCoordinates] of different type, current_type [geo_point], merged_type [ObjectMapper]')

In Python, I build the JSON from a dictionary. This is the code, followed by the printed result:

fromCoordinates = {}
fromCoordinates['lat'] = fromLat  
fromCoordinates['lon'] = fromLon 

dataDictionary.update({'fromCoordinates': fromCoordinates , 'toCoordinates': toCoordinates, 'seenCoordinates': seenCoordinates})
print(json.dumps(dataDictionary).encode('utf-8'))
{"fromCoordinates": {"lat": 43.9962005615, "lon": 125.684997559}, 
"toCoordinates": {"lat": 40.080101013183594, "lon": 116.58499908447266}, 
"seenCoordinates": {"lat": 33.62672, "lon": 109.37243}}

and I index it like this:

data = json.dumps(dataDictionary).encode('utf-8')
es.create(index='anindex', doc_type='document', id=0, body=data)

The array version fails as well:

fromCoordinates = [fromLon, fromLat]

This is the json created and printed in python:

{"fromCoordinates": [113.81099700927734, 22.639299392700195], 
  "toCoordinates": [106.8010025024414, 26.53849983215332], 
   "seenCoordinates": [107.46743, 26.34169]}

In this case I get this response:

RequestError: RequestError(400, 'mapper_parsing_exception', 'geo_point expected')

The same error occurs when I send either of the JSON formats shown above to Elasticsearch through StreamSets:

mapper [fromCoordinates] of different type, current_type [geo_point], merged_type [ObjectMapper]

Any ideas?

UPDATE:

GET /anindex/_mapping
{
  "anindex": {
    "mappings": {
      "properties": {
        "fromCoordinates": {"type": "geo_point"},
        "toCoordinates": {"type": "geo_point"},
        "seenCoordinates": {"type": "geo_point"}
      }
    }
  }
}
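To confirm from Python that the index really has the expected geo_point fields, the mapping response can be inspected as a plain dictionary. The sketch below assumes the typeless response shape shown in the update (index name, then "mappings", then "properties"); the helper name is hypothetical.

```python
def geo_point_fields(mapping_response, index):
    """Return the sorted names of top-level geo_point fields for `index`.

    Assumes a typeless mapping response: {index: {"mappings": {"properties": {...}}}}.
    """
    properties = mapping_response[index]["mappings"].get("properties", {})
    return sorted(
        name for name, spec in properties.items()
        if spec.get("type") == "geo_point"
    )
```

With the Python client, the input would typically come from `es.indices.get_mapping(index='anindex')`, assuming `es` is an already configured client.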

SOLUTION:

After the example given by @jzzfs, I realized that the doc_type parameter in es.create(index='anindex', doc_type='document', id=0, body=data) was causing the error. I removed it and it worked. I am still wondering why StreamSets produces the same error, but I'll continue with Python.


Solution

  • If you were using an older version of Elasticsearch (e.g. 6.1) and upgraded to a newer version (e.g. 7.x), you need to remove doc_type from your indexing calls, because mapping types are deprecated in 7.x and were removed in 8.0.

    Old indexing pattern:

    res = es_local.index(index='local-index', doc_type='resource', body=open_doc, id=_id, request_timeout=60)

    New indexing pattern:

    res = es_local.index(index='local-index', body=open_doc, id=_id, request_timeout=60)

    Note: there is no doc_type in the new indexing pattern (this assumes indexing with the Python client).
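Applied to the question's data, the fix could look like the sketch below. It assumes a 7.x Python client whose index method accepts index, id and body keyword arguments, as in the calls above; the wrapper name index_document and the connection URL in the comment are hypothetical. The document is passed as a dictionary, since the client serializes it to JSON itself.

```python
def index_document(es, index, doc_id, document):
    """Index `document` under `doc_id` without passing doc_type.

    `es` is assumed to be an elasticsearch.Elasticsearch client (7.x).
    """
    return es.index(index=index, id=doc_id, body=document)


# Document from the question, with geo_points in object form.
document = {
    "fromCoordinates": {"lat": 43.9962005615, "lon": 125.684997559},
    "toCoordinates": {"lat": 40.080101013183594, "lon": 116.58499908447266},
    "seenCoordinates": {"lat": 33.62672, "lon": 109.37243},
}

# Usage, assuming a reachable cluster (the URL is a placeholder):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# index_document(es, "anindex", 0, document)
```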