I see that the following API performs delete by query in Elasticsearch: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
But I want to do the same with the Elasticsearch bulk API. I can already use bulk to upload docs using
es.bulk(body=json_batch)
but I am not sure how to invoke delete by query using the Python bulk API for Elasticsearch.
Since Elasticsearch has deprecated the delete by query API, I created this Python script using the bindings to do the same thing. First, define an ES connection:
import elasticsearch
es = elasticsearch.Elasticsearch(['localhost'])
Now you can use that connection to run a query for the results you want to delete.
search = es.search(
    q='The Query to ES.',
    index="*logstash-*",
    size=10,
    search_type="scan",
    scroll='5m',
)
Now you can scroll through that query in a loop, generating the bulk request as you go.
scroll_id = search['_scroll_id']
while True:
    try:
        # Get the next page of results.
        scroll = es.scroll(scroll_id=scroll_id, scroll='5m')
    # Scroll throws an error once the scroll context is gone, so catch it and break the loop.
    except elasticsearch.exceptions.NotFoundError:
        break
    # Stop when the scroll returns no more hits (an empty bulk body would be rejected).
    if not scroll['hits']['hits']:
        break
    # Always continue from the most recent scroll ID.
    scroll_id = scroll['_scroll_id']
    # We have results, so initialize the bulk variable.
    bulk = ""
    for result in scroll['hits']['hits']:
        bulk = bulk + '{ "delete" : { "_index" : "' + str(result['_index']) + '", "_type" : "' + str(result['_type']) + '", "_id" : "' + str(result['_id']) + '" } }\n'
    # Finally do the deleting.
    es.bulk(body=bulk)
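The loop above builds each delete line by string concatenation, which produces invalid JSON if an index, type, or ID ever contains a quote or backslash. As a minimal sketch, assuming the hits have the same `_index`, `_type` and `_id` keys used above, the body could instead be built with `json.dumps`. The helper name `build_delete_body` and the sample hits are hypothetical and only for illustration.

```python
import json


def build_delete_body(hits):
    """Build a newline-delimited bulk body that deletes every hit.

    `hits` is a list of search hits, each a dict with `_index`, `_type`
    and `_id` keys, as found in scroll['hits']['hits'].
    (Hypothetical helper, not part of the elasticsearch package.)
    """
    lines = []
    for hit in hits:
        action = {
            "delete": {
                "_index": hit["_index"],
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        # json.dumps escapes quotes and backslashes in the values.
        lines.append(json.dumps(action))
    # Every line of a bulk body, including the last, must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Example with hand-written hits, so no cluster is needed to try it.
sample_hits = [
    {"_index": "logstash-2015.01.01", "_type": "logs", "_id": 'a"1'},
    {"_index": "logstash-2015.01.01", "_type": "logs", "_id": "b2"},
]
body = build_delete_body(sample_hits)
```

Inside the loop this would replace the concatenation, for example `es.bulk(body=build_delete_body(scroll['hits']['hits']))`.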
To use the bulk API you need to ensure two things: