
CloudSearch deleteByQuery


The official Solr Java API has a deleteByQuery operation that deletes every document matching a query. The AWS CloudSearch SDK doesn't seem to have matching functionality. Am I just not seeing the deleteByQuery equivalent, or will we need to roll our own?

Something like this:

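import java.nio.charset.StandardCharsets;
import com.amazonaws.services.cloudsearchdomain.model.*;
import com.amazonaws.util.StringInputStream;
import org.json.JSONArray;   // assuming the org.json library
import org.json.JSONObject;

// Search for the documents to delete (search returns one page of hits at a time).
SearchRequest searchRequest = new SearchRequest();
searchRequest.setQuery(queryString);
searchRequest.setReturn("id,version");
SearchResult searchResult = awsCloudSearch.search(searchRequest);

// Build one delete operation per hit.
JSONArray docs = new JSONArray();
for (Hit hit : searchResult.getHits().getHit()) {
    JSONObject doc = new JSONObject();
    doc.put("id", hit.getId());
    // is version necessary?
    doc.put("version", hit.getFields().get("version").get(0));
    doc.put("type", "delete");
    docs.put(doc);
}

// Upload the batch; the request also expects a content type and length.
String batch = docs.toString();
UploadDocumentsRequest uploadDocumentsRequest = new UploadDocumentsRequest();
uploadDocumentsRequest.setDocuments(new StringInputStream(batch));
uploadDocumentsRequest.setContentType("application/json");
uploadDocumentsRequest.setContentLength((long) batch.getBytes(StandardCharsets.UTF_8).length);
UploadDocumentsResult uploadResult = awsCloudSearch.uploadDocuments(uploadDocumentsRequest);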

Is this reasonable? Is there an easier way?


Solution

  • You're correct that CloudSearch doesn't have an equivalent to deleteByQuery. Your approach looks like the next best thing.

    And no, version is not necessary: it was removed in the CloudSearch 2013-01-01 API (aka v2).
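
For what it's worth, here is a minimal sketch of the batch-building half of a roll-your-own deleteByQuery, without the version field. It uses only the JDK so it runs on its own. The class name DeleteBatch, its methods, the sample ids, and the chunk size of 2 are made up for illustration and are not part of the AWS SDK. In practice you would collect the ids by paging through the search results and pass each batch string to uploadDocuments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeleteBatch {

    /** Escapes a string for use inside a JSON string literal. */
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds one JSON batch of delete operations (no version field) for the given ids. */
    public static String build(List<String> ids) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"type\":\"delete\",\"id\":\"")
              .append(jsonEscape(ids.get(i)))
              .append("\"}");
        }
        return sb.append(']').toString();
    }

    /** Splits the ids into chunks of at most maxPerBatch, so each upload stays small. */
    public static List<List<String>> chunk(List<String> ids, int maxPerBatch) {
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, ids.size());
            chunks.add(new ArrayList<>(ids.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Hypothetical ids, standing in for hit.getId() values from a search.
        List<String> ids = Arrays.asList("doc-1", "doc-2", "doc-3");
        for (List<String> part : chunk(ids, 2)) {
            System.out.println(build(part));
        }
    }
}
```

Chunking matters because CloudSearch limits the size of a single document batch, so a query that matches many documents may need several uploads.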