amazon-web-services elasticsearch authorization opensearch amazon-opensearch

Clear scroll in OpenSearch/ElasticSearch with read only permission


I want to fetch documents from an OpenSearch index with read-only permissions using the Scroll API. I tried these permissions for my role:

indices:data/read/scroll/clear
indices:data/read/scroll
read

and

read

But when I run (using the Python SDK):

self.client.clear_scroll(scroll_id=scroll_ids_str)

when the opensearch package tries to run

return self.transport.perform_request(
            "DELETE", "/_search/scroll", params=params, headers=headers, body=body
        )

I get this authorization error as a warning:

AuthorizationException(403, 'security_exception', {'error': {'root_cause': [
  {'type': 'security_exception', 'reason': 'no permissions for [indices:data/read/scroll/clear] and User [name=arn:aws:iam::<AWSID>:user/<NAME>, backend_roles=[], requestedTenant=null]'}], 
 'type': 'security_exception', 'reason': 'no permissions for [indices:data/read/scroll/clear] and User [name=arn:aws:iam::<AWSID>:user/<NAME>, backend_roles=[], requestedTenant=null]'}, 'status': 403})

Note that I explicitly added the suggested permission to my role. Also, I can successfully fetch all the desired documents, but I do not like that the warning suggests that I am not cleaning up some resources.

QUESTION: How do I successfully delete the scroll context with read only permissions?

Version: opensearch-py==2.2.0


Solution

  • Scrolling is a good choice here, because a plain search is otherwise capped at 10,000 results, a limit that can only be raised server-side (the index.max_result_window setting).
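
For reference, here is a minimal sketch of a scroll loop with opensearch-py that always releases its scroll context. It assumes `client` is an already-configured `OpenSearch` instance; the function name `scroll_all`, the page size, and the `2m` keep-alive are illustrative choices of mine, not taken from the question.

```python
def scroll_all(client, index, query, page_size=1000, keep_alive="2m"):
    """Yield every hit matching `query`, then release the scroll context.

    `client` is assumed to be an opensearchpy.OpenSearch instance (or any
    object exposing search, scroll and clear_scroll with the same keywords).
    """
    scroll_id = None
    try:
        # The first page comes from a normal search that opens a scroll context.
        page = client.search(
            index=index,
            body={"query": query},
            scroll=keep_alive,
            size=page_size,
        )
        while True:
            # The scroll id may change between pages, so keep the latest one.
            scroll_id = page.get("_scroll_id", scroll_id)
            hits = page["hits"]["hits"]
            if not hits:
                break
            yield from hits
            page = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
    finally:
        # This is the call that needs indices:data/read/scroll/clear.
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)
```

Putting `clear_scroll` in a `finally` block means the context is released even if the loop fails partway through, rather than lingering until the keep-alive expires.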

    But OS/ES permissions are a bit fiddly to get right, as I'm sure you know already. The OpenSearch Python documentation is also somewhat scattered.

    For reference there's a complete list here: https://opensearch.org/docs/latest/security/access-control/permissions

    In your specific case I came across the following helpful thread:

    https://forum.search-guard.com/t/query-regarding-scroll-and-clear-permission/2026

    So you should be able to add

    indices:data/read/scroll/clear
    

    to your cluster-level permissions rather than your index-level ones. I think that's because the clear-scroll request (DELETE /_search/scroll) is not addressed to any particular index, so despite its indices: prefix it is checked as a cluster-level action. You can test this out in the OS console sandbox, but I can see why you might be nervous about testing DELETE operations there.
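
As a sketch of what such a role could look like, the snippet below builds a role body for the security plugin's REST API, with the scroll actions under `cluster_permissions` and `read` scoped to a named index pattern. The helper `build_read_only_scroll_role`, the role name `read_only_scroll`, and the pattern `my-index-*` are placeholders I made up; whether your user may call the roles endpoint depends on how your domain is set up.

```python
import json


def build_read_only_scroll_role(index_patterns):
    """Return a role body: scroll actions at cluster level, `read` per index."""
    return {
        # Scroll and clear-scroll are granted at cluster level.
        "cluster_permissions": [
            "indices:data/read/scroll",
            "indices:data/read/scroll/clear",
        ],
        # Read access is limited to the listed index patterns (no bare "*").
        "index_permissions": [
            {
                "index_patterns": list(index_patterns),
                "allowed_actions": ["read"],
            }
        ],
    }


role_body = build_read_only_scroll_role(["my-index-*"])
print(json.dumps(role_body, indent=2))

# Assumption: `client` is authenticated as a user allowed to manage roles.
# The role could then be saved with a raw request such as:
# client.transport.perform_request(
#     "PUT", "/_plugins/_security/api/roles/read_only_scroll", body=role_body
# )
```

The same structure can be entered through the Security section of OpenSearch Dashboards if you prefer not to call the REST API directly.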

    Don't forget that, in general, index-level permissions usually also need some (parent) cluster-level permissions, i.e. one level higher.

    Also, in general, OS/ES permissions are very restrictive by default (but include RBAC and multi-tenancy), unless you start putting * everywhere. For the sake of others, this is another hierarchical example: index-level permissions only take effect once you specify which indices they apply to. The granularity even extends down to individual document fields. The corollary is the need for specificity. Don't be tempted to start putting * everywhere! :)

    Have fun and good luck!