
How to search exact index (not index pattern) in elasticsearch eland - updated?


I am reading Elasticsearch data using eland. The documentation is simple and I was able to follow it, but when I connect to an index, eland takes the index name through es_index_pattern, which I understand to be a wildcard pattern (the documentation also describes it as a pattern).

from elasticsearch import Elasticsearch
import eland as ed

# Placeholder host and port; my real values are different
es = Elasticsearch(hosts=[{"host": "myhost", "port": 0000}])

# The search API expects the bool clause under a top-level "query" key
search_body = {
    "query": {
        "bool": {
            "filter": [
                {"exists": {"field": "customer_name"}},
                {"match_phrase": {"city": "chicago"}},
            ]
        }
    }
}

# Success: I get results when I search the index through the "elasticsearch" API.
# I tried this repeatedly and it works every time.
results = es.search(index="my_index", body=search_body)

# Failure: I get no results (only a ReadTimeoutError) when I connect to the
# 'my_index' index through the same Elasticsearch client using eland.
df = ed.DataFrame(es_client=es, es_index_pattern="my_index")

I have to type the error message by hand because I cannot copy it out of the environment I am using. My real host and port are also different.

...
  File ".../elasticsearch/transport.py", line 458, in perform_request
    raise e
  File "......elasticsearch/transport.py", line 419, in perform_request
  File "..... /elasticsearch/connection/http_urllib3.py", line 275, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='myhost', port=0000): Read timed out. (read timeout=10))

I think the search through elasticsearch returns results because it is called with the exact index name, and so it does not time out.

Eland, on the other hand, uses es_index_pattern, so I assume it treats my_index as a wildcard, i.e. *my_index*, and that this is why I run into the ReadTimeoutError.

I looked through the source code for a way to make eland match the index name exactly instead of treating it as a pattern, but I found no such option in either the documentation or the source.

How do I search for an exact index name in eland?


Solution

  • I also posted this on GitHub, but I'll replicate it here:

    Searching an exact index only requires passing the exact index name; no wildcards are added:

    import eland as ed
    from elasticsearch import Elasticsearch
    
    client = Elasticsearch(...)
    
    client.index(index="test", document={"should": "seethis"})
    client.index(index="test1", document={"should": "notseethis"})
    client.index(index="1test", document={"should": "notseethis"})
    client.indices.refresh(index="*test*")
    
    df = ed.DataFrame(client, es_index_pattern="test")
    print(df.to_pandas())
    

    As expected, the output of the above is:

                           should
    SNTTnH4BRC8cqQQMds-V  seethis
    

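    The word "pattern" in the option name doesn't mean that wildcards are added for you. The value is sent to Elasticsearch unchanged as the index expression in the search and index APIs, so a name without wildcard characters matches only that exact index.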
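
To make the difference between an exact name and a wildcard pattern concrete without a running cluster, here is a minimal sketch that approximates index-name resolution with Python's standard fnmatch module. This is only an illustration and an assumption on my part: it is not how eland or Elasticsearch resolves names internally (fnmatch also gives ? and [ ] special meaning, which this example does not use), and the resolve helper is hypothetical. The index names mirror the three indices created in the example above.

```python
from fnmatch import fnmatchcase

# Index names taken from the example above.
indices = ["test", "test1", "1test"]


def resolve(pattern, names):
    """Hypothetical helper: return the names selected by an index pattern.

    fnmatchcase stands in for Elasticsearch's own name resolution here.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


# No wildcard characters: only the exact name is selected.
exact = resolve("test", indices)

# Explicit wildcards: every name containing "test" is selected.
wild = resolve("*test*", indices)

print(exact)
print(wild)
```

In this sketch the plain name selects only the index called test, while all three indices are selected once the asterisks are written out explicitly, which matches the behaviour described above for es_index_pattern.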