I am having a very bad week, having chosen Elasticsearch with Graylog2. I am trying to run queries against the data in ES using Python. I have tried the following clients.
Elasticutils - Another documented client, but without a complete sample. I get the following error with the code attached. I don't even understand how this S() knows which host to connect to.
es = get_es(hosts=HOST, default_indexes=[INDEX])
basic_s = S().indexes(INDEX).doctypes(DOCTYPE).values_dict()
print basic_s.query(message__text="login/delete")

Results:
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 223, in __repr__
data = list(self)[:REPR_OUTPUT_SIZE + 1]
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 623, in __iter__
return iter(self._do_search())
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 573, in _do_search
hits = self.raw()
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 615, in raw
hits = es.search(qs, self.get_indexes(), self.get_doctypes())
File "/usr/lib/python2.7/site-packages/pyes/es.py", line 841, in search
return self._query_call("_search", body, indexes, doc_types, **query_params)
File "/usr/lib/python2.7/site-packages/pyes/es.py", line 251, in _query_call
response = self._send_request('GET', path, body, querystring_args)
File "/usr/lib/python2.7/site-packages/pyes/es.py", line 208, in _send_request
response = self.connection.execute(request)
File "/usr/lib/python2.7/site-packages/pyes/connection_http.py", line 167, in _client_call
return getattr(conn.client, attr)(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/pyes/connection_http.py", line 59, in execute
response = self.client.urlopen(Method._VALUES_TO_NAMES[request.method], uri, body=request.body, headers=request.headers)
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 255, in urlopen
raise MaxRetryError("Max retries exceeded for url: %s" % url)
pyes.urllib3.connectionpool.MaxRetryError: Max retries exceeded for url: /graylog2/message/_search
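(For what it's worth, the MaxRetryError at the bottom means the client never reached Elasticsearch at all, so this is a connectivity/host problem rather than a query problem. As a sanity check, here is a sketch of what the equivalent raw request would look like; the host value, index names, and the `text` query syntax are assumptions reconstructed from the traceback's URL `/graylog2/message/_search`, not verified against any one client version.)

```python
import json

# Assumed values -- substitute your own Graylog2/Elasticsearch setup.
HOST = "127.0.0.1:9200"
INDEX = "graylog2"
DOCTYPE = "message"

def build_text_query(field, text):
    """Build the raw body that elasticutils' .query(field__text=...)
    roughly corresponds to: a `text` query on one field (old-ES syntax)."""
    return {"query": {"text": {field: text}}}

body = build_text_query("message", "login/delete")
url = "http://%s/%s/%s/_search" % (HOST, INDEX, DOCTYPE)

# Sending it is then a plain HTTP POST of json.dumps(body) to `url`
# (not run here, since it requires a live Elasticsearch node).
```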
I wish the devs of these good projects would provide some complete examples. Even looking at the sources, I am at a complete loss.
Is there any solution or help out there for me with Elasticsearch and Python, or should I just drop all of this, pay for a nice Splunk account, and end this misery?
For now I am proceeding with curl: download the entire JSON result and json-load it. I hope that works, though having curl download a million messages from Elasticsearch may just not happen.
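(Rather than pulling all million messages in one response, Elasticsearch's standard `from`/`size` parameters let you page through them in chunks. A minimal sketch, where the URL and page size are placeholders for your own setup:)

```python
# Hypothetical endpoint -- adjust host and index to your setup.
BASE_URL = "http://127.0.0.1:9200/graylog2/message/_search"
PAGE_SIZE = 1000

def page_request(page):
    """Build one page of a match_all search using Elasticsearch's
    `from`/`size` paging parameters."""
    return {
        "query": {"match_all": {}},
        "from": page * PAGE_SIZE,
        "size": PAGE_SIZE,
    }

# Each page body would be POSTed to BASE_URL and json-loaded exactly as
# with curl, looping until a page returns fewer than PAGE_SIZE hits.
first_page = page_request(0)
second_page = page_request(1)
```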
Honestly, I've had the most luck with just CURLing everything. ES has so many different methods, filters, and queries that various "wrappers" have a hard time recreating all the functionality. In my view, it is similar to using an ORM for databases: what you gain in ease of use, you lose in flexibility/raw power.
Except most of the wrappers for ES aren't really that easy to use.
I'd give CURL a try for a while and see how that treats you. You can use external JSON formatters to check your JSON, search the mailing list for examples, and the docs are OK once you're comfortable working in raw JSON.
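(You don't even need an external formatter: Python's stdlib json module can check and pretty-print a hand-written query body locally before you curl it. The `facility` field below is just an illustrative Graylog2-style field, not something your index necessarily has:)

```python
import json

def check_query(raw):
    """Validate a hand-written query body; return (ok, pretty_or_error)."""
    try:
        parsed = json.loads(raw)
    except ValueError as exc:
        return False, str(exc)
    return True, json.dumps(parsed, indent=2, sort_keys=True)

ok, out = check_query('{"query": {"term": {"facility": "gelf"}}}')
bad, err = check_query('{"query": {"term": }')  # deliberately malformed
```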