I'm working on time series charts for 300+ clients. It is beneficial for us to pull each client's data separately, as the combined data is huge and in some cases a client's data is resampled or manipulated in a slightly different fashion.
My problem is that the function I call in a loop to get each client's data opens 3 new threads but never closes them (I'm assuming the connection stays open) once the request is complete and the function returns the data.
Once I have the results of a client, I'd like to close that connection. I just can't figure out how to do that and haven't been able to find anything in my searches.
import pandas as pd
import pysolr

# tier, mode, start_period, end_period and fl_list are defined elsewhere in the script.
def solr_data_pull(submitterId):
    zookeeper = pysolr.ZooKeeper('ndhhadr1dnp11,ndhhadr1dnp12,ndhhadr1dnp13:2181/solr')
    solr = pysolr.SolrCloud(zookeeper, collection='tran_timings', timeout=60)
    query = ('SubmitterId:' + str(submitterId) + ' AND Tier:' + tier + ' AND Mode:' + mode + ' '
             'AND Timestamp:[' + str(start_period) + ' TO ' + str(end_period) + '] ')
    results = solr.search(rows=50000, q=[query], fl=[fl_list])
    return pd.DataFrame(list(results))
PySolr uses the Session object from requests as its underlying library (which in turn uses urllib3's connection pooling), so calling solr.get_session().close() should close all connections and drain the pool:
def close(self):
    """Closes all adapters and as such the session"""
(SolrCloud is an extension of Solr, which has the get_session() method.)
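One way to make sure the session is closed after every client, even if the search raises, is a small context manager. This is only a sketch: closing_session is a hypothetical helper name, not part of pysolr, and it relies on nothing beyond the get_session() method mentioned above.

```python
from contextlib import contextmanager


@contextmanager
def closing_session(solr):
    """Yield the client, then close its requests session on exit.

    `solr` is assumed to be a pysolr.Solr or pysolr.SolrCloud instance,
    i.e. anything with a get_session() method returning a requests.Session.
    """
    try:
        yield solr
    finally:
        # Session.close() closes the adapters and their pooled connections.
        solr.get_session().close()
```

Inside solr_data_pull this would be used as `with closing_session(pysolr.SolrCloud(zookeeper, collection='tran_timings', timeout=60)) as solr:` with the search and the DataFrame construction placed in the body of the with block.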
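For disconnecting from ZooKeeper (which you probably shouldn't do if it's a long-running session, as it will have to set up watches etc. again), you can use the .zk object on the pysolr.ZooKeeper instance (zookeeper.zk in your function, also reachable from your SolrCloud instance through its zookeeper attribute); zk is a KazooClient: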
stop()
    Gracefully stop this Zookeeper session.
close()
    Free any resources held by the client.
    This method should be called on a stopped client before it is discarded. Not doing so may result in filehandles being leaked.
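Put together, shutting down the ZooKeeper client could look like the sketch below. shutdown_zookeeper is a hypothetical helper name, not a pysolr function; it assumes the object passed in exposes the KazooClient as its .zk attribute, and it calls stop() before close() because the Kazoo documentation quoted above says close() should be called on a stopped client.

```python
def shutdown_zookeeper(zk_holder):
    """Stop and release the Kazoo client held in `zk_holder.zk`.

    `zk_holder` is assumed to be whichever object exposes the KazooClient
    as its `.zk` attribute.
    """
    client = zk_holder.zk
    client.stop()   # gracefully end the ZooKeeper session first
    client.close()  # then free the resources held by the stopped client
```

For a loop over 300+ clients it is probably cheaper to create the ZooKeeper object once before the loop, pass it into the function, and call this helper a single time after the loop, rather than connecting and disconnecting for every client.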