I have configured a multi-core SolrCloud setup and created a collection with 2 shards and no replication.
Through the Solr UI at 192.168.1.56:8983, I am able to get results for my query.
I want to do the same with pysolr, so I tried running the following:
import pysolr
zookeeper = pysolr.ZooKeeper("192.168.1.56:2181,192.168.1.55:2182")
solr = pysolr.SolrCloud(zookeeper, "random_collection")
The last line is not able to find the collection even though it is there. Below is the error trace:
---------------------------------------------------------------------------
SolrError Traceback (most recent call last)
<ipython-input-43-9f03eca3b645> in <module>()
----> 1 solr = pysolr.SolrCloud(zookeeper, "patent_colllection")
/usr/local/lib/python2.7/dist-packages/pysolr.pyc in __init__(self, zookeeper, collection, decoder, timeout, retry_timeout, *args, **kwargs)
1176
1177 def __init__(self, zookeeper, collection, decoder=None, timeout=60, retry_timeout=0.2, *args, **kwargs):
-> 1178 url = zookeeper.getRandomURL(collection)
1179
1180 super(SolrCloud, self).__init__(url, decoder=decoder, timeout=timeout, *args, **kwargs)
/usr/local/lib/python2.7/dist-packages/pysolr.pyc in getRandomURL(self, collname, only_leader)
1315
1316 def getRandomURL(self, collname, only_leader=False):
-> 1317 hosts = self.getHosts(collname, only_leader=only_leader)
1318 if not hosts:
1319 raise SolrError('ZooKeeper returned no active shards!')
/usr/local/lib/python2.7/dist-packages/pysolr.pyc in getHosts(self, collname, only_leader, seen_aliases)
1281 hosts = []
1282 if collname not in self.collections:
-> 1283 raise SolrError("Unknown collection: %s", collname)
1284 collection = self.collections[collname]
1285 shards = collection[ZooKeeper.SHARDS]
SolrError: (u'Unknown collection: %s', 'random_collection')
The Solr version is 6.6.2 and the ZooKeeper version is 3.4.10.
How do I create a connection to a SolrCloud collection?
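Pysolr currently does not support an external ZooKeeper cluster. Pysolr looks for collections in /clusterstate.json, but Solr has since moved to a separate state.json for each collection (stored under /collections/<collection_name>/), and clusterstate.json is kept empty. That is why the lookup fails with "Unknown collection" even though the collection exists.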
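To check this on your own cluster, a minimal sketch along the following lines may help. It reads the per-collection state.json through kazoo (the ZooKeeper client that pysolr's ZooKeeper class relies on) and lists the base URLs of the active replicas. The function names `fetch_state` and `active_base_urls` and the `SAMPLE` payload are made up for illustration; the sample only mimics the shape of a Solr 6 state.json.

```python
import json


def active_base_urls(state_json_bytes, collection):
    """List base URLs of active replicas from a per-collection state.json payload."""
    state = json.loads(state_json_bytes.decode("utf-8"))
    urls = []
    # state.json is keyed by collection name, then shards, then replicas.
    for shard in state[collection]["shards"].values():
        for replica in shard["replicas"].values():
            if replica.get("state") == "active":
                urls.append(replica["base_url"])
    return sorted(urls)


def fetch_state(zk_hosts, collection):
    """Read the raw state.json znode for one collection (requires kazoo)."""
    # Imported lazily so the parsing helper works without kazoo installed.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=zk_hosts, read_only=True)
    zk.start()
    try:
        data, _stat = zk.get("/collections/{}/state.json".format(collection))
        return data
    finally:
        zk.stop()


# Hand-written sample shaped like a Solr 6 state.json (all values are made up).
SAMPLE = json.dumps({
    "random_collection": {
        "shards": {
            "shard1": {"state": "active", "replicas": {
                "core_node1": {"base_url": "http://192.168.1.56:8983/solr",
                               "state": "active", "leader": "true"}}},
            "shard2": {"state": "active", "replicas": {
                "core_node2": {"base_url": "http://192.168.1.55:8983/solr",
                               "state": "active", "leader": "true"}}},
        }
    }
}).encode("utf-8")
```

Against a live cluster you would call `active_base_urls(fetch_state("192.168.1.56:2181,192.168.1.55:2182", "random_collection"), "random_collection")`. If that returns your Solr nodes while /clusterstate.json is empty, the explanation above applies to your setup.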
To solve your problem for a single collection, you can hard-code the ZooKeeper.CLUSTER_STATE variable in pysolr.py as follows:
ZooKeeper.CLUSTER_STATE = '/collections/random_collection/state.json'
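If you would rather not edit the installed file, the same override can probably be applied at runtime. This is only a sketch: it assumes pysolr reads the class attribute ZooKeeper.CLUSTER_STATE at the moment the ZooKeeper object is constructed, so the attribute has to be set first. The helpers `collection_state_path` and `connect` are hypothetical names, not part of pysolr.

```python
def collection_state_path(collection):
    """Return the per-collection state.json znode path for a collection name."""
    return "/collections/{}/state.json".format(collection)


def connect(zk_hosts, collection):
    """Build a pysolr.SolrCloud client for one collection (requires pysolr and kazoo)."""
    # Imported lazily so the path helper above works without pysolr installed.
    import pysolr

    # Assumption: pysolr watches ZooKeeper.CLUSTER_STATE when the ZooKeeper
    # object is created, so the override must happen before construction.
    pysolr.ZooKeeper.CLUSTER_STATE = collection_state_path(collection)
    zookeeper = pysolr.ZooKeeper(zk_hosts)
    return pysolr.SolrCloud(zookeeper, collection)
```

Usage would then be `solr = connect("192.168.1.56:2181,192.168.1.55:2182", "random_collection")` followed by `solr.search("*:*")`. Like the hard-coded version, this only works for one collection per process, because the path is a single class-level value.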
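pysolr.py can be found in /usr/local/lib/python2.7/dist-packages (the path shown in your traceback). Restart the Python session after editing it so the change is picked up. If it still is not, you could try reinstalling. Note that pip install -e expects a project directory (one containing setup.py) rather than a single .py file, so point it at a source checkout of pysolr (the path below is a placeholder):
pip install -e /path/to/pysolr-source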