By default, CKAN search only matches full words. I'd like to enable partial string search, similar to how a Google search behaves.
Example: my dataset titled Testdatensatz should be found when searching for Test, Testd, datensa and so on.
With the default settings, CKAN finds Testdatensatz only if the full word is entered as the search term.
How can I configure CKAN / Solr for partial string matching?
I tried adding EdgeNGramFilterFactory to the Solr config, but without success so far. It seems to be ignored:
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    ...
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    ...
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
I'm using CKAN 2.10 and 2.11 with Solr 9.
EDIT:
After every change, I rebuilt the search index using ckan search-index rebuild
Solution found:
I deploy using Docker, and I had made the changes in the CKAN container, so they had no effect.
Since Solr runs in a separate container, the changes have to be made there.
So keep your hands off the file ./ckan/config/solr/schema.xml in the CKAN container and modify https://github.com/ckan/ckan-solr/blob/master/solr-9/Dockerfile instead. A good example of adding filters can be found in https://github.com/ckan/ckan-solr/blob/master/solr-9/Dockerfile.spatial
After the changes, don't forget to reindex in the CKAN container:
$ ckan search-index rebuild
and it works.
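To check which schema the running Solr container actually loaded (rather than the file edited on disk), one option is to ask Solr itself through its Schema API. The following is a minimal sketch, not a tested recipe for this setup: the host `localhost:8983` and the core name `ckan` are assumptions, and the helper names are made up for illustration. The response shape (`fieldType` with `indexAnalyzer` / `analyzer` and a `filters` list) reflects my understanding of the Schema API output and should be verified against your Solr version.

```python
import json
import urllib.request

# Assumptions: Solr is reachable on localhost:8983 and the core is named "ckan".
SOLR_URL = "http://localhost:8983/solr/ckan"


def has_index_filter(field_type: dict, filter_class: str) -> bool:
    """Return True if the field type's index-time analyzer lists filter_class."""
    # Solr is expected to report "indexAnalyzer" when index and query analyzers
    # differ, and a single "analyzer" entry when both share one definition.
    analyzer = field_type.get("indexAnalyzer") or field_type.get("analyzer") or {}
    return any(f.get("class") == filter_class for f in analyzer.get("filters", []))


def fetch_field_type(name: str = "text", base_url: str = SOLR_URL) -> dict:
    """Fetch one field type definition through Solr's Schema API."""
    with urllib.request.urlopen(f"{base_url}/schema/fieldtypes/{name}") as resp:
        return json.load(resp)["fieldType"]


# Usage (needs a running Solr instance):
#   has_index_filter(fetch_field_type(), "solr.EdgeNGramFilterFactory")
```

If this returns False after editing the schema, the change most likely went into a file that the Solr container never reads.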
Now the search string Test finds the strings Testdaten, Testdatensatz, ...
Please note that the search only finds matches starting from the first character of a word. So the search string est does not find Testdaten, Testdatensatz, ..., but that's OK.
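The prefix-only behavior follows from how front edge n-grams are built. Here is a rough Python sketch of the idea, using the minGramSize=3 and maxGramSize=15 values from the config above. It is a simplification: it models only lowercasing and the edge n-gram step, ignores tokenization and stemming, and the function names are hypothetical, not part of Solr or CKAN.

```python
def edge_ngrams(token: str, min_size: int = 3, max_size: int = 15) -> list[str]:
    """Return the front edge n-grams of a lowercased token."""
    token = token.lower()
    upper = min(max_size, len(token))
    # Every gram starts at the first character; only its length varies.
    return [token[:n] for n in range(min_size, upper + 1)]


def matches(query: str, indexed_word: str) -> bool:
    """Simplified match: the lowercased query must equal one indexed gram."""
    return query.lower() in edge_ngrams(indexed_word)


grams = edge_ngrams("Testdatensatz")
print(grams[:3])                            # ['tes', 'test', 'testd']
print(matches("Test", "Testdatensatz"))     # True
print(matches("est", "Testdatensatz"))      # False
print(matches("datensa", "Testdatensatz"))  # False
```

Under this model, mid-word terms such as datensa from the original question are not matched either, and queries shorter than minGramSize find nothing.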