
Wildcard searching between words with CRC mode in Sphinx


I use Sphinx in CRC mode (dict = crc) with min_infix_len = 1, and I want to use a wildcard between the characters of a keyword. Assume I have data like this in my index:

name
-------
mickel
mick
mickol
mickil
micknil
nickol
nickal

When I search for all records whose name starts with 'mick' and ends with 'l':

select * from all where match ('mick*l')

I expect the results should be like this:

name
-------
mickel
mickol
mickil
micknil

but nothing is returned. How can I do that?


Solution

  • You can't use 'middle' wildcards with CRC. That is one of the reasons for dict=keywords: the wildcards it can support are much more flexible.

    With CRC, the indexer 'precomputes' all the wildcard combinations and injects them as separate keywords in the index. E.g. for mickel as a document word, and with min_prefix_len=1, the indexer will create the words:

    mickel
    mickel*
    micke*
    mick*
    mic*
    mi*
    m*
    

    ... as words in the index, so all the combinations can match. If using min_infix_len, it also has to do all the combinations at the start as well (so on the order of (word_length)^2 combinations).

    ... if it had to precompute all the combinations for wildcards in the middle, there would be a lot more again, particularly if it then allowed middle AND start/end combinations as well.
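To make the precomputation described above concrete, here is a rough Python sketch of the enumeration. It only models the idea from this answer; the function names are made up for illustration, and the way Sphinx actually stores these forms internally is not shown here and should be treated as an assumption.

```python
def prefix_forms(word, min_len=1):
    """The full word plus every prefix of at least min_len chars, marked with '*'."""
    forms = [word]
    for end in range(len(word), min_len - 1, -1):
        forms.append(word[:end] + "*")
    return forms


def infix_forms(word, min_len=1):
    """Every distinct contiguous substring of at least min_len chars."""
    n = len(word)
    return {word[i:j] for i in range(n) for j in range(i + min_len, n + 1)}


# Prefix mode: the same list as shown above for "mickel".
print(prefix_forms("mickel"))

# Infix mode: the count grows quadratically with word length.
print(len(infix_forms("mickel")))
```

For a word of length n with no repeated substrings, the infix set has n(n+1)/2 entries (21 for "mickel"), which is why the count is quadratic. None of these forms has a wildcard in the middle, so a query such as mick*l has nothing to match against.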


    Although having said that, you can rewrite

    select * from all where match ('mick*l')
    

    as

    select * from all where match ('mick* *l')
    

    because with min_infix_len, the start and end will be indexed as separate words. You just need to insist that both match (although I can't think how to make them both match the same word!).
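The effect of the rewritten query, and the caveat about matching the same word, can be sketched in plain Python. This is a simplified model, not Sphinx itself: it ignores case folding and tokenization rules, the helper names are hypothetical, and the "mick hall" document is an invented example. The second helper shows one possible workaround that is not part of the original answer: re-checking the returned rows on the client side against the original pattern.

```python
from fnmatch import fnmatchcase

names = ["mickel", "mick", "mickol", "mickil", "micknil", "nickol", "nickal"]


def matches_rewritten(text, prefix="mick", suffix="l"):
    """Model of match('mick* *l'): some word starts with prefix AND some word ends with suffix."""
    words = text.split()
    return (any(w.startswith(prefix) for w in words)
            and any(w.endswith(suffix) for w in words))


def matches_same_word(text, pattern="mick*l"):
    """Stricter client-side check: a single word must match the whole pattern."""
    return any(fnmatchcase(w, pattern) for w in text.split())


# On the single-word sample data the rewrite gives the expected rows.
print([n for n in names if matches_rewritten(n)])

# With multi-word text the two conditions can be met by different words.
print(matches_rewritten("mick hall"), matches_same_word("mick hall"))
```

On single-word fields like the sample data the rewrite is enough. For multi-word fields, filtering the result set after the query (as in matches_same_word) is one way to drop rows where different words satisfied the two halves.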