When I search for the text C&A, Sphinx returns 0 results even though C&A is in the index. Searching for the letter 'C' does return C&A, which confirms that the record is indexed.
I think the problem is that Sphinx doesn't treat & as a word character, so it's treated as a word separator instead.
What I've tried so far
Added U+0026 (the & character) to charset_table: charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+0026
Used the API's EscapeString function: $escaped = $cl->EscapeString ( "escaping-sample@query/string" );
Nothing seems to work. How do I change this behaviour in Sphinx?
Using Sphinx version: 2.0.4
After a lot of reading through the Sphinx documentation, I couldn't find a way to solve this in Sphinx itself, so I worked around it in SQL and PHP. Here is what I did:
In the SQL query that feeds the index, I used REPLACE() to swap each special character for an equivalent text token.
SELECT id, REPLACE(REPLACE(REPLACE(name, '&', 'and'), ' ', 'space'), '-', 'hyphen') .....
In the user's query, I replaced the same characters with the same text tokens used in the SQL.
// Decode HTML entities in the input (e.g. &amp; becomes &)
$text = html_entity_decode($text);
// Replace each special character with the same token used in the SQL query.
// str_replace() also handles a match at position 0 and repeated characters,
// both of which the strpos()/explode() approach would miss.
$text = str_replace(array('&', '-', ' '), array('and', 'hyphen', 'space'), $text);
Taking the ampersand as an example: when a user searches for C&A, the query is rewritten to CandA, which Sphinx matches as canda, and the C&A record is returned as expected.
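As a rough sketch, the query-side rewrite can be wrapped in one small function so the character-to-token map lives in a single place. The function name normalizeQuery and the sample inputs are made up for illustration; the tokens must match whatever the SQL REPLACE() calls produce.

```php
<?php
// Hypothetical helper: rewrites a user query the same way the SQL
// REPLACE() calls rewrite the indexed column.
function normalizeQuery($text)
{
    // Tokens assumed to mirror the SQL side: & -> and, - -> hyphen, space -> space
    $map = array('&' => 'and', '-' => 'hyphen', ' ' => 'space');

    // Decode HTML entities first, so "&amp;" is seen as "&"
    $text = html_entity_decode($text);

    // strtr() replaces every occurrence and never re-scans its own output
    return strtr($text, $map);
}

echo normalizeQuery('C&A'), "\n";        // CandA
echo normalizeQuery('C&amp;A'), "\n";    // CandA
echo normalizeQuery('&Co'), "\n";        // andCo
echo normalizeQuery('H&M - Kids'), "\n"; // HandMspacehyphenspaceKids
```

The result is what gets passed on to the Sphinx query call in place of the raw user input.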
Note: In my case, Sphinx has indexed all special characters, I only had the problem while querying.
EDIT: Updating Sphinx to the latest version seems to have solved this problem. Use blend_chars in your index configuration.
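For illustration, a minimal sketch of how blend_chars might appear in sphinx.conf. The index name is hypothetical and the other index settings are omitted; check the documentation for your Sphinx version before relying on it.

```
# "products" is a hypothetical index name
index products
{
    # ... source, path, charset_table and other settings omitted ...

    # Treat & as both a separator and a regular character, so "C&A"
    # is indexed as the whole token "c&a" in addition to its parts.
    blend_chars = &
}
```

The index has to be rebuilt after changing this setting, because blended tokens are produced at indexing time.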