From what I understand after reading the documentation (especially the scoring part), every field I add has the same level of importance when search results are scored. I have the following code:
    protected static $_indexPath = 'tmp/search/indexes/projects';

    public static function createSearchIndex()
    {
        $_index = new Zend_Search_Lucene(APPLICATION_PATH . self::$_indexPath, true);
        $_projects_stmt = self::getProjectsStatement();
        $_count = 0;
        while ($row = $_projects_stmt->fetch()) {
            $doc = new Zend_Search_Lucene_Document();
            $doc->addField(Zend_Search_Lucene_Field::text('name', $row['name']));
            $doc->addField(Zend_Search_Lucene_Field::text('description', $row['description']));
            $doc->addField(Zend_Search_Lucene_Field::unIndexed('projectId', $row['id']));
            $_index->addDocument($doc);
        }
        $_index->optimize();
        $_index->commit();
    }
The code is simple: I generate the index from data fetched from the database and save it in the specified location.
I have looked in many places, because I want the name field to be more important than the description field (say 75% and 25%). So when I search for a phrase that is found in the description of the first document and in the name of the second document, the second document should get a score 3 times higher and show up higher in my list.
Is there any way to control scoring/ordering in this way?
I figured it out based on this documentation page. You need to create a new Similarity algorithm class and override the lengthNorm method. I copied this method from the Default class, added a $multiplier variable, and set its value where needed (for the field I want to boost):
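    class Zend_Search_Lucene_Search_Similarity_Projects extends Zend_Search_Lucene_Search_Similarity_Default
    {
        /**
         * Length normalization with an extra weight for the 'name' field.
         *
         * @param string $fieldName
         * @param integer $numTerms
         * @return float
         */
        public function lengthNorm($fieldName, $numTerms)
        {
            if ($numTerms == 0) {
                return 1E10;
            }

            // Fields other than 'name' keep the default norm.
            $multiplier = 1;
            if ($fieldName == 'name') {
                $multiplier = 3;
            }

            // The multiplier sits inside the square root, so the norm
            // itself grows by sqrt($multiplier), not by $multiplier.
            return 1.0 / sqrt($numTerms / $multiplier);
        }
    }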
Then the only thing left to do (an edit of the code from the question) is to set your new Similarity class as the default just before indexing:
    protected static $_indexPath = 'tmp/search/indexes/projects';

    public static function createSearchIndex()
    {
        // Set the custom similarity before any document is added,
        // because lengthNorm is applied while the index is being built.
        Zend_Search_Lucene_Search_Similarity::setDefault(new Zend_Search_Lucene_Search_Similarity_Projects());
        $_index = new Zend_Search_Lucene(APPLICATION_PATH . self::$_indexPath, true);
        $_projects_stmt = self::getProjectsStatement();
        $_count = 0;
        while ($row = $_projects_stmt->fetch()) {
            $doc = new Zend_Search_Lucene_Document();
            $doc->addField(Zend_Search_Lucene_Field::text('name', $row['name']));
            $doc->addField(Zend_Search_Lucene_Field::text('description', $row['description']));
            $doc->addField(Zend_Search_Lucene_Field::unIndexed('projectId', $row['id']));
            $_index->addDocument($doc);
        }
        $_index->optimize();
        $_index->commit();
    }
I wanted to give the name field an extra boost, but you can do this with any field.
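To see what the multiplier does to the norm, here is a small standalone sketch. It assumes nothing beyond plain PHP: lengthNormSketch is a hypothetical copy of the formula from the lengthNorm override above (not part of Zend Framework), and the term count of 4 is picked only for illustration.

```php
<?php
// Hypothetical standalone copy of the lengthNorm formula shown above;
// it does not depend on Zend Framework.
function lengthNormSketch(string $fieldName, int $numTerms): float
{
    if ($numTerms === 0) {
        return 1E10;
    }
    $multiplier = ($fieldName === 'name') ? 3 : 1;
    return 1.0 / sqrt($numTerms / $multiplier);
}

$numTerms = 4; // illustrative field length, same for both fields

$nameNorm = lengthNormSketch('name', $numTerms);
$descriptionNorm = lengthNormSketch('description', $numTerms);
$ratio = $nameNorm / $descriptionNorm;

printf("name: %.4f, description: %.4f, ratio: %.4f\n", $nameNorm, $descriptionNorm, $ratio);
// prints "name: 0.8660, description: 0.5000, ratio: 1.7321"
```

For fields with the same number of terms, the name norm comes out sqrt(3) (about 1.73) times the description norm, not 3 times, because the multiplier is inside the square root. If you want the norm itself to be 3 times larger, a multiplier of 9 in the same formula would do that. The final score also depends on other factors, so the ratio between actual hit scores may differ.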